Open source · Apache-2.0 · Built on Pydantic AI

Agents you can run in production.

Firefly Agentic is the production-grade metaframework built on Pydantic AI. Keep its Agent, Tool and RunContext — gain lifecycle hooks, delegation, memory, reasoning patterns, validation loops, RAG and DAG pipelines. Protocol-driven, swappable, no lock-in.

$ curl -fsSL https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/install.sh | bash

Get started · Read the tutorial · GitHub

Built on Pydantic AI · 28 protocols · 6 reasoning patterns · 8 embedding providers · 6 vector stores · Python 3.13+ · type-checked · Apache-2.0

Why a metaframework

Keep the engine.
Gain the architecture.

Pydantic AI gives you type-safe, model-agnostic agents. A production system needs more: orchestration, validation-and-retry, memory across turns, traces and budgets, experiments. Firefly Agentic supplies every one of those as a dedicated, swappable layer — so your business logic never couples to infrastructure.

Your Agent, Tool and RunContext APIs stay exactly as they are.
Lifecycle hooks, registries, delegation, memory, reasoning, validation and pipelines wrap them — all optional, all composable.
Switch models, swap a memory backend or replace a component without touching agent code. No vendor lock-in.

Anatomy of an agent run: a FireflyAgent wrapping pydantic_ai.Agent inside a ten-stage middleware chain, with delegation, fallback, caching and memory.

One framework, every concern

Everything a production agent needs

Twelve composable layers. Adopt one, or all of them.

Agents

FireflyAgent, registry, a 10-stage middleware chain, and seven delegation strategies.

Tools

Protocol or base-class tools, guards, composition, and human-in-the-loop approval.

Reasoning

ReAct, CoT, Plan-and-Execute, Reflexion, Tree of Thoughts, Goal Decomposition.

Memory

Conversation history and working memory with token budgets and pluggable stores.

Validation & QoS

Parse-then-validate retries, LLM-as-judge rubrics, hallucination guards.

Pipelines

Typed DAGs with parallel execution, retries, checkpointing and audit logs.

Workflows

A code-defined orchestration DSL over agents with budgets and deterministic resume.

RAG

Eight embedding providers, six vector stores, auto-embedding and tenant scoping.

Observability

Native OpenTelemetry spans & metrics, cost resolution, and budget gates.

Security

Prompt-injection guards, PII/secret redaction, and at-rest encryption.

Explainability

Decision records, audit trails and human-readable reports for compliance.

Resilience

A circuit breaker that short-circuits a failing model before it drains budget.
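The circuit-breaker idea in the Resilience card can be sketched in plain Python. This is a minimal illustration and not Firefly Agentic's implementation: `CircuitBreaker`, `CircuitOpenError`, `failure_threshold` and `reset_after` are hypothetical names chosen for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Still cooling down: fail fast without calling the model.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


calls = []


def flaky_model(prompt):
    """Stand-in for a model call that always times out."""
    calls.append(prompt)
    raise TimeoutError("model unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_after=60.0)
outcomes = []
for _ in range(4):
    try:
        breaker.call(flaky_model, "hello")
    except CircuitOpenError:
        outcomes.append("skipped")
    except TimeoutError:
        outcomes.append("failed")

print(outcomes, len(calls))
```

After two real failures the breaker opens, so the last two attempts never reach the model and spend no budget.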

Architecture

Five layers, strict top-down flow

From the bottom up: Core → Agent → Intelligence → Experimentation → Orchestration. Each layer may depend only on the layers below it, never the reverse.

Firefly Agentic architecture: one front door over five layers — Orchestration, Experimentation, Intelligence, Agent, Core — on the Pydantic AI engine.

Every extension point is a @runtime_checkable protocol — implement it and the framework discovers your component by duck typing.

The runtime-checkable protocols — AgentLike, ToolProtocol, GuardProtocol, ReasoningPattern, StepExecutor, DelegationStrategy, CompressionStrategy, MemoryStore, ValidationRule, Chunker, EmbeddingProtocol, VectorStoreProtocol — each with its swappable implementations.
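The duck-typing discovery described above relies on a standard-library feature, `typing.runtime_checkable`. The sketch below shows the mechanism only: `GuardLike`, its `check` method, and the two classes are hypothetical stand-ins, not the framework's actual protocol definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class GuardLike(Protocol):
    """Hypothetical stand-in for a guard protocol."""

    def check(self, text: str) -> bool: ...


class LengthGuard:
    """Never inherits from GuardLike; it only matches its shape."""

    def __init__(self, max_chars: int) -> None:
        self.max_chars = max_chars

    def check(self, text: str) -> bool:
        return len(text) <= self.max_chars


class NotAGuard:
    """Has no `check` method, so it does not satisfy the protocol."""

    def validate(self, text: str) -> bool:
        return True


guard = LengthGuard(max_chars=10)
is_guard = isinstance(guard, GuardLike)            # structural match
is_not_guard = isinstance(NotAGuard(), GuardLike)  # missing `check`

print(is_guard, is_not_guard)
```

Note that `isinstance` against a runtime-checkable protocol only verifies that the named members exist; it does not check argument or return types, which is left to a static type checker.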

Five minutes in

Familiar Python, all the way down

Async-first, typed, and decorator-friendly. Here's the shape of it.

Define an agent

assistant.py

from fireflyframework_agentic.agents import FireflyAgent

agent = FireflyAgent(name="assistant", model="openai:gpt-4o")
# `await` at top level needs an async context (an `async def`, or `python -m asyncio`).
result = await agent.run("Summarize this contract in 3 bullets.")
print(result.output)

Add memory

from fireflyframework_agentic.agents import FireflyAgent
from fireflyframework_agentic.memory import MemoryManager

memory = MemoryManager(max_conversation_tokens=32_000)
agent = FireflyAgent(name="bot", model="openai:gpt-4o", memory=memory)

# Both turns share one conversation ID, so the second can recall the first.
cid = memory.new_conversation()
await agent.run("My name is Alice.", conversation_id=cid)
await agent.run("What's my name?", conversation_id=cid)  # -> Alice

Reason

from fireflyframework_agentic.reasoning import ReActPattern

# `agent` is a FireflyAgent such as the one defined above.
react = ReActPattern(max_steps=5)
result = await react.execute(agent, "What's the weather in London?")
print(result.output)  # structured ReasoningResult + trace

Build a pipeline

from fireflyframework_agentic.pipeline.builder import PipelineBuilder
from fireflyframework_agentic.pipeline.steps import AgentStep, CallableStep

# `classifier` and `extractor` are assumed to be agents and `check` a callable, defined elsewhere.
pipeline = (
    PipelineBuilder("idp")
    .add_node("classify", AgentStep(classifier))
    .add_node("extract", AgentStep(extractor))
    .add_node("validate", CallableStep(check))
    .chain("classify", "extract", "validate")
    .build()
)
result = await pipeline.run(inputs="Process this document")

Intelligence layer

Six reasoning patterns, one loop

Every pattern shares a reason → act → observe → continue template, produces a structured trace, and swaps its output mode per model.

ReAct · Chain of Thought · Plan-and-Execute · Reflexion · Tree of Thoughts · Goal Decomposition
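The shared reason → act → observe → continue template can be sketched as a small base class with four hooks. Everything here is hypothetical and synchronous for brevity: `LoopPattern`, `Step`, `Trace` and the toy `AddingPattern` (which uses no LLM) illustrate the loop shape, not the framework's own classes.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    thought: str
    action: str
    observation: str


@dataclass
class Trace:
    steps: list[Step] = field(default_factory=list)
    output: str = ""


class LoopPattern:
    """Hypothetical template: subclasses override the four hooks."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps

    def reason(self, task: str, trace: Trace) -> str:
        raise NotImplementedError

    def act(self, thought: str) -> str:
        raise NotImplementedError

    def observe(self, action: str) -> str:
        raise NotImplementedError

    def should_continue(self, trace: Trace) -> bool:
        raise NotImplementedError

    def execute(self, task: str) -> Trace:
        trace = Trace()
        for _ in range(self.max_steps):  # hard cap on iterations
            thought = self.reason(task, trace)
            action = self.act(thought)
            observation = self.observe(action)
            trace.steps.append(Step(thought, action, observation))
            if not self.should_continue(trace):
                break
        trace.output = trace.steps[-1].observation if trace.steps else ""
        return trace


class AddingPattern(LoopPattern):
    """Toy pattern: adds the numbers in the task, one per step."""

    def __init__(self, max_steps: int = 5) -> None:
        super().__init__(max_steps)
        self.total = 0
        self.remaining: list[int] = []

    def reason(self, task: str, trace: Trace) -> str:
        if not trace.steps:  # first iteration: parse the task
            self.remaining = [int(tok) for tok in task.split()]
        return f"add {self.remaining[0]}"

    def act(self, thought: str) -> str:
        self.total += self.remaining.pop(0)
        return thought

    def observe(self, action: str) -> str:
        return str(self.total)

    def should_continue(self, trace: Trace) -> bool:
        return bool(self.remaining)


trace = AddingPattern(max_steps=5).execute("2 3 4")
print(trace.output, len(trace.steps))
```

Because the loop lives in one place, each pattern only supplies its hooks, and every run yields the same structured trace regardless of which pattern produced it.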

Orchestration layer

Declarative DAGs & code-defined workflows

A typed pipeline graph for structure; a Python DSL for deterministic, resumable orchestration. Use either, or both.

Pipeline (declarative DAG): a typed DAG pipeline showing a seven-phase IDP flow with fan-out/fan-in, a human-in-the-loop pause, checkpointing and an audit log.
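To make fan-out/fan-in concrete, here is a rough sketch of DAG execution using only the standard library (`graphlib` and `asyncio`). The `run_dag` helper and the four toy steps are hypothetical; the framework's `PipelineBuilder` is not used, and retries and checkpointing are left out.

```python
import asyncio
from graphlib import TopologicalSorter


async def run_dag(steps, depends_on, inputs):
    """Run async steps in dependency order; independent steps run together.

    steps: name -> async callable taking (inputs, upstream_results)
    depends_on: name -> set of upstream step names
    """
    sorter = TopologicalSorter({name: depends_on.get(name, set()) for name in steps})
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    results = {}
    while sorter.is_active():
        # Every step returned here has all of its dependencies finished.
        ready = list(sorter.get_ready())
        upstream = [
            {dep: results[dep] for dep in depends_on.get(name, set())}
            for name in ready
        ]
        outputs = await asyncio.gather(
            *(steps[name](inputs, ctx) for name, ctx in zip(ready, upstream))
        )
        for name, output in zip(ready, outputs):
            results[name] = output
            sorter.done(name)
    return results


async def classify(inputs, ctx):
    return "invoice" if "invoice" in inputs.lower() else "other"


async def extract_total(inputs, ctx):
    return 120 if ctx["classify"] == "invoice" else None


async def extract_vendor(inputs, ctx):
    return "ACME"


async def validate(inputs, ctx):
    return ctx["extract_total"] is not None and bool(ctx["extract_vendor"])


steps = {
    "classify": classify,
    "extract_total": extract_total,
    "extract_vendor": extract_vendor,
    "validate": validate,
}
depends_on = {
    "extract_total": {"classify"},   # fan-out from classify
    "extract_vendor": {"classify"},
    "validate": {"extract_total", "extract_vendor"},  # fan-in
}

results = asyncio.run(run_dag(steps, depends_on, "Invoice from ACME, total 120"))
print(results)
```

This version runs the graph in waves: it waits for every ready step to finish before scheduling the next batch, which is simpler than fully eager scheduling but can leave some parallelism unused.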

Workflows (code-defined DSL): the @workflow DSL primitives over a WorkflowContext carrying runner, journal, budget and routing, with verification helpers.
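One common way to get deterministic resume from a journal is to record each step's result under a stable key and replay it on the next run. The sketch below assumes that approach; it is not taken from the framework, and `Journal`, `step`, `fetch`, `summarize` and `workflow` are hypothetical names.

```python
class Journal:
    """Hypothetical in-memory journal; a real one would persist its entries."""

    def __init__(self):
        self.entries = {}

    def step(self, key, fn, *args):
        # Replay a recorded result instead of running the step again.
        if key in self.entries:
            return self.entries[key]
        result = fn(*args)
        self.entries[key] = result  # only successful steps are journaled
        return result


executed = []


def fetch(doc_id):
    executed.append("fetch")
    return f"text of {doc_id}"


def summarize(text, fail):
    executed.append("summarize")
    if fail:
        raise RuntimeError("model timeout")
    return text.upper()


def workflow(journal, doc_id, fail=False):
    text = journal.step("fetch", fetch, doc_id)
    return journal.step("summarize", summarize, text, fail)


journal = Journal()
try:
    workflow(journal, "doc-1", fail=True)  # fails after `fetch` is journaled
except RuntimeError:
    pass

summary = workflow(journal, "doc-1")  # resumes: `fetch` is replayed, not re-run
print(summary, executed)
```

The second run skips `fetch` entirely, so a resumed workflow repeats only the work that had not completed.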

Retrieval-augmented generation

Bring your own embeddings & vector store

Eight embedding providers and six vector-store backends behind one protocol — drop straight into a pipeline with EmbeddingStep and RetrievalStep.

Retrieval-augmented generation: eight embedding providers and six vector-store backends behind the EmbeddingProtocol and VectorStoreProtocol.
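The "one protocol, many backends" idea with tenant scoping can be illustrated with a toy embedder and an in-memory store. All names below (`Embedder`, `VectorStore`, `KeywordEmbedder`, `InMemoryStore`, `index`, `retrieve`) are hypothetical; the method signatures of the framework's `EmbeddingProtocol` and `VectorStoreProtocol` are not shown in this page and are not reproduced here.

```python
import math
from typing import Protocol


class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...


class VectorStore(Protocol):
    def add(self, tenant: str, text: str, vector: list[float]) -> None: ...
    def search(self, tenant: str, vector: list[float], k: int) -> list[str]: ...


class KeywordEmbedder:
    """Toy embedder: counts a fixed vocabulary. A real one would call a model."""

    vocab = ("invoice", "contract", "weather")

    def embed(self, text: str) -> list[float]:
        words = text.lower().split()
        return [float(words.count(term)) for term in self.vocab]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryStore:
    def __init__(self) -> None:
        self._rows: list[tuple[str, str, list[float]]] = []

    def add(self, tenant: str, text: str, vector: list[float]) -> None:
        self._rows.append((tenant, text, vector))

    def search(self, tenant: str, vector: list[float], k: int) -> list[str]:
        # Tenant scoping: rows from other tenants are never candidates.
        scored = [(cosine(vector, v), t) for ten, t, v in self._rows if ten == tenant]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def index(store: VectorStore, embedder: Embedder, tenant: str, docs: list[str]) -> None:
    for doc in docs:
        store.add(tenant, doc, embedder.embed(doc))


def retrieve(store: VectorStore, embedder: Embedder, tenant: str, query: str, k: int = 1) -> list[str]:
    return store.search(tenant, embedder.embed(query), k)


embedder, store = KeywordEmbedder(), InMemoryStore()
index(store, embedder, "acme", ["invoice for march", "weather report for london"])
index(store, embedder, "globex", ["contract renewal terms"])

hits = retrieve(store, embedder, "acme", "unpaid invoice")
cross_tenant = retrieve(store, embedder, "acme", "contract")
print(hits)
```

Because `index` and `retrieve` depend only on the two protocols, either side can be replaced independently, for example a hosted embedding model paired with the same in-memory store.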

Learn the framework

From zero to a working IDP pipeline

The Bible

The Complete Tutorial

IDP Pipeline

Module Reference
