Orchestration Patterns & Topologies

By Pritesh Yadav · June 16, 2026 · 19 min read

A field guide to the standard ways LLM agents and workflows are wired together — what each pattern is, when to reach for it, and how it fails.

Research/reference doc · 2026-06-16 · part of the Agent Orchestration suite

0. Framing: workflows vs. agents (read this first)

The single most useful distinction in the literature, from Anthropic’s Building Effective Agents:

Workflows — “systems where LLMs and tools are orchestrated through predefined code paths.” Deterministic, reliable, built for well-defined tasks.
Agents — “systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.” Flexible, model-driven, built for open-ended tasks.

Every pattern below sits somewhere on a single master axis running from fully code-orchestrated (deterministic, top of the list) to fully LLM-orchestrated (autonomous, bottom). OpenAI frames the identical axis as orchestrate via code (deterministic control flow; you control speed/cost/quality) vs. orchestrate via LLM (the agent plans its own next steps; best for open-ended work but needs heavy prompt/eval investment).

Anthropic’s overarching guidance — keep this in front of you the whole time:

Start with the simplest optimized prompt; add complexity (chaining → workflows → agents) only when simpler approaches demonstrably underperform.
Three principles for agents: maintain simplicity, prioritize transparency (surface the planning steps), carefully craft the agent–computer interface (ACI) through good tool docs and testing.
Agents carry higher cost and the potential for compounding errors — a bad early step poisons everything downstream.

DETERMINISTIC / code-orchestrated                  AUTONOMOUS / LLM-orchestrated
│ Prompt chaining → Routing → Parallelization → Orchestrator-worker → Swarm/Blackboard │
│ cost ↑ , determinism ↓ , flexibility ↑ as you move right                              │

1. Prompt chaining / sequential pipeline

What it is. Decompose a task into fixed sequential steps, where each LLM call processes the previous call’s output. Optional programmatic gates between steps validate intermediate results before continuing.

Input → [LLM call 1] → (gate/check) → [LLM call 2] → (gate) → [LLM call 3] → Output
            │ fail
            ▼
         exit / fix

When to use. The task cleanly decomposes into fixed subtasks, and you’re happy to trade latency for higher per-step accuracy. E.g. generate marketing copy → translate it; write an outline → validate it → draft the full document.

When NOT to use. The task isn’t actually decomposable into fixed steps, or the steps vary by input. A rigid pipeline is brittle for open-ended work.

Failure modes.

Latency accumulates linearly — every step is a full round-trip.
Errors propagate downstream; a bad step-1 output poisons all later steps unless a gate catches it.
Over-rigid for tasks whose shape varies per input.
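The pipeline-with-gates shape can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: `run_chain`, `GateError`, and the deterministic `fake_llm` stand-in are hypothetical names introduced here, and a real system would swap `fake_llm` for an actual model client.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in type for a model call


class GateError(Exception):
    """Raised when an intermediate result fails its programmatic check."""


def run_chain(task: str,
              steps: list[tuple[str, Callable[[str], bool]]],
              llm: LLM) -> str:
    """Run fixed steps in order; each step consumes the previous output."""
    current = task
    for template, gate in steps:
        current = llm(template.format(input=current))
        if not gate(current):
            # Stop early so a bad intermediate result cannot poison later steps.
            raise GateError(f"gate failed after step {template!r}")
    return current


def fake_llm(prompt: str) -> str:
    # Toy model: upper-cases the text after the first colon (deterministic).
    return prompt.split(":", 1)[1].strip().upper()


steps = [
    ("Outline: {input}", lambda out: len(out) > 0),
    ("Draft: {input}", lambda out: out.isupper()),
]
result = run_chain("agent patterns", steps, fake_llm)
print(result)
```

Raising on a failed gate is one design choice; a retry or a repair prompt at that point would match the "exit / fix" branch in the diagram equally well.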

2. Routing / dispatch

What it is. Classify the input, then direct it to a specialized follow-up prompt / model / task. Gives you separation of concerns and per-category optimization.

                  ┌→ [Specialist A]
Input → [Router / │
         Classify] ┼→ [Specialist B]
                  └→ [Specialist C]

When to use. Inputs fall into distinct categories better handled separately, and classification is accurate. Two flavours:

By category — e.g. customer-service query types route to specialized prompts.
By difficulty / cost — easy → cheap model (Haiku), hard → strong model (Sonnet/Opus). A simple, effective cost lever.

When NOT to use. Category boundaries are fuzzy or overlapping, or classification can’t be done reliably — misroutes will dominate.

Failure modes.

Misclassification sends input down the wrong branch — the dominant failure.
The router is a bottleneck and single point of failure.
Fuzzy/overlapping categories degrade accuracy.
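A router reduces to "classify, then look up a handler". The sketch below assumes a hypothetical `toy_classifier` in place of an LLM call and adds a fallback branch for labels the router does not recognise, which is one way to soften the misclassification failure mode; the names and categories are illustrative only.

```python
from typing import Callable


def route(query: str,
          classify: Callable[[str], str],
          handlers: dict[str, Callable[[str], str]],
          fallback: str = "general") -> str:
    """Classify the query, then dispatch to the matching specialist."""
    label = classify(query).strip().lower()
    # Unknown or malformed labels go to the fallback instead of crashing.
    handler = handlers.get(label, handlers[fallback])
    return handler(query)


def toy_classifier(query: str) -> str:
    # Hypothetical stand-in for an LLM classification call.
    text = query.lower()
    if "refund" in text:
        return "Billing"
    if "error" in text:
        return "technical"
    return "no idea"


handlers = {
    "billing": lambda q: f"[billing] {q}",
    "technical": lambda q: f"[technical] {q}",
    "general": lambda q: f"[general] {q}",
}

print(route("I want a refund", toy_classifier, handlers))
print(route("What are your opening hours?", toy_classifier, handlers))
```

The same structure covers cost-tiering: the handlers would wrap a cheap and a strong model instead of category-specific prompts.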

3. Parallelization (sectioning + voting)

What it is. Run LLMs simultaneously and aggregate the outputs programmatically. Two variants:

Sectioning — split into independent subtasks, run in parallel, then combine.
Voting — run the same task multiple times (varied prompts/temperatures), then aggregate (majority / threshold).

SECTIONING                              VOTING
        ┌→ [subtask 1] ┐                       ┌→ [run A] ┐
Input →split [subtask 2]→[Aggregate]→Out  Input→fan-out[run B]→[Vote/threshold]→Out
        └→ [subtask 3] ┘                       └→ [run C] ┘

When to use. Speed (independent chunks) or higher confidence (multiple perspectives). Sectioning examples: a guardrail content-screen running alongside the main response; multi-aspect evaluation. Voting examples: code-vulnerability review by several reviewers; content-appropriateness behind a vote threshold.

Distinction vs. orchestrator-worker: parallelization uses pre-defined, fixed parallel paths. Orchestrator-worker decides the subtasks dynamically at runtime.

Failure modes.

Aggregation is non-trivial — voting can wash out a correct minority answer.
Cost multiplies by N (voting especially).
Valid only when subtasks are truly independent; hidden dependencies cause inconsistent merges.

Academic analog: self-consistency decoding (aggregating multiple chain-of-thought samples) is the reasoning-level version of voting — it lowers individual error and raises confidence when paths converge.
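Both variants are short to express with the standard library. The sketch below is an assumption-laden illustration: `section` and `vote` are hypothetical helper names, the worker is any callable standing in for a model call, and the vote requires strictly more than `threshold` of the runs to agree before returning an answer.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def section(subtasks: list[str], worker: Callable[[str], str]) -> list[str]:
    """Sectioning: run independent subtasks in parallel, keep input order."""
    with ThreadPoolExecutor(max_workers=len(subtasks) or 1) as pool:
        return list(pool.map(worker, subtasks))


def vote(answers: list[str], threshold: float = 0.5) -> Optional[str]:
    """Voting: return the top answer only if its share exceeds the threshold."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count / len(answers) > threshold else None


print(section(["tone", "accuracy"], str.upper))
print(vote(["safe", "safe", "unsafe"]))
print(vote(["a", "b"]))
```

Returning `None` when no answer clears the threshold leaves the decision to the caller, for example escalating to a human or a stronger model.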

4. Orchestrator-worker (lead agent spawns subagents)

What it is. A central lead LLM dynamically breaks the task into unpredictable subtasks, delegates each to a worker LLM, then synthesizes the results. The defining difference from parallelization: the subtasks are not predetermined — the orchestrator chooses them from the specific input.

                  ┌──────────────┐
        Input →   │ Orchestrator │  plans, decides subtasks dynamically
                  └──┬───┬───┬───┘
            spawns   │   │   │      (parallel, dynamic count)
                  ┌──▼┐ ┌▼─┐ ┌▼──┐
                  │ W1│ │W2│ │W3 │  workers: own context + own tools
                  └──┬┘ └┬─┘ └┬──┘
                     └───┼────┘
                  ┌──────▼───────┐
                  │ Orchestrator │ → synthesize → Output
                  └──────────────┘

When to use. Complex tasks where subtasks can’t be predetermined — multi-file coding changes, multi-source gathering and analysis.

Failure modes (generic). Dynamic task count can explode; coordination overhead; the synthesis step can lose detail. See §4a for what this looks like in production.
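The plan, fan out, synthesize cycle can be sketched as follows. All three callables (`plan`, `work`, `synthesize`) are hypothetical stand-ins for LLM calls, and the `max_subtasks` cap is an assumption added here to illustrate one guard against an exploding task count.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def orchestrate(task: str,
                plan: Callable[[str], list[str]],
                work: Callable[[str], str],
                synthesize: Callable[[str, list[str]], str],
                max_subtasks: int = 5) -> str:
    """Lead plans subtasks at runtime, workers run in parallel, lead merges."""
    subtasks = plan(task)[:max_subtasks]  # hard cap on dynamic fan-out
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        findings = list(pool.map(work, subtasks))
    return synthesize(task, findings)


# Toy stand-ins so the sketch runs without a model.
def toy_plan(task: str) -> list[str]:
    return [f"{task} / angle {i}" for i in range(8)]


def toy_work(subtask: str) -> str:
    return f"findings for {subtask}"


def toy_synthesize(task: str, findings: list[str]) -> str:
    return f"{len(findings)} findings on {task}"


print(orchestrate("solar storage", toy_plan, toy_work, toy_synthesize, max_subtasks=3))
```

Unlike the fixed branches in parallelization, the list of subtasks here comes from `plan` at runtime, which is the defining difference named above.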

4a. Anthropic’s multi-agent research system — the flagship case study

This is the production instantiation of orchestrator-worker. The numbers here are worth memorizing.

Architecture.

Lead agent (orchestrator): analyzes the query, develops a strategy, saves its plan to memory, spawns subagents, synthesizes their results, decides whether more research is needed, compiles the final answer.
Subagents (workers): each gets its own context window and own tools; they act as “intelligent filters” — search independently, evaluate quality, identify gaps, return refined findings.
Models: Claude Opus 4 as lead, Claude Sonnet 4 as subagents.

Two levels of parallelism.

Agent-level: lead “spins up 3–5 subagents in parallel rather than serially.”
Tool-level: each subagent uses 3+ tools in parallel. Together, these two changes cut research time by up to 90% on complex queries (minutes instead of hours).

Performance numbers.

Multi-agent (Opus lead + Sonnet subagents) beat single-agent Opus 4 by 90.2% on the internal research eval.
On BrowseComp, 3 factors explain 95% of performance variance; token usage alone explains 80%.
Token economics: agents use ~4× the tokens of chat; multi-agent systems use ~15× the tokens of chat.
“Upgrading to Claude Sonnet 4 is a larger performance gain than doubling the token budget on Sonnet 3.7.”
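The token multipliers translate directly into a back-of-the-envelope budget. In the sketch below the 4× and 15× figures come from the text above, while the baseline token count and the price per million tokens are made-up inputs for illustration, not real pricing.

```python
# Multipliers relative to a plain chat interaction, as quoted above.
MULTIPLIERS = {"chat": 1, "single_agent": 4, "multi_agent": 15}


def estimated_tokens(chat_tokens: int, mode: str) -> int:
    """Scale a baseline chat token count by the mode's multiplier."""
    return chat_tokens * MULTIPLIERS[mode]


def estimated_cost_usd(chat_tokens: int, mode: str,
                       usd_per_million_tokens: float) -> float:
    """Rough cost estimate; the price is a caller-supplied assumption."""
    return estimated_tokens(chat_tokens, mode) * usd_per_million_tokens / 1_000_000


baseline = 2_000  # hypothetical tokens for one chat exchange
for mode in MULTIPLIERS:
    print(mode, estimated_tokens(baseline, mode))
```

These are rough averages, so an estimate like this is only useful for deciding whether a task is in the right order of magnitude of value.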

Coordination model. Currently synchronous — the lead waits for each batch of subagents to finish before continuing. Simpler to coordinate, but it creates bottlenecks: “the lead agent can’t steer subagents, subagents can’t coordinate, and the entire system can be blocked while waiting for a single subagent.”

Production failure modes observed.

Lead spawned 50 subagents for a simple query; endless searching for nonexistent sources.
Duplicate work — subagents exploring identical topics instead of dividing labor.
Excessive status updates distracting agents.
Agents are stateful and error-prone — “minor system failures can be catastrophic for agents”; errors compound across tool calls and phases.
Non-deterministic runs make debugging hard → full production tracing required.
Tool-selection bias — early agents “consistently chose SEO-optimized content farms over authoritative but less highly-ranked sources like academic PDFs or personal blogs.”

Seven prompt-engineering principles for orchestration.

Simulate agent behavior with real prompts/tools to find failure modes step-by-step.
Detailed delegation — give each subagent an objective, output format, tool guidance, and boundaries. Vague briefs (“research the semiconductor shortage”) cause misinterpretation and duplicate searches.
Scale effort to complexity — embed explicit rules: simple fact-find = 1 agent / 3–10 calls; comparisons = 2–4 subagents / 10–15 calls each; complex = 10+ subagents with divided responsibilities.
Tool-design primacy — distinct tool purposes; “bad tool descriptions can send agents down completely wrong paths.”
Agentic self-improvement — Claude can diagnose failing prompts; a tool-testing agent that rewrote tool descriptions produced a 40% decrease in task completion time.
Breadth-first then narrow — start broad, evaluate, then focus.
Extended thinking as a controllable scratchpad; interleaved thinking in subagents to refine queries.
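Two of these principles, detailed delegation and effort scaling, lend themselves to a small data model. The source describes them as rules embedded in prompts; enforcing them in code as below is an assumption made for illustration, and `EffortRule`, `clamp_agents`, and `Brief` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EffortRule:
    min_agents: int
    max_agents: Optional[int]                   # None: no upper bound stated
    calls_per_agent: Optional[tuple[int, int]]  # None: not stated


# Numbers taken from the effort-scaling rule quoted above.
EFFORT_RULES = {
    "simple": EffortRule(1, 1, (3, 10)),
    "comparison": EffortRule(2, 4, (10, 15)),
    "complex": EffortRule(10, None, None),
}


def clamp_agents(kind: str, requested: int) -> int:
    """Pull the lead agent's requested subagent count into the rule's range."""
    rule = EFFORT_RULES[kind]
    count = max(requested, rule.min_agents)
    if rule.max_agents is not None:
        count = min(count, rule.max_agents)
    return count


@dataclass
class Brief:
    """The four fields a delegation brief should carry."""
    objective: str
    output_format: str
    tool_guidance: str
    boundaries: str

    def render(self) -> str:
        return (f"Objective: {self.objective}\n"
                f"Output format: {self.output_format}\n"
                f"Tools: {self.tool_guidance}\n"
                f"Boundaries: {self.boundaries}")


print(clamp_agents("simple", 50))
```

A clamp like this would have turned the 50-subagent request for a simple query into a single agent.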

When multi-agent is the RIGHT fit: breadth-first queries pursuing multiple independent directions at once; high-value tasks needing heavy parallelization; information that exceeds a single context window; interfacing with many complex tools.

When it is the WRONG fit: tasks needing shared context across all agents; many inter-agent dependencies; most coding tasks (fewer parallelizable parts); real-time tight coordination; low-value tasks where 15× token cost isn’t justified. “Multi-agent systems require tasks where the value of the task is high enough to pay for the increased performance.”

5. Evaluator-optimizer (generator + critic loop)

What it is. One LLM generates, a second LLM evaluates and gives feedback, in an iterative loop, until the evaluator is satisfied.

Input → [Generator] → draft → [Evaluator / Critic]
            ▲                        │
            └──── feedback (if not good) ─┘
                                     │ accepted
                                     ▼
                                  Output

When to use. Clear evaluation criteria exist and iterative refinement demonstrably improves output. Two success signals: (a) responses measurably improve when a human spells out feedback; (b) the LLM can produce that kind of feedback itself. Examples: literary translation needing nuanced critique; multi-round complex search/analysis.

Distinction vs. §11 reflection: here a separate critic LLM judges the generator. In reflection, the same agent critiques itself.

Failure modes.

Loops indefinitely without a stopping cap.
Evaluator is only as good as its criteria — vague criteria → useless feedback.
Two LLM calls per iteration → cost and latency multiply.
Critic and generator can “collude” on a mediocre local optimum.
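The loop, including the stopping cap that the first failure mode calls for, can be sketched like this. `generate` and `evaluate` are hypothetical stand-ins for two separate model calls, and the toy versions exist only to make the example deterministic.

```python
from typing import Callable


def refine(task: str,
           generate: Callable[[str, str], str],
           evaluate: Callable[[str, str], tuple[bool, str]],
           max_rounds: int = 3) -> tuple[str, int, bool]:
    """Generate, critique, and retry until accepted or the cap is reached.

    Returns (final draft, rounds used, whether the critic accepted it).
    """
    draft, feedback = "", ""
    for round_no in range(1, max_rounds + 1):
        draft = generate(task, feedback)
        accepted, feedback = evaluate(task, draft)
        if accepted:
            return draft, round_no, True
    return draft, max_rounds, False


def toy_generate(task: str, feedback: str) -> str:
    # Only adds the period once the critic has asked for it.
    return task + "." if feedback else task


def toy_evaluate(task: str, draft: str) -> tuple[bool, str]:
    ok = draft.endswith(".")
    return ok, "" if ok else "end the sentence with a period"


print(refine("The build passed", toy_generate, toy_evaluate))
```

Returning the acceptance flag alongside the draft lets the caller distinguish "critic satisfied" from "ran out of rounds".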

6. Hierarchical / supervisor trees

What it is. A supervisor agent controls all communication and task delegation, deciding which agent to invoke from context. Hierarchical = supervisors of supervisors: agents grouped into teams, each with its own supervisor, and a top-level supervisor routing between teams. Generalizes orchestrator-worker to multiple levels.

                ┌─────────────────┐
                │  Top Supervisor │
                └───┬─────────┬───┘
            ┌───────▼──┐   ┌──▼───────┐
            │ Team A   │   │ Team B   │
            │ Supervis.│   │ Supervis.│
            └─┬──┬──┬──┘   └─┬──┬─────┘
              ▼  ▼  ▼        ▼  ▼
             a1 a2 a3       b1 b2     worker agents

When to use. Many specialized agents that group naturally into domains, and you need a clear, auditable control structure. LangGraph’s framing: a supervisor tree is “easier to reason about — one routing node, clear control flow, every decision visible in traces.” Choose it when control and observability matter more than raw speed.

Failure modes.

Supervisor is a bottleneck and single point of failure; each hop adds an LLM call (latency + cost).
Deep hierarchies amplify latency (every level = a round-trip).
The top supervisor can mis-route between teams, compounding routing error across levels.
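A supervisor tree is a recursive routing structure, which the following sketch makes explicit. The `choose` callable is a hypothetical stand-in for an LLM routing decision, and the `trace` list illustrates why every decision is easy to audit; none of this mirrors a specific framework's API.

```python
from dataclasses import dataclass
from typing import Callable, Union

Worker = Callable[[str], str]


@dataclass
class Supervisor:
    name: str
    choose: Callable[[str], str]  # stand-in for an LLM routing call
    children: dict[str, Union["Supervisor", Worker]]

    def handle(self, task: str, trace: list[str]) -> str:
        """Route the task one level down and record the decision."""
        choice = self.choose(task)
        trace.append(f"{self.name} -> {choice}")
        child = self.children[choice]
        if isinstance(child, Supervisor):
            return child.handle(task, trace)  # one more hop, one more call
        return child(task)


research = Supervisor("research", lambda t: "web",
                      {"web": lambda t: f"web results for {t}"})
writing = Supervisor("writing", lambda t: "editor",
                     {"editor": lambda t: f"edited {t}"})
top = Supervisor("top",
                 lambda t: "research" if "find" in t else "writing",
                 {"research": research, "writing": writing})

trace: list[str] = []
answer = top.handle("find papers", trace)
print(answer, trace)
```

Each entry in `trace` corresponds to one routing call, so the length of the trace is also a direct measure of the latency cost of depth.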

7. Blackboard / shared-state

What it is. Agents coordinate indirectly through a shared memory medium — the “blackboard” — rather than messaging each other directly. A control agent posts a request describing the task/info needed; any subordinate agent watching the board independently decides whether it can contribute. The board holds public + private spaces with all messages, intermediate inferences, and interaction history.

   [Agent A]   [Agent B]   [Agent C]   [Agent D]
     │  ▲        │  ▲        │  ▲        │  ▲
 write│  │read   │  │        │  │        │  │
     ▼  │        ▼  │        ▼  │        ▼  │
   ┌──────────────────────────────────────────┐
   │            BLACKBOARD (shared state)       │
   │  task posts · partial solutions · history  │
   └──────────────────────────────────────────┘
                    ▲
        [Control / scheduler decides who acts next]

When to use. Open-ended problems with dynamic agent selection (you don’t know in advance which specialist is relevant); information-discovery / data-science tasks; token-sensitive settings — the board keeps “compact, high-salience messages,” reducing inference cost and propagating info efficiently. Supports robust consensus mechanisms.

Failure modes.

The control/scheduling policy (who writes next) is the hard part — poor control → thrashing or stalls.
Shared state can grow unboundedly (history bloat) → context pressure.
Concurrency / write-contention and consistency issues on the shared store.
Harder to trace causality than a supervisor tree (coordination is indirect).
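A toy blackboard shows the indirect coordination: agents never call each other, they only read and write the shared board. Everything here is a simplified assumption, including the "first volunteer acts" control policy, the `max_steps` cap, and the names `Blackboard`, `Agent`, and `run`.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Blackboard:
    entries: list[tuple[str, str, str]] = field(default_factory=list)

    def post(self, author: str, kind: str, text: str) -> None:
        self.entries.append((author, kind, text))

    def latest(self, kind: str) -> Optional[str]:
        for _, entry_kind, text in reversed(self.entries):
            if entry_kind == kind:
                return text
        return None


@dataclass
class Agent:
    name: str
    wants: Callable[[Blackboard], bool]            # can I contribute now?
    act: Callable[[Blackboard], tuple[str, str]]   # returns (kind, text)


def run(board: Blackboard, agents: list[Agent], max_steps: int = 10) -> Optional[str]:
    """Control loop: let one willing agent write per step until answered."""
    for _ in range(max_steps):
        if board.latest("answer") is not None:
            break
        ready = [a for a in agents if a.wants(board)]
        if not ready:
            return None  # stall: nobody can contribute
        kind, text = ready[0].act(board)  # simplest policy: first volunteer
        board.post(ready[0].name, kind, text)
    return board.latest("answer")


parser = Agent(
    "parser",
    wants=lambda b: b.latest("request") is not None and b.latest("numbers") is None,
    act=lambda b: ("numbers", ",".join(b.latest("request").split()[1:])),
)
adder = Agent(
    "adder",
    wants=lambda b: b.latest("numbers") is not None and b.latest("answer") is None,
    act=lambda b: ("answer", str(sum(int(n) for n in b.latest("numbers").split(",")))),
)

board = Blackboard()
board.post("control", "request", "sum 2 3")
print(run(board, [adder, parser]))
```

The order of `agents` does not matter in this example because each agent decides from the board's contents whether it can act, which is the property that distinguishes a blackboard from a fixed pipeline.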

8. Network / handoff (peer-to-peer swarm)

What it is. A decentralized topology — agents hand off control directly to one another based on specialization, with no central orchestrator. “Network” = many-to-many (any agent can call any other). A swarm remembers which agent was last active, so the conversation resumes with that agent next turn.

LangGraph mechanism: handoff via a Command object — Command(goto=<agent>, graph=Command.PARENT) navigates to a different node in the parent graph; state travels along.
OpenAI Agents SDK mechanism: primitives are Agents (instructions + tools) and handoffs. In the experimental Swarm, a handoff was a function returning another Agent; in the Agents SDK, an agent lists its permitted targets in handoffs and the model sees each one as a tool it can call. A triage agent routes to a specialist, who becomes the active agent and replies directly. (Swarm was the predecessor; the production successor is the OpenAI Agents SDK, released March 2025, adding guardrails, tracing, and sessions.)

   [Agent A] ⇄ [Agent B]
       ⇅    ✕    ⇅          any agent can hand off to any other;
   [Agent C] ⇄ [Agent D]     no central coordinator

When to use. When the specialist should answer the user directly without a manager narrating results, and you want focused per-agent prompts. Swarms are “faster — no intermediary, direct agent-to-agent handoffs, fewer LLM calls” than supervisor trees.

Manager (agents-as-tools) vs. handoffs — OpenAI’s explicit contrast:

Manager / agents-as-tools: a central manager calls specialists via Agent.as_tool(), keeps control, owns the final answer. (≈ centralized orchestrator-worker.)

Handoffs: decentralized; control transfers to the specialist for that turn. (≈ swarm/network.)

Hybrid: a triage agent can hand off to a specialist that itself calls other agents as tools.

Failure modes.

No global view — no agent sees the whole picture; hard to enforce overall goals.
Handoff loops / ping-ponging between agents.
Harder to trace and debug than centralized control.
Routing errors propagate with no supervisor to catch them.
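The handoff mechanics, including the "remember the last active agent" behaviour, can be modelled without any framework. This sketch deliberately does not use the LangGraph or OpenAI Agents SDK APIs; the `Swarm` class, the `("reply" | "handoff", value)` convention, and the `max_hops` guard against ping-ponging are all assumptions made for illustration.

```python
from typing import Callable

Step = tuple[str, str]  # ("reply", text) or ("handoff", agent_name)


class Swarm:
    def __init__(self, agents: dict[str, Callable[[str], Step]],
                 start: str, max_hops: int = 4) -> None:
        self.agents = agents
        self.active = start      # persists across turns
        self.max_hops = max_hops

    def turn(self, message: str) -> str:
        """Let the active agent reply or hand off, up to max_hops times."""
        for _ in range(self.max_hops):
            kind, value = self.agents[self.active](message)
            if kind == "reply":
                return value
            self.active = value  # control transfers to the named peer
        raise RuntimeError("handoff limit reached; possible ping-pong")


def triage(message: str) -> Step:
    if "invoice" in message.lower():
        return ("handoff", "billing")
    return ("reply", f"[triage] {message}")


def billing(message: str) -> Step:
    return ("reply", f"[billing] {message}")


swarm = Swarm({"triage": triage, "billing": billing}, start="triage")
print(swarm.turn("My invoice is wrong"))
print(swarm.turn("Thanks"))  # still answered by billing
```

The second turn goes straight to `billing` because the swarm kept it as the active agent, which matches the resume-with-last-agent behaviour described above.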

9. Debate & ensemble voting

What it is. Multiple agents independently propose answers, then critique each other’s reasoning across rounds to reach consensus (“society of minds”). Ensemble voting is the lighter cousin: run multiple models/paths and aggregate by majority/vote. Debate induces an ensemble-like effect — diverse reasoning paths explored in parallel and cross-verified.

Round 1:  [A answer]   [B answer]   [C answer]
                │ exchange + critique │
Round 2:  [A revises]  [B revises]  [C revises]
                │   ... repeat ...    │
            → converge → [Aggregate / Judge] → Output

When to use. Reasoning-heavy tasks (math, factual QA) where cross-verification reduces hallucination; LLM-as-judge ensembles where robustness matters.

Failure modes (2024–2026 research, treat as caveats).

“Problem drift” — debates lose focus and wander off the original question over rounds (arXiv 2502.19559).
Cost/efficiency: rounds × agents = heavy token spend. EMS: Majority-then-Stopping (arXiv 2604.02863) and adaptive stability detection (arXiv 2510.12697) exist specifically to cut voting cost and decide when to halt.
Not always worth it: “Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking-Token Budgets” (arXiv 2604.02460) — at a fixed token budget, a single agent can beat debate. Key caveat: gains may come from extra compute, not the topology itself.
Agents can converge on a confidently-wrong consensus.
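A bare-bones debate loop looks like the following. The agents are hypothetical callables taking the question and their peers' latest answers; the early stop on unanimity and the final majority vote are simple choices made for this sketch, not the stopping rules from the cited papers.

```python
from collections import Counter
from typing import Callable

DebateAgent = Callable[[str, list[str]], str]  # (question, peer answers) -> answer


def debate(question: str, agents: list[DebateAgent],
           max_rounds: int = 3) -> tuple[str, int]:
    """Return (majority answer, number of revision rounds actually run)."""
    answers = [agent(question, []) for agent in agents]  # independent proposals
    rounds = 0
    while rounds < max_rounds and len(set(answers)) > 1:
        # Each agent revises after seeing everyone else's latest answer.
        answers = [agent(question, answers[:i] + answers[i + 1:])
                   for i, agent in enumerate(agents)]
        rounds += 1
    winner, _ = Counter(answers).most_common(1)[0]
    return winner, rounds


def fixed(answer: str) -> DebateAgent:
    return lambda question, peers: answer


def follower(initial: str) -> DebateAgent:
    def agent(question: str, peers: list[str]) -> str:
        if not peers:
            return initial
        return Counter(peers).most_common(1)[0][0]  # adopt the peers' majority
    return agent


print(debate("2 + 2?", [fixed("4"), fixed("4"), follower("5")]))
```

The same mechanics would converge just as readily on a wrong majority, which is the confidently-wrong consensus failure listed above.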

10. Map-reduce over large corpora

What it is. Distributed map-reduce adapted for content that exceeds the context window. Split the corpus into token-bounded chunks → Map: apply the LLM independently to each chunk (summarize/extract) → Reduce: combine the intermediate outputs, recursively feeding them back until a single consolidated result remains.

Doc ──split──┬─ chunk1 → [LLM map] → s1 ┐
             ├─ chunk2 → [LLM map] → s2 ┼→ [LLM reduce] → (recurse if needed) → Final
             └─ chunk3 → [LLM map] → s3 ┘

Contrast set (long-content strategies).

Stuffing — everything in one prompt; only if it fits.
Map-reduce — parallel, scalable.
Refine — sequential; carry a running summary chunk by chunk. Better continuity, less parallelism.

When to use. Corpus far exceeds the context window; chunks are largely independent; you want parallel throughput. Summarization, large-scale extraction/classification over many docs.

Failure modes.

Loss of global/cross-chunk context — the map step can’t see other chunks (cross-document entities/themes lost).
Reduce step can itself overflow if there are too many intermediate summaries → needs recursive/hierarchical reduce.
Information dilution compounding across reduce levels.
Inconsistent map outputs are hard to merge cleanly.
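The split, map, and recursive reduce steps can be sketched as below. Word counts stand in for token counts, `map_fn` and `reduce_fn` are hypothetical stand-ins for LLM calls, and the `fan_in` parameter (how many partial results one reduce call merges) is an assumption introduced to show the hierarchical reduce.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def map_reduce(text: str,
               map_fn: Callable[[str], str],
               reduce_fn: Callable[[list[str]], str],
               chunk_words: int = 200,
               fan_in: int = 4) -> str:
    """Map each chunk in parallel, then reduce in groups until one remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 or the reduce never shrinks")
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_fn, chunks))
    while len(partials) > 1:
        # Hierarchical reduce: no single call sees more than fan_in partials.
        groups = [partials[i:i + fan_in] for i in range(0, len(partials), fan_in)]
        partials = [reduce_fn(group) for group in groups]
    return partials[0] if partials else ""


# Toy stand-ins: "summarize" a chunk as its word count, merge by summing.
def count_words(chunk: str) -> str:
    return str(len(chunk.split()))


def add_counts(parts: list[str]) -> str:
    return str(sum(int(p) for p in parts))


text = "one two three four five six seven eight nine ten"
print(map_reduce(text, count_words, add_counts, chunk_words=3, fan_in=2))
```

Bounding `fan_in` is what keeps the reduce step itself from overflowing the context window, at the price of extra reduce levels.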

11. Loop-until-done / reflection (ReAct, Self-Refine, Reflexion)

What it is. A single agent iterates — act, observe, self-critique, refine — until a quality threshold or stop condition is met. Family members:

ReAct — interleaves reasoning (chain-of-thought) with acting (tool calls); improves interpretability and planning.
Reflection / Self-Refine — the model is both writer and editor: critiques its own output and revises across loops, no extra training.
Reflexion — extends ReAct: the agent decides whether to stop, reflects on its entire trajectory, stores verbal reflections in persistent memory, and restarts with that critique in-context. Learns across multiple trials.

        ┌───────────────────────────────────┐
Input → │ [Act / Generate] → [Observe] →     │
        │ [Self-critique] → good? ──no──┐    │
        └───────────────────────────────┘    │
                       │ yes      ▲ reflection (in-context / memory)
                       ▼          └───────────┘
                    Output

Distinction vs. §5 evaluator-optimizer: here the same agent critiques itself, often with persistent cross-trial memory; evaluator-optimizer uses a separate critic.

When to use. Tasks with verifiable feedback (tests pass/fail, tool errors, rubric); the agent benefits from learning from its own mistakes within or across attempts; iterative quality improvement on a single deliverable.

Failure modes.

Infinite / non-terminating loops without a hard iteration cap.
Self-critique is unreliable — the model may not recognize its own errors (no external ground truth).
Reflection memory accumulates noise; cost grows per trial.
Can entrench an initial wrong approach if reflections reinforce a bad frame.
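A Reflexion-style loop with a verifiable check and a hard trial cap can be sketched as follows. The `act`, `check`, and `reflect` callables are hypothetical stand-ins; in this toy version the "agent" picks an operator name and the check is an external test, which is the kind of ground truth the section recommends.

```python
import operator
from typing import Callable


def reflexion(task: str,
              act: Callable[[str, list[str]], str],
              check: Callable[[str], tuple[bool, str]],
              reflect: Callable[[str, str], str],
              max_trials: int = 3) -> tuple[str, bool, list[str]]:
    """Retry with accumulated verbal reflections until the check passes.

    Returns (last attempt, whether it passed, reflection memory).
    """
    memory: list[str] = []
    attempt = ""
    for _ in range(max_trials):
        attempt = act(task, memory)
        ok, error = check(attempt)
        if ok:
            return attempt, True, memory
        memory.append(reflect(attempt, error))  # carried into the next trial
    return attempt, False, memory


def toy_act(task: str, memory: list[str]) -> str:
    # First guess is wrong; any reflection in memory changes the choice.
    return "add" if memory else "sub"


def toy_check(attempt: str) -> tuple[bool, str]:
    value = getattr(operator, attempt)(2, 3)
    return value == 5, f"expected 5, got {value}"


def toy_reflect(attempt: str, error: str) -> str:
    return f"{attempt} failed ({error}); try a different operator"


print(reflexion("combine 2 and 3 to get 5", toy_act, toy_check, toy_reflect))
```

Because `check` is external to the agent, the loop does not depend on the model recognising its own errors, and `max_trials` bounds both cost and memory growth.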

12. Comparison table

Pattern	Parallelism	Determinism	Relative cost	Best for
Prompt chaining / pipeline	None (sequential)	High (fixed path)	Low	Fixed, decomposable tasks; accuracy > latency
Routing / dispatch	None (one branch)	High (after classify)	Low	Distinct input categories; cost-tiering models
Parallelization — sectioning	High (fixed branches)	High (predefined)	Medium (N×)	Independent subtasks; speed; guardrails
Parallelization — voting	High (same task ×N)	Medium (aggregation)	Medium–High (N×)	Confidence via multiple perspectives
Orchestrator-worker	High (dynamic count)	Low (LLM decides)	High (~15× chat)	Open-ended subtasks; breadth-first research; multi-file coding
Hierarchical / supervisor	Medium–High (per team)	Medium (visible flow)	High (hop per level)	Many grouped specialists; auditable control
Blackboard / shared-state	High (opportunistic)	Low–Medium (scheduler)	Low–Medium (token-efficient)	Dynamic agent selection; info discovery; token-sensitive
Network / handoff (swarm)	Medium (one active)	Low (decentralized)	Low–Medium (fewer calls)	Specialist answers directly; conversational routing
Debate / ensemble voting	High (agents parallel)	Low (consensus)	High (rounds × agents)	Reasoning/factual tasks; hallucination reduction*
Map-reduce over corpora	Very high (map)	High (structured)	Medium–High (per-chunk)	Content exceeding context window; summarization at scale
Loop-until-done / reflection	None (single loop)	Low (iterates)	Medium–High (per trial)	Verifiable-feedback tasks; iterative self-improvement

* may not beat a single agent at equal token budget — see §9.

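Reading the table. Determinism broadly falls and cost broadly rises as you move from code-orchestrated (top) to LLM-orchestrated (bottom), but this is a tendency rather than a strict ranking: map-reduce sits near the bottom yet stays highly deterministic, and blackboard and swarm are rated cheaper than orchestrator-worker. Anthropic’s rule of thumb threads through every row: use the simplest pattern that works; add agency only when simpler approaches demonstrably fail; pay for high parallelism/cost only on high-value tasks.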

13. Cross-cutting themes

Determinism ↔ flexibility is the master axis (Anthropic workflows-vs-agents; OpenAI code-vs-LLM orchestration). Choose your spot deliberately.
Cost scales with autonomy and parallelism: agents ≈ 4× chat tokens; multi-agent ≈ 15× chat tokens. Multi-agent is “economically viable” only on high-value tasks.
Token budget may explain the gains, not the topology: on BrowseComp, token usage alone explains 80% of performance variance, and at an equal token budget a single agent can match or beat multi-agent setups on some reasoning tasks (§9). Don’t assume more agents = better.
Observability is first-class: non-determinism makes multi-agent debugging hard → full production tracing required. Supervisor trees are often chosen precisely because every decision is “visible in traces.”
Coordination failure modes recur everywhere: duplicate work, over-spawning, drift, loops, lost shared context. Standard mitigations: detailed delegation, complexity-scaled effort rules, distinct tool descriptions, explicit stopping criteria.
Centralized vs. decentralized is the topology fork: orchestrator / supervisor / manager (control + observability, but a bottleneck) vs. swarm / network / blackboard (speed + resilience, but harder to trace).

Sources

Building Effective AI Agents — Anthropic — https://www.anthropic.com/engineering/building-effective-agents
How we built our multi-agent research system — Anthropic — https://www.anthropic.com/engineering/built-multi-agent-research-system
Multi-agent orchestration (manager vs. handoffs; LLM vs. code orchestration) — OpenAI Agents SDK — https://openai.github.io/openai-agents-python/multi_agent/
Agents (primitives: instructions, tools, handoffs) — OpenAI Agents SDK — https://openai.github.io/openai-agents-python/agents/
OpenAI Swarm (educational multi-agent framework) — https://github.com/openai/swarm
LangGraph Multi-Agent Supervisor (reference) — https://reference.langchain.com/python/langgraph-supervisor
langgraph-supervisor-py (supervisor pattern) — https://github.com/langchain-ai/langgraph-supervisor-py
langgraph-swarm-py (swarm/handoff pattern, Command/goto) — https://github.com/langchain-ai/langgraph-swarm-py
Multi-Agent Orchestration in LangGraph: Supervisor vs Swarm — DEV — https://dev.to/focused_dot_io/multi-agent-orchestration-in-langgraph-supervisor-vs-swarm-tradeoffs-and-architecture-1b7e
Swarm vs. Supervisor: Multi-Agent Architecture Guide — Augment Code — https://www.augmentcode.com/guides/swarm-vs-supervisor
LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science — arXiv 2510.01285 — https://arxiv.org/html/2510.01285v1
Collaborative Problem-Solving in Multi-Agent Systems with the Blackboard Architecture — https://notes.muthu.co/2025/10/collaborative-problem-solving-in-multi-agent-systems-with-the-blackboard-architecture/
Scaling Document Summarization with LLMs: Stuffing, Map-Reduce, and Refine — Medium — https://medium.com/@sonimegha1602/scaling-document-summarization-with-llms-stuffing-map-reduce-and-refine-a8a468d479c3
Summarizing Too Big for Context with MapReduce and LLMs — Google Cloud Community — https://medium.com/google-cloud/summarizing-too-big-for-context-with-mapreduce-and-llms-6d2acc7a2ed0
Reflexion — Prompt Engineering Guide — https://www.promptingguide.ai/techniques/reflexion
Reflection Agent Pattern — Agent Patterns docs — https://agent-patterns.readthedocs.io/en/stable/patterns/reflection.html
Reflexion Agent Pattern — Agent Patterns docs — https://agent-patterns.readthedocs.io/en/stable/patterns/reflexion.html
How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI — Hugging Face — https://huggingface.co/blog/Kseniase/reflection
Stay Focused: Problem Drift in Multi-Agent Debate — arXiv 2502.19559 — https://arxiv.org/pdf/2502.19559
EMS: Multi-Agent Voting via Efficient Majority-then-Stopping — arXiv 2604.02863 — https://arxiv.org/pdf/2604.02863
Multi-Agent Debate for LLM Judges with Adaptive Stability Detection — arXiv 2510.12697 — https://arxiv.org/html/2510.12697v1
Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets — arXiv 2604.02460 — https://arxiv.org/html/2604.02460v1
Cloudflare Agents — Anthropic patterns guide — https://github.com/cloudflare/agents/blob/main/guides/anthropic-patterns/README.md
