Week 7: Multi-Agent Workflows

Overview

So far, you've learned to write single agents — one primary agent doing all the work. But real problems are rarely that simple. A code review might need three different lenses: security, style, and test impact. A documentation writer might delegate fact-checking to a specialist. A system migration might benefit from parallel exploration and execution happening at the same time.

This week, you'll learn how to compose multiple agents into a workflow. The key insight is not to build one super-smart agent, but rather to build a primary agent that orchestrates several smaller, specialized agents. This pattern — called the primary-agent-as-orchestrator — buys you three things:

  1. Context isolation: Each agent has its own message history, so tasks don't interfere with each other.
  2. Role specialization: A "security checker" agent can be read-only; a "code fixer" agent has write permissions. No agent is overprovisioned.
  3. Parallel work: Two subagents can explore different parts of a problem at the same time.

By the end of this week, you'll understand when to split work across agents, how to invoke subagents (via @mention and the task tool), what information flows between agents, and what can go wrong when agents collide.


Concept 1: Why Use Multiple Agents?

Single-Agent Trap

A single agent doing everything runs into limits:

  • Context bloat: If the agent loads all codebase context, background info, and state, the context window fills fast. By the time you ask it to write code, it's forgotten the top of the conversation.
  • Permission creep: You want the agent to read docs but not run destructive commands. A single agent with full permissions is either too weak (can't do the work) or too powerful (can delete the repo).
  • Sequential bottleneck: If one step blocks, everything stalls. A security check, then a fix, then a style pass — done in order, by one agent, is slow.

Multi-Agent Wins

With multiple agents:

  • Small, focused context: Each agent only carries what it needs. The "security checker" doesn't load all the docs; the "docs writer" doesn't load the test suite.
  • Least-privilege permissions: The code reviewer is read-only. The test runner can bash, but not edit. The fixer can edit, but not delete. Each agent has exactly the permissions it needs.
  • Parallelism: An explorer agent can investigate branch A while a builder agent works on branch B. No waiting.

Real Example: Code Review Workflow

Imagine you're reviewing a PR. Naively, one agent reads the PR, flags issues, and fixes them. But that mixes concerns:

  • Identifying issues is read-heavy (compare old/new, check conventions, audit for secrets).
  • Fixing issues is write-heavy (edit files, run tests, commit changes).
  • A single agent doing both either wastes time reading when it should be writing, or misses issues because it's rushing to fix.

Better: Three agents.

  1. Reviewer (read-only, primary): Reads the PR diff, identifies issues.
  2. Security checker (read-only, subagent): Dives into secrets/vulns. Reports back.
  3. Fixer (write-enabled, subagent): Applies fixes the primary suggests. Runs tests. Reports success/failure.

The primary agent orchestrates: "Checker, audit for secrets. Fixer, fix the style issues I found."


Concept 2: The Primary-Agent-as-Orchestrator Pattern

In a multi-agent workflow, one agent is the primary (the one you interact with), and the others are subagents (the primary summons them). This is not a strict hierarchy — it's a role.

The Orchestrator's Job

The primary agent:

  • Decides what work to delegate: "This looks like a security issue; I'll ask the security subagent."
  • Interprets subagent results: "The security subagent found a hardcoded key. I'll ask the fixer subagent to remove it."
  • Synthesizes findings: After talking to three subagents, the primary summarizes the overall verdict.
  • Ensures consistency: The primary agent owns the "story" — the beginning-to-end narrative of the task.

Subagents are fire-and-forget-ish — they do one focused task and return results. The primary agent reads those results and decides what to do next.

Structure

┌─────────────────────────────────────┐
│   PRIMARY AGENT (Orchestrator)      │
│   - Takes user request              │
│   - Delegates to subagents          │
│   - Synthesizes results             │
│   - Reports to user                 │
└─────────────────────────────────────┘
         │              │              │
         ↓              ↓              ↓
    ┌─────────┐   ┌────────┐   ┌─────────┐
    │ Subagent│   │Subagent│   │Subagent │
    │  (read) │   │ (bash) │   │ (write) │
    └─────────┘   └────────┘   └─────────┘
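The shape of this loop can be sketched in plain Python. This is purely illustrative: real OpenCode agents are configured and prompted, not hand-coded, and the functions below are hypothetical stand-ins for subagent invocations.

```python
# Conceptual sketch of the orchestrator pattern. The three functions are
# hypothetical stand-ins for real subagent calls, not an OpenCode API.

def security_checker(file: str) -> list[str]:
    """Read-only subagent: returns a short list of findings."""
    return [f"{file}: hardcoded API key at line 23"]

def fixer(instructions: list[str]) -> str:
    """Write-enabled subagent: applies fixes, returns a one-line status."""
    return f"Applied {len(instructions)} fix(es). Tests pass."

def primary(user_request: str) -> str:
    """Primary agent: delegates, then synthesizes a single report."""
    findings = security_checker("src/auth.js")   # delegate (read-only)
    status = fixer(findings)                     # delegate (write-enabled)
    # Synthesize: the user sees one report, not the subagent chatter.
    return f"{user_request}: {len(findings)} issue(s) found. {status}"

print(primary("Review PR #42"))
```

Note that `primary` is the only function the "user" calls: the subagents' internals never leak into the final report, which is exactly the context isolation described above.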

Concept 3: Invoking Subagents

OpenCode provides two ways to invoke a subagent:

Method 1: The @mention Syntax

In your prompt to the primary agent, mention a subagent by name:

@code-reviewer Please audit this function for security issues.

The OpenCode runtime detects the @mention, spawns that subagent, and waits for its response. The primary agent then sees the subagent's findings in the message thread.

When to use: Immediate, lightweight delegation. The primary agent doesn't plan ahead; it reacts to what it needs.

Method 2: The task Tool

The primary agent can programmatically invoke a subagent:

invoke_task(subagent_name="security-checker", prompt="Find secrets in src/")

(The exact syntax depends on OpenCode's task tool definition.)

When to use: Complex handoffs where the primary agent carefully crafts a prompt for a subagent, or when delegating in response to a condition.

For Week 7, we'll focus on @mention — it's more intuitive for beginners. Week 10 (capstone) will use both.


Concept 4: The Handoff Contract

When a primary agent asks a subagent to do something, there's an implicit contract:

What the Primary Sends

The primary agent sends a clear, scoped prompt:

@code-reviewer Check the following function for bugs and style violations.
Function location: src/lib/utils.js, lines 45–60.
Return a bulleted list of issues, not suggested fixes.

Good handoffs are specific (what? where?) and bounded (don't review the whole codebase, just this function).

What the Subagent Returns

The subagent returns a focused, short summary:

Found 3 issues in utils.js:45-60:
- Line 47: Unused variable `tempCount`
- Line 53: Missing null check before accessing `.length`
- Line 58: Inconsistent indentation (mix of tabs/spaces)

The subagent does not return:

  • The entire message history (too much context).
  • Suggested code (unless the primary asked for it).
  • Off-topic observations.

The Contract Matters

A bad handoff is like a bad function call:

  • If the primary says "review this PR" and the subagent reviews three other PRs too, that's context bleed — the subagent did too much.
  • If the subagent returns 50 lines of explanation and the primary has to dig for the key finding, that's noisy output — hard to integrate.
  • If two subagents edit the same line of code, that's a conflict — no contract about who owns what.

Good contracts prevent these. More on failure modes later.
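Since a bad handoff is like a bad function call, it can help to picture the contract as a typed signature. The sketch below is conceptual only, not an OpenCode API; every name in it is invented for illustration.

```python
# Conceptual sketch: the handoff contract as typed data. Field names are
# invented for illustration; OpenCode handoffs are plain prompts.
from dataclasses import dataclass

@dataclass
class Handoff:
    """What the primary sends: specific and bounded."""
    subagent: str       # who does the work
    task: str           # what to do
    scope: str          # where to look (keep it narrow)
    output_format: str  # what to return

@dataclass
class Result:
    """What the subagent returns: short and on-topic."""
    findings: list[str]  # a bulleted list, not prose

handoff = Handoff(
    subagent="code-reviewer",
    task="Check for bugs and style violations",
    scope="src/lib/utils.js, lines 45-60",
    output_format="bulleted list of issues, no suggested fixes",
)
result = Result(findings=[
    "Line 47: unused variable tempCount",
    "Line 53: missing null check before .length",
])
assert len(result.findings) <= 10  # keep the return concise
```

Everything outside `scope` is off-limits to the subagent, and everything outside `findings` is noise to the primary.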


Concept 5: Failure Modes

1. Context Bleed

What happens: A subagent has access to more context than intended (e.g., the whole codebase when you only asked it to review one file).

Why it fails: The subagent gets distracted. It spends time on unrelated issues, or it hallucinates "fixes" for code it shouldn't touch.

Example:

Primary: @reviewer "Check src/main.js for bugs."
Reviewer: (loads all of src/, finds bugs in main.js, also "fixes" src/utils.js)
Primary: "Wait, I didn't ask you to touch utils.js!"

Fix: Be explicit about scope in the hand-off. If using the read tool, specify file paths narrowly.

2. Infinite Loops

What happens: Subagent A calls Primary, Primary calls Subagent B, Subagent B calls Primary, ... forever.

Why it's unlikely but possible: If the primary agent doesn't have clear exit criteria, and subagents keep finding "more work," they loop.

Example (contrived):

Primary: @linter "Fix style issues."
Linter: "Found 10 issues. Fixed 5. Ask primary for next batch."
Primary: @linter "Fix the remaining 5."
Linter: "Found 3 more. Fixed 2. Ask primary for next batch."
...

Fix: Set bounded tasks. "Fix all style issues in src/main.js" (clear endpoint), not "keep fixing until satisfied" (open-ended).
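The bounded-task fix can also be enforced mechanically with a hard iteration cap. A defensive sketch, assuming a hypothetical `lint_pass` subagent call:

```python
def lint_pass(remaining: int) -> int:
    """Hypothetical subagent call: fixes up to 4 issues per round,
    returns how many remain."""
    return max(0, remaining - 4)

def delegate_until_done(initial_issues: int, max_rounds: int = 5) -> int:
    """Delegate repeatedly, but with a hard cap so a subagent that keeps
    'finding more work' can never loop forever."""
    remaining = initial_issues
    for _ in range(max_rounds):
        if remaining == 0:
            break
        remaining = lint_pass(remaining)
    return remaining

print(delegate_until_done(10))  # 10 -> 6 -> 2 -> 0, then stop
```

The cap plays the same role as the "clear endpoint" in the prose: even if the subagent never declares itself done, the workflow terminates.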

3. Conflicting Edits

What happens: Subagent A edits line 10; Subagent B also edits line 10. One overwrites the other.

Why it fails: No coordination. If two agents have write access to the same file and they don't communicate, chaos.

Example:

Primary: @formatter "Fix indentation in src/main.js"
Primary: @security-fixer "Remove secrets in src/main.js"
Formatter: (edits lines 1–50 for indentation)
Security-fixer: (edits lines 45–60 to remove a secret)
Result: Lines 45–50 are a merge conflict.

Fix: Serialize writes. Either:

  • Only one agent writes to a given file.
  • Agents take turns (Formatter first, then Security).
  • Use the edit tool's merge capabilities (if available).
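The "agents take turns" option amounts to a write queue: run each write-enabled agent to completion before starting the next, so every agent sees its predecessor's edits. A conceptual Python sketch, with hypothetical agent callables:

```python
# Serialize write-enabled subagents: each runs to completion before the
# next starts, so no two agents ever edit the same lines concurrently.

def formatter(contents: str) -> str:
    """Hypothetical agent: fixes indentation (tabs -> 2 spaces)."""
    return contents.replace("\t", "  ")

def security_fixer(contents: str) -> str:
    """Hypothetical agent: replaces a hardcoded secret with an env var."""
    return contents.replace("API_KEY='abc'", "API_KEY=env")

def run_serialized(contents: str, agents) -> str:
    for agent in agents:
        contents = agent(contents)  # next agent sees the previous edits
    return contents

source = "\tAPI_KEY='abc'\n"
print(run_serialized(source, [formatter, security_fixer]))
```

Because `security_fixer` operates on the formatter's output rather than the original file, the lines 45-50 conflict from the example above cannot occur.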

4. Noisy Handoffs

What happens: A subagent returns pages of context; the primary misses the key finding.

Example:

Primary: @reviewer "Check this function."
Reviewer: (returns 30 lines of explanation, 1 key bug buried in line 15)
Primary: (does not parse the finding, misses the bug)

Fix: Instruct subagents to be concise. "Return a bulleted list, max 10 bullets." Summarize before delegating to the next agent.
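The conciseness rule can even be checked on the primary's side before a result is integrated. A defensive sketch (not an OpenCode feature, just an illustration of the contract):

```python
def check_handoff(result: str, max_bullets: int = 10) -> list[str]:
    """Reject noisy subagent output; keep only the bulleted findings."""
    bullets = [line for line in result.splitlines()
               if line.lstrip().startswith("-")]
    if not bullets:
        raise ValueError("Subagent returned prose, not a bulleted list")
    if len(bullets) > max_bullets:
        raise ValueError(f"Too noisy: {len(bullets)} bullets "
                         f"(max {max_bullets})")
    return bullets

reply = ("Found 2 issues:\n"
         "- Line 47: unused variable\n"
         "- Line 53: missing null check")
print(check_handoff(reply))
```

A human orchestrating agents does the same check informally: if the key finding isn't visible at a glance, send the result back for a summary.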


Demo: A Three-Agent Code Review

Let's walk through a realistic demo to cement these ideas.

Setup

  • Primary agent: build-orchestrator. Mode: primary. Permissions: read, write, bash.
  • Subagent 1: security-checker. Mode: subagent. Permissions: read only.
  • Subagent 2: fixer. Mode: subagent. Permissions: read, write, bash.
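In OpenCode, a setup like this lives in your project configuration as named agents with per-agent modes and tool permissions. The sketch below is indicative only: the `agent`, `mode`, and `tools` field names are assumptions based on this pattern, so check your OpenCode version's agent documentation for the exact schema.

```json
{
  "agent": {
    "build-orchestrator": {
      "mode": "primary",
      "tools": { "read": true, "write": true, "bash": true }
    },
    "security-checker": {
      "mode": "subagent",
      "tools": { "read": true, "write": false, "bash": false }
    },
    "fixer": {
      "mode": "subagent",
      "tools": { "read": true, "write": true, "bash": true }
    }
  }
}
```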

The Flow

User: "Review and fix this PR. Make sure it's secure and compiles."

Step 1: Primary orchestrator reads the PR

Primary receives: "Review and fix PR #42."
Primary (in plan mode): Reads the PR diff from src/auth.js.
Finds: Looks like a password hashing function.
Decision: This could have security issues. Also needs style review.

Step 2: Primary delegates to security subagent

Primary: @security-checker "Audit src/auth.js for secrets, weak crypto, and injection vulnerabilities."

The security-checker subagent spins up in its own session:

Security-checker reads src/auth.js.
Finds: Line 23 has hardcoded API key. Line 45 uses weak hash (MD5).
Returns: "2 findings:
  1. Hardcoded API key at line 23 (use env var).
  2. MD5 hash at line 45 (use bcrypt)."

The primary agent sees this summary in its thread.

Step 3: Primary synthesizes findings

Primary: (reviews security findings)
"I need the fixer to address these. I'll also check the code style."
Primary reads the full function again.
Finds: Inconsistent indentation, missing docstring.

Step 4: Primary delegates to fixer

Primary: @fixer "Apply these fixes to src/auth.js:
  1. Line 23: Replace hardcoded key with process.env.API_KEY.
  2. Line 45: Replace MD5 with bcrypt (see npm docs).
  3. Add a docstring at the top of the function.
  4. Fix indentation to 2 spaces.
  5. Run npm test to confirm it compiles."

The fixer subagent spins up:

Fixer reads src/auth.js.
Edits line 23, line 45, adds docstring, fixes indentation.
Runs: npm test
Result: ✓ All tests pass.
Returns: "Fixes applied. Tests pass. Ready to merge."

Step 5: Primary reports

Primary (to user): "Review complete:
  ✓ Security: Fixed 2 issues (hardcoded key, weak hash).
  ✓ Style: Added docstring, fixed indentation.
  ✓ Tests: All pass.
  Ready to merge."

Message Graph

Here's what the full conversation looks like, showing who sees what:

User
  │
  ├─→ Primary (sees user request)
  │   ├─→ reads src/auth.js (plan mode)
  │   ├─→ decides to delegate
  │   │
  │   ├─→ @security-checker (delegates)
  │   │   └─→ Security-checker (spins up, sees only the request + file)
  │   │       ├─→ reads src/auth.js
  │   │       └─→ returns "2 findings"
  │   │
  │   ├─→ (Primary receives summary, does not see full security-checker conversation)
  │   ├─→ reads src/auth.js again (primary context, not subagent context)
  │   │
  │   ├─→ @fixer (delegates)
  │   │   └─→ Fixer (spins up, sees only the request)
  │   │       ├─→ reads src/auth.js
  │   │       ├─→ edits file
  │   │       ├─→ runs npm test
  │   │       └─→ returns "Fixes applied. Tests pass."
  │   │
  │   ├─→ (Primary receives summary, does not see full fixer conversation)
  │   └─→ synthesizes + reports to user

User (sees final report, not the back-and-forth)

Key observation: The user only sees the primary agent's final report. The primary agent sees brief summaries from subagents, not their full conversations. This keeps context lean and prevents the primary from being overwhelmed.


Concept 6: When to Use Multi-Agent Workflows

Not every task needs multiple agents. Here's a decision tree:

Ask these questions:

  1. Does the task have distinct roles or lenses?
     • Review: security + style + tests? → Multi-agent.
     • "Fix this typo"? → Single agent.

  2. Can work be parallelized?
     • Two parts of a codebase that don't interact? → Could be two agents.
     • One linear task? → Single agent.

  3. Do different agents need different permissions?
     • One agent should be read-only, another should write? → Multi-agent.
     • Both need the same permissions? → Might still be single agent.

  4. Is context window a concern?
     • Huge codebase, and the task only touches one file? → Delegate to a focused subagent.
     • Small codebase, all in context? → Single agent is fine.

Heuristics

  • One agent if: Task is simple, linear, and doesn't require multiple expertise areas.
  • Two agents if: You need a specialist (e.g., security checker) + a doer (fixer).
  • Three+ agents if: Task is complex, with multiple phases or lenses (review, security, style, testing).

Avoid: Over-engineering. A two-agent workflow for "fix a typo" is overkill.


Concept 7: Context Window Management

Every agent — primary and subagent — has a limited context window. This is the model's working memory. When it fills up, older messages get summarized or dropped. This is called context pressure, and it affects agent reliability.

What Happens Under Pressure

As a conversation grows:

  1. Early instructions fade from memory.
  2. The agent may repeat steps already completed.
  3. It may forget subagent results from earlier in the workflow.
  4. Responses get slower as the model processes more tokens.
  5. Cost increases linearly with token count.

The autoCompact Setting

OpenCode has a built-in safety net: autoCompact. When enabled (it's on by default), OpenCode automatically summarizes older conversation turns to free space.

{
  "autoCompact": true
}

When to keep it on: Most of the time. General development, short investigations, routine tasks.

When to turn it off: Long investigations where the full history matters — security audits, legal reviews, multi-hour debugging sessions. With autoCompact off, the agent remembers everything but uses more tokens and costs more.

Strategies for Long Sessions

  1. Restart often. If a conversation exceeds 50 turns, consider starting fresh. Save important context in AGENTS.md or a summary file.
  2. Delegate to subagents early. Subagents have their own context windows. They don't bloat the primary agent's memory.
  3. Ask for summaries. Periodically ask the primary: "Summarize what we've done and what's left." This creates a compact reference the agent can use later.
  4. Structured outputs. Ask for findings in tables or bullet lists instead of paragraphs. Fewer tokens = more room.
  5. Use reasoning effort wisely. High reasoning effort uses more tokens per response. Switch to medium or low for straightforward tasks to save context space.

Real Example

A long debugging session might span 100+ messages. The agent starts by investigating a crash. By message 80, it's testing a fix. But — the autoCompact feature has summarized the crash investigation to "investigated crash in auth.js, found null pointer." When the fix doesn't work, the agent has lost the details needed to find the real root cause.

Fix: After 40 messages, ask the agent to save a detailed summary to a file. Then restart the conversation in a fresh session, loading the summary as context. This preserves the investigation depth without filling the context window.

Context management is a practical skill, not just a theoretical one. The best OpenCode users know when to push forward and when to reset.

Summary

  • Multi-agent workflows split work across a primary orchestrator and specialized subagents.
  • Primary agents decide what to delegate and synthesize results.
  • Subagents are invoked via @mention or the task tool; they return focused summaries.
  • Handoff contracts are clear, scoped prompts and concise results.
  • Failure modes (context bleed, loops, conflicts, noise) are preventable with good design.
  • Use multi-agent when: Distinct roles, parallelism, different permissions, or context limits justify it.

In the labs below, you'll build a two-agent workflow from scratch and diagnose a broken multi-agent run.