February 8, 2026

From Tasks to Swarms: Agent Teams in Claude Code

Claude Code's agent teams upgrade the subagent workflow from star topology to mesh — agents can now message each other, coordinate through shared task lists, and collaborate in real time. Three real sessions show the patterns in action.

In my previous post, Spec-Driven Development: When Architecture Becomes Executable, I showed how Claude Code's task system turns a single AI session into an orchestrated development team — subagents doing the work, tasks persisted to disk, atomic commits per task. That workflow let me migrate a storage layer from SQLite to IndexedDB in one afternoon.

This week, Anthropic shipped something that makes that workflow look like a warmup.

Agent teams landed with Opus 4.6 on February 5, 2026. The core idea: agents can now talk to each other. Not just report results back to a parent — they message peers, share discoveries mid-task, challenge each other's approaches, and coordinate through a shared task list. It's the swarm pattern, built into Claude Code as a first-class feature.

The task system I covered last time in Claude Code's New Task System Explained solved context rot: each subagent gets a fresh context window. Agent teams solve the next problem: coordination rot. When parallel agents work from different assumptions with no way to sync, you get merge conflicts, duplicated work, and inconsistent implementations. The inbox fixes that.

I've been using agent teams daily since they dropped. This post documents three real sessions — the exact prompts, what happened, and what I learned.

What Changed: The Two New Primitives

The previous post covered the task system (TaskCreate, TaskUpdate, TaskList, TaskGet). Agent teams add two things on top:

Primitive 1: Teams. TeamCreate initializes a named team with a shared task directory. All teammates read from and write to the same board. TeamDelete cleans up when you're done.

Primitive 2: The Mailbox. SendMessage is the communication backbone. Three message types:

Type                Purpose
message             Direct message to a specific teammate
broadcast           Message all teammates (expensive — use sparingly)
shutdown_request    Graceful teardown when work is complete

Plus plan_approval_response for the team lead to approve or reject a teammate's implementation plan before they start coding.
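To make the mailbox concrete, here is a sketch of the message payloads as a discriminated union. The actual wire format is internal to Claude Code; these shapes, and the `describe` helper, are my own illustration of the four types listed above.

```typescript
// Hypothetical payload shapes for the mailbox -- a sketch, not the real API.
type TeamMessage =
  | { type: "message"; to: string; content: string }
  | { type: "broadcast"; content: string }
  | { type: "shutdown_request"; to: string }
  | { type: "plan_approval_response"; to: string; approved: boolean; feedback?: string };

// A toy dispatcher showing how a teammate might branch on message type.
function describe(msg: TeamMessage): string {
  switch (msg.type) {
    case "message":
      return `direct message to ${msg.to}`;
    case "broadcast":
      return "broadcast to all teammates";
    case "shutdown_request":
      return `shutdown request for ${msg.to}`;
    case "plan_approval_response":
      return msg.approved ? `plan approved for ${msg.to}` : `plan rejected for ${msg.to}`;
  }
}

console.log(describe({ type: "broadcast", content: "All work complete." }));
// broadcast to all teammates
```

The discriminated union buys exhaustiveness checking: if a fifth message type ever lands, the switch stops compiling until it's handled.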

How to Enable

// .claude/settings.json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

Subagents vs. Agent Teams

Aspect           Subagents (Previous Post)          Agent Teams (New)
Lifecycle        Spawn, execute, return, die        Spawn, persist, work across multiple tasks
Communication    Report results to parent only      Message any teammate directly
Coordination     Main agent manages everything      Shared task list + peer-to-peer messaging
Topology         Star (hub and spoke)               Mesh (any-to-any)
Best for         Focused, independent work          Cross-cutting work requiring collaboration

Session 1: The Review Board (10 Agents, 62 Fixes)

My Astro blog has 10 feature modules. I wanted a quality audit of all of them — simultaneously.

The Prompt

use a team of agents that will a code review regarding quality and simplification for the whole codebase they should report you back use one per feature

What Happened

Claude created a team called codebase-review, then:

  1. Created 10 tasks — one per feature module (agent-teams, filetree, goals, llm-education, local-first, mdx-components, og-images, presentation, prompts, vue-demos)
  2. Spawned 10 reviewer agents in parallel, each assigned to one feature
  3. Each reviewer read all files in their feature, then sent a detailed report back via SendMessage
  4. Reports arrived as they completed — some in seconds, larger features took longer
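The fan-out / report-back shape in those four steps can be sketched with plain Promises standing in for spawned reviewer agents. The feature names come from this session; the `review` function is invented for illustration — a real reviewer would read the feature's files and send findings back via SendMessage.

```typescript
// A minimal sketch of fan-out / report-back, assuming one reviewer per feature.
const features = ["agent-teams", "filetree", "goals", "vue-demos"];

async function review(feature: string): Promise<string> {
  // Stand-in for a spawned agent: a real one reads files and messages the lead.
  return `${feature}: review complete`;
}

async function runReviewBoard(): Promise<string[]> {
  // Spawn all reviewers at once; Promise.all collects reports as each finishes.
  return Promise.all(features.map(review));
}

runReviewBoard().then((reports) => console.log(reports.join("\n")));
```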

Here's what the team lead saw in real time:

reviewer-agent-teams:  4 issues found (placeholder components shipped)
reviewer-goals:        6 issues (unused barrel export, undefined CSS)
reviewer-vue-demos:    6 issues (null-safety bugs, massive duplication)
reviewer-llm-education: 5 issues (1073-line file needs decomposition)
reviewer-og-images:    7 issues (font loading bug, inconsistent exports)
...

After all reviews came in, the team lead spawned 10 fix agents — one per feature — to implement the suggested changes. Progress was visible in real time:

5 of 10 agents done:
  fixer-vue-demos:      7 fixes applied
  fixer-mdx-components: 8 fixes applied
  fixer-local-first:    5 fixes applied
  fixer-goals:          6 fixes applied
  fixer-agent-teams:    4 fixes applied
  fixer-filetree:       Running...
  fixer-llm-education:  Running...
  fixer-og-images:      Running...
  fixer-presentation:   Running...
  fixer-prompts:        Running...

When all 10 were done, the team lead broadcast a shutdown:

SendMessage({ type: "broadcast", content: "All work complete. Please shut down." })

Then ran verification and cleaned up:

Typecheck: 0 errors
Lint: 0 errors
Tests: 363/363 passed

TeamDelete()

The Results

Feature          Fixes   Highlights
local-first      5       Fixed broken version comparison, wrong status element ID
vue-demos        6       Deleted non-functional TeleportDemo, added null checks
mdx-components   8       Fixed ChatUI duplicate ID, deleted empty file, typed props
goals            6       Fixed undefined CSS, replaced JSON.parse with structuredClone
agent-teams      4       Deleted placeholder components, shared TaskStatus type
filetree         5       Deduplicated 70 lines of icon rendering, shared TreeNode type
llm-education    5       Extracted 8 inner components from a 1073-line file
og-images        7       Fixed font loading bug, shared theme, consistent exports
presentation     8       Deleted dead hook, deduplicated types/constants/utils
prompts          8       Fixed PromptTool type, deduplicated utils, capped animation

62 fixes across 10 features. Zero regressions.
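One of the goals fixes — swapping JSON.parse(JSON.stringify(x)) for structuredClone — is worth a quick illustration, since the difference only shows up on values JSON can't represent. The sample object here is invented:

```typescript
// structuredClone vs. the JSON round-trip hack for deep copying.
const original = {
  label: "goal",
  due: new Date("2026-02-08"),
  tags: new Set(["q1"]),
};

const viaJson = JSON.parse(JSON.stringify(original));
const viaClone = structuredClone(original);

console.log(viaJson.due instanceof Date);   // false -- Date collapsed to a string
console.log(viaClone.due instanceof Date);  // true  -- Dates survive the clone
console.log(viaClone.tags instanceof Set);  // true  -- Sets survive too
```

structuredClone also handles Maps, typed arrays, and cyclic references, none of which survive a JSON round trip.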

Session 2: Parallel Implementation (4 Agents, 1 Dependency Chain)

For my Excalidraw clone built with Nuxt, I needed to implement Phase 2: arrow subtypes and arrowheads. The spec defined four agents with a dependency:

Agent A (Data Model) ──┬──> Agent B (Rendering — curves + arrowheads)
                       ├──> Agent C (Tool cycling + wiring)
                       └──> Agent D (Tests)

Agent A had to finish first. B, C, D could run in parallel after.
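That gating structure — A first, then B/C/D concurrently — is just an await followed by a Promise.all. The agent names are from the spec above; `runAgent` is a stand-in for spawning a teammate.

```typescript
// A sketch of the Phase 2 dependency chain, assuming runAgent stands in
// for spawning a teammate and awaiting its completion.
type AgentResult = { agent: string; done: boolean };

async function runAgent(name: string): Promise<AgentResult> {
  return { agent: name, done: true };
}

async function runPhase2(): Promise<AgentResult[]> {
  const a = await runAgent("A (data model)");          // must land first
  const parallel = await Promise.all(                  // then fan out
    ["B (rendering)", "C (tool cycling)", "D (tests)"].map(runAgent),
  );
  return [a, ...parallel];
}
```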

What Happened

The team lead made a strategic choice: it did Agent A's work itself rather than delegating it. Since everything depended on the data model, the lead wanted to be certain it was right before spawning parallel workers.

After completing Agent A (expanding ArrowheadType to 11 types, adding ArrowSubtype, updating createElement/mutateElement), it spawned B, C, D simultaneously.

The lead monitored type diagnostics in real time and could see intermediate states:

All three agents are progressing well. The diagnostics are expected
intermediate states:

- renderElement.ts line 26: Agent B already added the 3rd argument
  to renderArrowheads but hasn't finished rewriting arrowhead.ts
  to accept it yet — still in progress.

- useTool.ts unused imports: Agent C added imports/constants but
  hasn't wired them into the keydown handler yet — still in progress.

When Agent C finished first, the lead noticed a type error from cross-agent changes and fixed it immediately rather than waiting. This is the key advantage over subagents — the lead has visibility into intermediate states and can intervene.

The Results

  • Agent A (Data Model): 11 arrowhead types, ArrowSubtype discriminated union, useArrowDefaults composable
  • Agent B (Rendering): Bezier curves for round arrows, 11 arrowhead SVG renderers
  • Agent C (Tool Cycling): Keyboard shortcuts, subtype cycling, arrowhead picker UI
  • Agent D (Tests): Full test coverage for all new types and interactions

Key insight: The team lead doing foundational work itself, then delegating parallel work, is a pattern that doesn't exist with subagents. With subagents, you either delegate everything or do everything yourself. Agent teams let the lead be a player-coach.

Session 3: Vue Best Practices Audit (Multi-Lens Review)

For the same Excalidraw project, I wanted a different kind of review — not code quality, but Vue-specific best practices.

The Prompt

do a review of this code base check if its really using vue in its best practices use a team of agents i like vue code like michael thiessen evan you anthu fu

What Happened

Claude spawned three specialized review agents:

  1. Vue best practices reviewer — Component patterns, reactivity, Composition API usage
  2. VueUse opportunities reviewer — Finding places where VueUse composables could replace manual implementations
  3. Architecture boundary reviewer — Feature isolation, import rules, dependency direction

Each agent independently read the entire codebase and reported findings. The multi-lens approach caught different issues than a single-pass review would.

The Team Lifecycle

Here's the full lifecycle, distilled from these sessions:

1. TeamCreate("codebase-review")     → Team + shared task list
2. TaskCreate (x N)                  → Work items with dependencies
3. Task(spawn teammates)             → Each gets fresh context + CLAUDE.md
4. TaskUpdate(owner: "reviewer-X")   → Assign work
5. Teammates read, implement, message → Peer coordination via inbox
6. SendMessage(broadcast: "shutdown") → Graceful teardown
7. Verification                      → typecheck + lint + tests
8. TeamDelete()                      → Clean up directories
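Steps 2, 4 and 5 revolve around the shared task board, which can be modeled with a small in-memory class. The real TeamCreate / TaskCreate / TaskUpdate tools are Claude Code internals; everything below is a stand-in to make the flow concrete.

```typescript
// A toy shared task board -- a sketch of the lifecycle, not the real tools.
type Task = { id: number; subject: string; owner?: string; status: "pending" | "done" };

class Board {
  private tasks: Task[] = [];
  private nextId = 1;

  create(subject: string): Task {               // step 2: work items
    const task: Task = { id: this.nextId++, subject, status: "pending" };
    this.tasks.push(task);
    return task;
  }
  assign(id: number, owner: string): void {     // step 4: assign work
    const t = this.tasks.find((x) => x.id === id);
    if (t) t.owner = owner;
  }
  complete(id: number): void {                  // step 5: teammate finishes
    const t = this.tasks.find((x) => x.id === id);
    if (t) t.status = "done";
  }
  allDone(): boolean {                          // gate for step 6's shutdown
    return this.tasks.every((t) => t.status === "done");
  }
}

const board = new Board();
const review = board.create("review goals feature");
board.assign(review.id, "reviewer-goals");
board.complete(review.id);
console.log(board.allDone()); // true -> safe to broadcast shutdown
```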

Prompt Patterns

Building on the patterns from the previous post, here are the new ones for agent teams:

1. One Agent Per Unit

use a team of agents... use one per feature

Maps naturally to feature modules, components, or bounded contexts. Each agent owns a clear scope.

2. Report Back Pattern

they should report you back

Review agents message the team lead with findings. The lead aggregates, decides what to fix, then spawns fix agents.

3. Multi-Lens Review

use a team of agents... vue best practices... michael thiessen

Spawn specialized reviewers with different expertise. Each catches issues the others miss.

4. Player-Coach Lead

Don't always delegate everything. For foundational work that everything depends on, the team lead can implement it directly, then delegate parallel work after.

5. Broadcast Shutdown

SendMessage({ type: "broadcast", content: "All work complete. Please shut down." })

Always shut down teammates before calling TeamDelete. The team lead handles the full lifecycle.

The Spectrum: From Subagents to Gas Town

Where do agent teams sit in the broader landscape?

                 Subagents            Agent Teams         Ralph               Gas Town
Scale            1-5 tasks            3-15 agents         1 loop, days        20-30 agents
Communication    None (return only)   Peer messaging      None (file only)    Structured mailboxes
Persistence      Session-scoped       Team-scoped         File-scoped         Git-backed
Coordination     Star topology        Mesh topology       Stateless loop      Hierarchical roles
Best for         Independent tasks    Collaborative work  Grinding a backlog  Industrial-scale projects

Subagents (previous post): spawn, execute, return. Perfect for focused work.

Agent teams (this post): persistent teammates that communicate. For work requiring coordination.

Ralph (while :; do cat PROMPT.md | claude-code; done): stateless bash loop. No coordinator, no communication. Best for grinding through a well-defined task list over days. See The Ralph Wiggum Loop from First Principles.

Gas Town (Steve Yegge): 20-30 agents with specialized roles (Mayor, Polecats, Witness, Refinery). Git-backed persistent state. For when agent teams aren't enough. See Welcome to Gas Town.

The C compiler proof-of-concept from Nicholas Carlini at Anthropic showed what's possible at scale: 16 parallel Claude agents, ~2,000 sessions, $20,000 in API costs, producing a 100,000-line Rust compiler that compiles the Linux kernel. The takeaway: the verifier matters more than the agent. His CI pipeline and GCC torture test suite were the real differentiator. See Building a C Compiler with a Team of Parallel Claudes.

When to Use Agent Teams

Sweet spots:

  • Parallel code reviews (one agent per module)
  • Cross-layer implementation (frontend + backend + tests)
  • Competing hypotheses (Anthropic Just Dropped Agent Swarms calls this the "devil's advocate pattern")
  • Large refactors where tasks aren't fully independent

When to skip:

  • Sequential dependencies (step 2 needs step 1 to finish)
  • Same-file edits (agents overwrite each other)
  • Simple tasks (coordination overhead > actual work)

Known limitations (as of Feb 2026):

  • Session resumption can break in-process teammates
  • One team per session
  • Split-pane mode doesn't work in VS Code terminal or Ghostty

From the Previous Post to This One

Spec-Driven Development with subagents was a star pattern: I was the product owner, Claude was the tech lead, subagents were developers. All communication flowed through the center.

Agent teams turn this into a mesh: the tech lead still coordinates, but developers can talk to each other. The frontend agent asks the backend agent about the API contract. The test agent tells the implementation agent about a failing edge case. The spec is still the source of truth — but now agents can resolve ambiguities without escalating everything to the lead.

The spec-driven workflow from the previous post still applies. Research with parallel agents. Write the spec. Refine via interview. Then implement — but now with a team instead of isolated subagents.

Conclusion

Two weeks ago, subagents gave us parallel execution with fresh context windows. Agent teams give us parallel collaboration — agents that share discoveries, challenge assumptions, and coordinate in real time.

The trajectory is clear. Subagents were workers. Agent teams are colleagues. Gas Town is the factory. We're moving from "AI that writes code" to "AI that organizes work."

Try it yourself: Enable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS, open a project with multiple modules, and prompt "use a team of agents, one per module." Watch the inbox light up.