When One Agent Is Not Enough
A single Claude agent with 20 tools and a 10,000-token task description works — until it doesn't. Long contexts degrade instruction-following, and a long tool list dilutes tool-selection quality. Tasks with clearly separable concerns (research, write, review) benefit from dedicated agents with focused system prompts and trimmed tool sets.
The rule of thumb: split when a single agent's context would need to hold the full state of two unrelated workstreams simultaneously, or when you want different reliability/cost tradeoffs per subtask.
The Orchestrator/Subagent Pattern
An orchestrator agent receives the high-level task, decomposes it, delegates to specialist subagents, collects results, and synthesises the final output. Subagents receive narrow, well-defined tasks and return structured results.
The orchestrator calls subagents via tools. From Claude's perspective, a subagent is just another tool that takes a task description and returns a result string.
```python
import anthropic
import json

client = anthropic.Anthropic()

# --- Subagent: Researcher ---
RESEARCHER_SYSTEM = (
    'You are a research specialist. Given a topic, search the web and return '
    'a concise JSON summary with keys: summary, key_facts (list), sources (list of URLs).'
)

def run_researcher(topic: str) -> str:
    messages = [{'role': 'user', 'content': f'Research this topic: {topic}'}]
    response, _ = run_agent(messages, search_tools, system=RESEARCHER_SYSTEM)
    # Return the text content as a string result
    for block in response.content:
        if hasattr(block, 'text'):
            return block.text
    return 'No research result'

# --- Subagent: Writer ---
WRITER_SYSTEM = (
    'You are a content writer. Given a research summary, write a 400-word '
    'blog section in markdown. Be factual, clear, and cite sources inline.'
)

def run_writer(research_summary: str, section_title: str) -> str:
    prompt = f'Section title: {section_title}\n\nResearch:\n{research_summary}'
    messages = [{'role': 'user', 'content': prompt}]
    response, _ = run_agent(messages, [], system=WRITER_SYSTEM)
    for block in response.content:
        if hasattr(block, 'text'):
            return block.text
    return ''
```
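The researcher is prompted to return JSON, but models sometimes wrap the object in prose or a markdown fence. Downstream code should parse defensively; the helper below is a sketch of one way to do this (the function name and the fallback shape are my own):

```python
import json
import re

def parse_json_result(text: str) -> dict:
    """Best-effort parse of a subagent's JSON result.

    Tries the raw text first, then the first {...} span, and finally
    falls back to wrapping the text so callers always get a dict.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Look for a JSON object embedded in surrounding prose or fences.
    match = re.search(r'\{.*\}', text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return {'summary': text, 'key_facts': [], 'sources': []}
```

The fallback mirrors the schema the researcher is prompted to use, so the writer always receives the keys it expects.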
Wiring the Orchestrator
The orchestrator has tools that call subagents. It never directly calls the web search API or the writing logic — it delegates. This keeps the orchestrator's context lean and its tools list short.
```python
ORCHESTRATOR_TOOLS = [
    {
        'name': 'research_topic',
        'description': 'Research a topic on the web and return a structured summary.',
        'input_schema': {
            'type': 'object',
            'properties': {
                'topic': {'type': 'string', 'description': 'The topic to research'},
            },
            'required': ['topic'],
        },
    },
    {
        'name': 'write_section',
        'description': 'Write a blog section given a research summary and section title.',
        'input_schema': {
            'type': 'object',
            'properties': {
                'research_summary': {'type': 'string'},
                'section_title': {'type': 'string'},
            },
            'required': ['research_summary', 'section_title'],
        },
    },
]

def orchestrator_execute_tool(name, inputs):
    if name == 'research_topic':
        return run_researcher(inputs['topic'])
    elif name == 'write_section':
        return run_writer(inputs['research_summary'], inputs['section_title'])
    return f'Unknown tool: {name}'
```
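The if/elif dispatch works fine; a registry variant (my own sketch, not an SDK feature) makes it trivial to swap real subagents for stubs when testing the orchestrator without API calls:

```python
from typing import Any, Callable

# A subagent handler takes keyword arguments from the tool input
# and returns a result string for the orchestrator.
SubagentHandler = Callable[..., str]

def make_tool_executor(registry: dict[str, SubagentHandler]) -> Callable[[str, dict], str]:
    """Return a tool executor closed over a registry of subagent handlers."""
    def execute(name: str, inputs: dict[str, Any]) -> str:
        handler = registry.get(name)
        if handler is None:
            return f'Unknown tool: {name}'
        return handler(**inputs)
    return execute

# In production, register the real subagents; in tests, register stubs:
execute = make_tool_executor({'research_topic': lambda topic: f'stub: {topic}'})
```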
Pass structured data (JSON strings) between agents rather than free-form text. The researcher returns a JSON object; the writer receives it as a string but can parse it. This makes the pipeline testable: you can unit-test each agent independently with a fixed input.

Parallel Subagent Execution
If subagents are independent (no output from one feeds another), run them in parallel with concurrent.futures. The orchestrator still calls them via its tool loop — but your tool executor runs them concurrently.
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_parallel_research(topics: list[str]) -> dict[str, str]:
    results = {}
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = {executor.submit(run_researcher, t): t for t in topics}
        for future in as_completed(futures):
            topic = futures[future]
            try:
                results[topic] = future.result()
            except Exception as e:
                results[topic] = f'Research failed: {e}'
    return results
```
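Fan-out multiplies concurrent API calls, which makes rate-limit errors more likely. The anthropic SDK retries some errors on its own, but explicit backoff around subagent calls is cheap insurance. A sketch (the helper name and backoff constants are my own):

```python
import random
import time

def with_retries(fn, *args, attempts=3, base_delay=1.0, **kwargs):
    """Call fn(*args, **kwargs), retrying with jittered exponential
    backoff; re-raises the last exception once attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise
            # Sleep base_delay * 2^attempt, scaled by random jitter.
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
```

Submit the wrapped call instead of the bare one: executor.submit(with_retries, run_researcher, t).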
Passing Context Between Agents
Agents have no shared state. Everything a subagent needs must be in its initial message or system prompt. If the orchestrator has gathered context over multiple turns, summarise it before passing to a subagent — do not pass the entire conversation history.
| What to pass | How | What NOT to do |
|---|---|---|
| Task inputs | Embed in the user message | Pass raw conversation history |
| Shared config (company name, tone guidelines) | System prompt | Repeat in every tool result |
| Results from previous subagents | Summarised string in user message | Pass the full raw output if it is thousands of tokens |
| Structured data (IDs, schemas) | JSON string in user message | Rely on Claude to remember across agent boundaries |
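One way to enforce "summarise before passing" is a hard budget on every handoff string. Token counts vary by model, so the sketch below (the helper name and the ~4-characters-per-token estimate are rough assumptions of mine, not API facts) trims on characters instead; real counting would use the provider's token-counting API:

```python
def compact_for_handoff(text: str, max_tokens: int = 1000) -> str:
    """Trim a context string to a rough token budget before handing it
    to a subagent, using ~4 characters per token as a crude estimate."""
    budget = max_tokens * 4
    if len(text) <= budget:
        return text
    # Keep the head and tail; the middle is usually the most droppable.
    head = text[: budget // 2]
    tail = text[-budget // 2:]
    return head + '\n...[context trimmed]...\n' + tail
```

For important handoffs, a better option is to have the orchestrator produce a genuine summary (another model call); the budget check then acts as a safety net rather than the summarisation itself.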
When NOT to Use Multiple Agents
Multi-agent systems add complexity, latency, and cost. Do not split into multiple agents just because a task has multiple steps. A single agent with a well-structured system prompt handles most tasks up to 10,000-15,000 tokens of context.
- Use one agent when steps are sequential and share state (each step's output is the next step's input)
- Use one agent when the total context fits comfortably within the model's window
- Use multiple agents when tasks are truly parallel and independent
- Use multiple agents when different subtasks need different system prompts, tools, or model tiers
- Use multiple agents when you want to isolate failures (one subagent failing should not abort the whole pipeline)