Building Your First Claude Agent: Tool Use, the Agent Loop, and Streaming

The Gap Nobody Talks About

Most Anthropic SDK tutorials show a single tool call. They don't show what happens when Claude calls three tools in sequence, one tool fails, or the agent needs to decide whether to keep looping. That is the agent loop, and a plain client.messages.create() call does not manage it for you.

This article covers how to build the agent loop yourself, handle tool results correctly, stream in production, and avoid the two most common mistakes that cause agents to either loop forever or stop one step too early.

How the Agent Loop Actually Works

Claude does not autonomously execute tools. It returns a response that may contain tool_use blocks. Your code must extract those, execute the tools, and feed the results back as tool_result blocks. Then you call the API again. This cycle continues until Claude returns a stop_reason of 'end_turn' with no tool calls.

The pattern looks like this:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_agent(messages, tools, system=None):
    while True:
        kwargs = {
            'model': 'claude-sonnet-4-6',
            'max_tokens': 4096,
            'tools': tools,
            'messages': messages,
        }
        if system:
            kwargs['system'] = system

        response = client.messages.create(**kwargs)

        # Append Claude's full response (text and tool_use blocks) to message history
        messages.append({'role': 'assistant', 'content': response.content})

        # Done: 'end_turn' is the normal finish. Other non-tool stop reasons
        # (for example 'max_tokens') also end the loop, because sending an
        # empty list of tool results back would be rejected.
        if response.stop_reason != 'tool_use':
            return response, messages

        # Extract and execute tool calls (execute_tool is defined further down)
        tool_results = []
        for block in response.content:
            if block.type == 'tool_use':
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    'type': 'tool_result',
                    'tool_use_id': block.id,
                    'content': str(result),
                })

        # Feed all results back in a single user message
        messages.append({'role': 'user', 'content': tool_results})

Always append the full response.content (not just the text) to message history. If you strip out tool_use blocks, Claude loses track of which tools it called and may hallucinate results.
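To make the pairing concrete, here is a sketch of what the history might look like after one tool round trip, written as plain dicts. The id 'toolu_example_01', the file path, and the helper unmatched_tool_calls are made up for illustration. In the loop above, response.content holds SDK block objects rather than dicts, so this helper only applies to histories stored in dict form.

```python
# Hypothetical history after one tool round trip (all values are illustrative).
history = [
    {'role': 'user', 'content': 'What does /tmp/notes.txt say?'},
    {
        'role': 'assistant',
        'content': [
            {'type': 'text', 'text': 'Let me read that file.'},
            {
                'type': 'tool_use',
                'id': 'toolu_example_01',
                'name': 'read_file',
                'input': {'path': '/tmp/notes.txt'},
            },
        ],
    },
    {
        'role': 'user',
        'content': [
            {
                'type': 'tool_result',
                'tool_use_id': 'toolu_example_01',  # must match the tool_use id
                'content': 'Buy milk.',
            },
        ],
    },
]


def unmatched_tool_calls(messages):
    """Return the ids of tool_use blocks that have no matching tool_result."""
    called, answered = set(), set()
    for message in messages:
        content = message['content']
        if isinstance(content, str):
            continue  # plain-text turns contain no blocks
        for block in content:
            if block['type'] == 'tool_use':
                called.add(block['id'])
            elif block['type'] == 'tool_result':
                answered.add(block['tool_use_id'])
    return called - answered
```

A check like this can be useful in tests: a history that ends on an assistant turn with a pending tool_use is exactly the state the loop has to resolve before calling the API again.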

Defining Tools

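Tools are passed as a list in the tools parameter. Each tool needs a name, a description, and an input_schema, which is a JSON Schema object describing the tool's arguments. The description is what Claude reads to decide when to call the tool, so write it for Claude, not for humans: say what the tool does, when to use it, and what it returns.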

tools = [
    {
        'name': 'search_web',
        'description': (
            'Search the web for current information. Use when the user asks about '
            'recent events, prices, or anything that may have changed since your '
            'training cutoff. Returns a list of snippets with source URLs.'
        ),
        'input_schema': {
            'type': 'object',
            'properties': {
                'query': {
                    'type': 'string',
                    'description': 'The search query',
                },
                'max_results': {
                    'type': 'integer',
                    'description': 'Maximum results to return (1-10)',
                    'default': 5,
                },
            },
            'required': ['query'],
        },
    },
    {
        'name': 'read_file',
        'description': 'Read the contents of a local file by path.',
        'input_schema': {
            'type': 'object',
            'properties': {
                'path': {'type': 'string', 'description': 'Absolute file path'},
            },
            'required': ['path'],
        },
    },
]
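Mistakes in tool definitions tend to surface only when the API rejects the request, so it may be worth checking them locally first. The helper below is a minimal sketch, not part of the SDK: check_tool_definitions is a hypothetical name, and the rules it applies (the three keys named above, an 'object' schema type, and required fields that exist in properties) are assumptions about what is worth catching early.

```python
def check_tool_definitions(tools):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    seen_names = set()
    for index, tool in enumerate(tools):
        label = tool.get('name', f'tool #{index}')

        # Every tool needs these three keys.
        for key in ('name', 'description', 'input_schema'):
            if key not in tool:
                problems.append(f'{label}: missing {key!r}')

        # Names identify tools in tool_use blocks, so they should be unique.
        name = tool.get('name')
        if name is not None:
            if name in seen_names:
                problems.append(f'{label}: duplicate name')
            seen_names.add(name)

        schema = tool.get('input_schema')
        if schema is not None:
            if schema.get('type') != 'object':
                problems.append(f"{label}: input_schema type should be 'object'")
            properties = schema.get('properties', {})
            for field in schema.get('required', []):
                if field not in properties:
                    problems.append(
                        f'{label}: required field {field!r} is not in properties'
                    )
    return problems
```

Calling check_tool_definitions(tools) once at startup and failing fast on a non-empty result keeps schema typos out of the agent loop.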

Handling Tool Errors Correctly

When a tool fails, do not let the exception escape the loop. Catch it and feed the error back as a tool_result. Claude will read the error message and decide what to do next (retry with different inputs, try another tool, or report the failure to the user).

def execute_tool(name: str, inputs: dict) -> str:
    try:
        if name == 'search_web':
            return search_web(inputs['query'], inputs.get('max_results', 5))
        elif name == 'read_file':
            with open(inputs['path']) as f:
                return f.read()
        else:
            return f'Error: Unknown tool {name}'
    except FileNotFoundError:
        return f'Error: File not found at {inputs.get("path", "")}'
    except Exception as e:
        return f'Error: {type(e).__name__}: {str(e)}'

Returning raw exceptions to Claude works, but be careful with exceptions that contain sensitive information (database connection strings, API keys in stack traces). Sanitise error messages before returning them as tool results.
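One way to sanitise is to redact anything that looks like a secret and cap the message length before it goes back to Claude. The sketch below is illustrative only: the two regular expressions and the names sanitise_error and error_tool_result are assumptions, and real deployments will need patterns matched to their own secrets. It also sets is_error on the tool_result block, which marks the result as a failure.

```python
import re

# Hypothetical patterns; extend them for the secrets your own tools can leak.
_SECRET_PATTERNS = [
    # Tokens that look like API keys, e.g. sk-...
    re.compile(r'sk-[A-Za-z0-9_-]{8,}'),
    # user:password@ credentials embedded in connection URLs
    re.compile(r'[a-z][a-z0-9+.-]*://[^\s/@]+:[^\s/@]+@'),
]


def sanitise_error(exc, max_length=500):
    """Format an exception for Claude with secret-like substrings redacted."""
    message = f'{type(exc).__name__}: {exc}'
    for pattern in _SECRET_PATTERNS:
        message = pattern.sub('[REDACTED]', message)
    # Truncate so a huge traceback-like message cannot flood the context.
    return message[:max_length]


def error_tool_result(tool_use_id, exc):
    """Build a tool_result block that reports a failure."""
    return {
        'type': 'tool_result',
        'tool_use_id': tool_use_id,
        'content': 'Error: ' + sanitise_error(exc),
        'is_error': True,
    }
```

Pattern-based redaction is a safety net rather than a guarantee, so it works best alongside tools that avoid putting secrets in exception messages in the first place.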

Streaming Agent Responses

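For production UIs you want to show Claude's text as it arrives, not wait for the full response. Use client.messages.stream() as a context manager: its text_stream iterator yields text deltas only, and get_final_message() returns the complete message, including any tool_use blocks, once the stream ends. Tool execution still happens synchronously between API calls.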

def run_agent_streaming(messages, tools, on_text=None):
    while True:
        with client.messages.stream(
            model='claude-sonnet-4-6',
            max_tokens=4096,
            tools=tools,
            messages=messages,
        ) as stream:
            # Stream text deltas as they arrive
            for text in stream.text_stream:
                if on_text:
                    on_text(text)

            # Get the final message (text and tool_use blocks) once streaming completes
            response = stream.get_final_message()

        messages.append({'role': 'assistant', 'content': response.content})

        # 'end_turn' is the normal finish; any other non-tool stop reason also ends the loop
        if response.stop_reason != 'tool_use':
            return response, messages

        # Execute tools (no streaming during tool execution)
        tool_results = []
        for block in response.content:
            if block.type == 'tool_use':
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    'type': 'tool_result',
                    'tool_use_id': block.id,
                    'content': str(result),
                })
        messages.append({'role': 'user', 'content': tool_results})
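The on_text parameter accepts any callable that takes a string. As a rough sketch, a small collector object can forward each chunk to the UI while also keeping the full text for logging. TextCollector and its sink argument are hypothetical names, not part of the SDK.

```python
class TextCollector:
    """Callable that records streamed chunks and optionally forwards them."""

    def __init__(self, sink=None):
        self.chunks = []
        self.sink = sink  # e.g. a function that writes to a websocket or stdout

    def __call__(self, text):
        self.chunks.append(text)
        if self.sink is not None:
            self.sink(text)

    @property
    def text(self):
        """Everything received so far, across all turns of the loop."""
        return ''.join(self.chunks)
```

It would be passed as on_text=TextCollector(sink=lambda t: print(t, end='', flush=True)). Because the same collector is reused across iterations of the loop, its text property accumulates output from every turn, not just the last one.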

Preventing Infinite Loops

Add a max_turns safety limit. Without it, a poorly scoped task or a tool that always returns errors can run indefinitely and burn through your API budget.

def run_agent(messages, tools, system=None, max_turns=20):
    for _ in range(max_turns):
        kwargs = {
            'model': 'claude-sonnet-4-6',
            'max_tokens': 4096,
            'tools': tools,
            'messages': messages,
        }
        # Only send a system prompt when one was provided
        if system:
            kwargs['system'] = system
        response = client.messages.create(**kwargs)
        messages.append({'role': 'assistant', 'content': response.content})
        # 'end_turn' is the normal finish; any other non-tool stop reason also ends the loop
        if response.stop_reason != 'tool_use':
            return response, messages
        tool_results = []
        for block in response.content:
            if block.type == 'tool_use':
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    'type': 'tool_result',
                    'tool_use_id': block.id,
                    'content': str(result),
                })
        messages.append({'role': 'user', 'content': tool_results})
    # The loop ran max_turns times without Claude finishing
    raise RuntimeError(f'Agent exceeded {max_turns} turns without finishing')
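A bare RuntimeError tells the caller that the limit was hit but not what the agent was doing at the time. One possible refinement, sketched here under assumed names, is an exception subclass that carries the message history. MaxTurnsExceeded and run_bounded are hypothetical, and step stands in for one API call plus tool execution so the turn-limit logic can be exercised without network access.

```python
class MaxTurnsExceeded(RuntimeError):
    """Raised when the loop hits its turn limit; keeps the history for inspection."""

    def __init__(self, max_turns, messages):
        super().__init__(f'Agent exceeded {max_turns} turns without finishing')
        self.max_turns = max_turns
        self.messages = messages


def run_bounded(step, messages, max_turns=20):
    """Call step(messages) until it returns a non-None result or the limit is hit.

    `step` is a stand-in for one iteration of the agent loop: it may append
    to `messages` and returns None while the agent still has work to do.
    """
    for _ in range(max_turns):
        result = step(messages)
        if result is not None:
            return result
    raise MaxTurnsExceeded(max_turns, messages)
```

Because MaxTurnsExceeded subclasses RuntimeError, existing except RuntimeError handlers keep working, while new code can catch the specific type and log exc.messages to see where the agent got stuck.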

Common Mistakes

Mistake	What Happens	Fix
Appending only text to history	Claude thinks it never called any tools, may re-call them or hallucinate results	Append response.content (the full list), not just the text blocks
Not handling tool_use when stop_reason is 'tool_use'	Agent stops after first tool call and never finishes the task	Loop until stop_reason is 'end_turn' AND content has no tool_use blocks
Raising exceptions from tools	Unhandled exception crashes the agent mid-task	Catch all exceptions and return error strings as tool_result content
No max_turns limit	Runaway agent in edge cases	Always set max_turns; 10-25 is reasonable for most tasks