How to Build a Multi-Agent Workflow Orchestrator with Claude API (Step-by-Step)


If you've been searching for a tutorial on building AI agents that goes beyond a toy demo and gives you something you could actually deploy, you're in the right place. Most tutorials stop at a single prompt-response loop. What we're building here is a proper multi-agent system where a supervisor orchestrates specialized sub-agents, delegates tasks, and synthesizes results. This is the foundation of every serious agentic workflow we build at Naples AI for clients in real estate, healthcare, and manufacturing.

What You'll Build

You'll build a multi-agent workflow orchestrator using the Anthropic Python SDK and the claude-sonnet-4-6 model. The system includes an Orchestrator agent that breaks down complex tasks and delegates subtasks to three specialized agents: a Researcher, an Analyst, and a Writer.

When you give it a goal like "Analyze the Naples real estate market and write a short investment brief," it routes work automatically, collects results, and hands you a finished output. By the end you'll have a working, extensible pattern you can adapt to your own use cases.

Prerequisites

Python 3.10 or higher installed
An Anthropic API key (get one at console.anthropic.com)
anthropic Python SDK installed (pip install anthropic)
Basic familiarity with Python classes and functions
A terminal and a code editor (VS Code works great)

📦 Full Source Code
The complete, working code is built step by step in the sections below. Each snippet builds on the last, so by Step 5 you'll have the entire runnable system: tools.py, agents.py, orchestrator.py, and main.py. Put your API key in the .env file (not in the source) and run main.py.

Step 1: Set Up Your Claude API Environment and Dependencies

First, let's get the environment wired up. Create a new project folder, drop in a virtual environment, and install the SDK.

terminal

mkdir multi-agent-orchestrator
cd multi-agent-orchestrator
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install anthropic python-dotenv

Now create a .env file in the project root to hold your API key. Never hardcode credentials in source files.

.env

ANTHROPIC_API_KEY=sk-ant-your-key-here

Now let's verify the connection works before we build anything else.

verify_setup.py

import os
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=64,
    messages=[{"role": "user", "content": "Say: setup confirmed"}]
)

print(response.content[0].text)

Run it with python verify_setup.py and you should see a reply along the lines of "setup confirmed" printed back (the exact capitalization and punctuation can vary). If you get an authentication error, double-check the API key in .env.
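A missing or empty key is a common reason this check fails. As a minimal sketch, a pre-flight check can fail fast with a readable message before any request is sent. The helper name require_api_key is hypothetical and not part of the Anthropic SDK:

```python
import os


def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the Anthropic SDK.
    """
    # Allow a dict to be passed in for testing; default to the real environment
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return key
```

With this in place, the client could be created as Anthropic(api_key=require_api_key()) after load_dotenv() has run.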

Step 2: Define Tool Schemas for Multi-Agent Communication

This is where multi-agent systems get interesting. Each sub-agent is exposed to the Orchestrator as a tool — a structured function that Claude can call by name. The Orchestrator doesn't know how the agents work internally; it just knows what inputs to send and what it'll get back.

Think of it like a manager handing off tasks to staff without micromanaging. Here are the three tool definitions — one per specialized agent.

tools.py

import os
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

# The client is shared across all agents
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Tool schemas tell the Orchestrator what agents exist and how to call them
AGENT_TOOLS = [
    {
        "name": "researcher_agent",
        "description": (
            "Gathers and summarizes factual information on a topic. "
            "Use this when you need background research, market data, "
            "or factual context before analysis or writing."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "task": {
                    "type": "string",
                    "description": "The specific research question or topic to investigate."
                }
            },
            "required": ["task"]
        }
    },
    {
        "name": "analyst_agent",
        "description": (
            "Analyzes data, findings, or text and produces structured insights, "
            "comparisons, or recommendations. Use after research is complete."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "task": {
                    "type": "string",
                    "description": "The analysis task, including any data or context to analyze."
                }
            },
            "required": ["task"]
        }
    },
    {
        "name": "writer_agent",
        "description": (
            "Writes polished, audience-appropriate content such as reports, briefs, "
            "summaries, or email copy. Use after research and analysis are done."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "task": {
                    "type": "string",
                    "description": "Writing task with full context, tone, and audience details."
                }
            },
            "required": ["task"]
        }
    }
]

💡 Why tool schemas matter: The description field is what Claude reads to decide which agent to call and when. Write it like you're explaining the role to a new hire. Vague descriptions lead to wrong delegation decisions.
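Because so much depends on these dicts, it can help to lint them before sending them to the API. The sketch below is one possible check, not an official validator: the function name validate_tool_schemas and the 40-character description threshold are assumptions chosen for illustration.

```python
REQUIRED_KEYS = ("name", "description", "input_schema")
MIN_DESCRIPTION_LENGTH = 40  # Arbitrary threshold, an assumption for this sketch


def validate_tool_schemas(tools):
    """Return a list of human-readable problems found in a list of tool dicts."""
    problems = []
    seen_names = set()
    for index, tool in enumerate(tools):
        name = tool.get("name")
        label = name or f"tool #{index}"

        # Every tool needs the three top-level keys used in tools.py
        for key in REQUIRED_KEYS:
            if key not in tool:
                problems.append(f"{label}: missing '{key}'")

        # Duplicate names would make dispatch by name ambiguous
        if name is not None:
            if name in seen_names:
                problems.append(f"{label}: duplicate name")
            seen_names.add(name)

        # Very short descriptions give the model little to base a decision on
        if len(tool.get("description", "")) < MIN_DESCRIPTION_LENGTH:
            problems.append(f"{label}: description is too short to guide delegation")

        # Each required input must actually be declared under properties
        schema = tool.get("input_schema", {})
        properties = schema.get("properties", {})
        for field in schema.get("required", []):
            if field not in properties:
                problems.append(f"{label}: required field '{field}' is not in properties")
    return problems
```

Calling validate_tool_schemas(AGENT_TOOLS) at import time and raising if the list is non-empty would catch typos before the first API call.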

Step 3: Create Individual Agent Classes with Specialized Roles

Each sub-agent is a Python class with its own system prompt that defines its personality and constraints. The Researcher focuses only on facts. The Analyst only interprets. The Writer only writes. Keeping roles clean is what makes the outputs actually useful.

agents.py

# Reuse the shared client (and the already-loaded .env) from tools.py
from tools import client

MODEL = "claude-sonnet-4-6"


class ResearcherAgent:
    """Gathers factual context on any topic."""

    SYSTEM_PROMPT = (
        "You are a research specialist. Your only job is to gather and summarize "
        "factual information clearly and concisely. Do not analyze or editorialize. "
        "Return structured findings with clear section headers."
    )

    def run(self, task: str) -> str:
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            system=self.SYSTEM_PROMPT,
            messages=[{"role": "user", "content": task}]
        )
        return response.content[0].text


class AnalystAgent:
    """Interprets research and produces insights or recommendations."""

    SYSTEM_PROMPT = (
        "You are a senior analyst. You receive research findings or raw data and "
        "produce clear, structured analysis: key trends, risks, opportunities, and "
        "a concise recommendation. Be direct and specific — no filler."
    )

    def run(self, task: str) -> str:
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            system=self.SYSTEM_PROMPT,
            messages=[{"role": "user", "content": task}]
        )
        return response.content[0].text


class WriterAgent:
    """Produces polished final content from analysis and context."""

    SYSTEM_PROMPT = (
        "You are a professional business writer. You turn research summaries and "
        "analysis into clean, polished written content. Match the tone and format "
        "specified in the task. No corporate jargon — write like a smart human."
    )

    def run(self, task: str) -> str:
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            system=self.SYSTEM_PROMPT,
            messages=[{"role": "user", "content": task}]
        )
        return response.content[0].text


# Registry maps tool names to agent instances — used by the Orchestrator
AGENT_REGISTRY = {
    "researcher_agent": ResearcherAgent(),
    "analyst_agent":    AnalystAgent(),
    "writer_agent":     WriterAgent(),
}

The AGENT_REGISTRY dict at the bottom is the glue. When the Orchestrator calls a tool by name, we look it up here and run it. Simple and easy to extend — just add a new class and register it.
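The three classes differ only in their system prompt, so one way to make "add a new class and register it" cheaper is a small base class. The sketch below is an assumption-laden variant, not the tutorial's code: BaseAgent, FactCheckerAgent, and the injected complete function are all hypothetical names, and a fake completion function replaces the real client.messages.create call so the example runs without an API key.

```python
from typing import Callable

# A "complete" function takes (system_prompt, task) and returns the model's text.
# In the tutorial this would wrap client.messages.create; here it is injected.
CompleteFn = Callable[[str, str], str]


class BaseAgent:
    """Hypothetical base class: subclasses only supply SYSTEM_PROMPT."""

    SYSTEM_PROMPT = ""

    def __init__(self, complete: CompleteFn):
        self._complete = complete

    def run(self, task: str) -> str:
        return self._complete(self.SYSTEM_PROMPT, task)


class FactCheckerAgent(BaseAgent):
    """Example of a fourth agent added with a single class attribute."""

    SYSTEM_PROMPT = "You verify claims and flag anything unsupported."


calls = []


def fake_complete(system_prompt: str, task: str) -> str:
    # Stand-in for a real model call; records its inputs and echoes the task
    calls.append((system_prompt, task))
    return task.upper()


# Registering the new agent is one more entry keyed by its tool name
registry = {"fact_checker_agent": FactCheckerAgent(fake_complete)}
print(registry["fact_checker_agent"].run("check this"))
```

A new agent also needs a matching entry in AGENT_TOOLS with the same name, otherwise the Orchestrator never learns that it exists.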

Step 4: Build the Orchestrator Agent that Delegates Tasks

The Orchestrator is the brains of the whole system. It receives the high-level goal, decides which agents to call and in what order, and assembles the final result. It never does the actual research, analysis, or writing itself — it delegates everything.

orchestrator.py

import json

# tools.py loads .env and creates the shared client on import
from tools import AGENT_TOOLS, client
from agents import AGENT_REGISTRY

MODEL = "claude-sonnet-4-6"

ORCHESTRATOR_SYSTEM = """
You are a workflow orchestrator. You receive complex goals and break them into
subtasks that you delegate to specialized agents using the tools available to you.

Follow this general sequence unless the task clearly requires otherwise:
1. Use researcher_agent to gather factual background.
2. Use analyst_agent to interpret the research findings.
3. Use writer_agent to produce the final output.

Always pass full context to each agent — they have no memory of prior steps.
When all subtasks are complete, synthesize and return the final result to the user.
"""


class OrchestratorAgent:
    """
    Routes tasks to specialized sub-agents using Claude's tool use feature.
    Maintains a turn-by-turn conversation history so Claude can track progress.
    """

    def __init__(self):
        self.conversation: list[dict] = []

    def run(self, goal: str) -> str:
        print(f"\n🎯 Goal received: {goal}\n")

        # Seed the conversation with the user's goal
        self.conversation.append({"role": "user", "content": goal})

        # Delegation loop — runs until Claude stops calling tools
        turn = 0
        max_turns = 10  # Safety cap to prevent runaway loops

        while turn < max_turns:
            turn += 1
            print(f"--- Turn {turn} ---")

            response = client.messages.create(
                model=MODEL,
                max_tokens=4096,
                system=ORCHESTRATOR_SYSTEM,
                tools=AGENT_TOOLS,
                messages=self.conversation
            )

            print(f"Stop reason: {response.stop_reason}")

            # If Claude is done delegating, extract and return the final text
            if response.stop_reason == "end_turn":
                final_text = self._extract_text(response)
                print(f"\n✅ Orchestrator complete after {turn} turn(s).\n")
                return final_text

            # If Claude wants to call one or more tools, handle them
            if response.stop_reason == "tool_use":
                # Append Claude's full response (including tool_use blocks) to history
                self.conversation.append({
                    "role": "assistant",
                    "content": response.content
                })

                # Collect results for all tool calls in this turn
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        result = self._dispatch_tool(block.name, block.input, block.id)
                        tool_results.append(result)

                # Return all tool results to Claude in a single user message
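                self.conversation.append({
                    "role": "user",
                    "content": tool_results
                })
                continue

            # Any other stop reason (for example "max_tokens") means the reply
            # was cut off or halted; stop instead of resending the same conversation
            return f"Orchestration stopped: unexpected stop_reason '{response.stop_reason}'."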

        return "Orchestration stopped: max turns reached without a final answer."

    def _dispatch_tool(self, tool_name: str, tool_input: dict, tool_use_id: str) -> dict:
        """Looks up the agent by tool name and runs it with the provided input."""
        print(f"  🔧 Calling tool: {tool_name}")
        print(f"     Input: {json.dumps(tool_input, indent=2)[:200]}")

        agent = AGENT_REGISTRY.get(tool_name)
        if agent is None:
            output = f"Error: No agent registered for tool '{tool_name}'"
        else:
            output = agent.run(tool_input["task"])

        print(f"     Output preview: {output[:150]}...\n")

        # tool_result must reference the tool_use_id from Claude's request
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": output
        }

    def _extract_text(self, response) -> str:
        """Pulls plain text out of Claude's final response content blocks."""
        parts = []
        for block in response.content:
            if hasattr(block, "text"):
                parts.append(block.text)
        return "\n".join(parts).strip()

⚠️ Critical detail: When Claude returns stop_reason: "tool_use", you must append its full response to the conversation before adding tool results. If you skip this step, the API rejects the request because your tool_result blocks have no matching tool_use block in the preceding assistant message. The order is: Claude's response → your tool execution → tool results back to Claude.
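To make that ordering concrete, here is a sketch of what the conversation list looks like after one delegation round trip. Plain dicts stand in for the SDK's content block objects, and the id "toolu_example" and the message texts are placeholders rather than real API output.

```python
# Shape of the conversation after one delegation round trip.
# "toolu_example" is a placeholder; real ids come from Claude's response.
conversation = [
    # 1. The user's goal seeds the conversation
    {"role": "user", "content": "Write a short market brief."},
    # 2. Claude's full response, including its tool_use block
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_example",
                "name": "researcher_agent",
                "input": {"task": "Summarize the market."},
            }
        ],
    },
    # 3. The tool result, sent back as a user message that echoes the same id
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_example",
                "content": "Research summary text...",
            }
        ],
    },
]

roles = [message["role"] for message in conversation]
print(roles)  # ['user', 'assistant', 'user']
```

Dropping the second entry would leave a tool_result pointing at an id the API has never seen in the history, which is the failure described above.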

Step 5: Implement the Execution Loop and Error Handling

Now we wire everything together with a main runner, add error handling around the API calls, and show the final output. This is the file you actually run.

main.py

import os
import sys
from anthropic import APIStatusError, APIConnectionError, RateLimitError
from dotenv import load_dotenv
from orchestrator import OrchestratorAgent

load_dotenv()


def run_workflow(goal: str) -> None:
    """Entry point for the multi-agent workflow."""
    orchestrator = OrchestratorAgent()

    try:
        result = orchestrator.run(goal)
        print("=" * 60)
        print("FINAL OUTPUT")
        print("=" * 60)
        print(result)

    except RateLimitError:
        print("Rate limit hit. Wait a moment and retry.")
        sys.exit(1)

    except APIConnectionError as e:
        print(f"Connection error: {e}. Check your network and API endpoint.")
        sys.exit(1)

    except APIStatusError as e:
        # Catches any other non-2xx response (for example 400, 401, or 5xx);
        # 429 rate limits are already handled by the RateLimitError branch above
        print(f"API error {e.status_code}: {e.message}")
        sys.exit(1)


if __name__ == "__main__":
    goal = (
        "Research the current Naples, Florida real estate market, "
        "analyze whether it's a good time for investors to buy, "
        "and write a concise 200-word investment brief for a potential buyer."
    )
    run_workflow(goal)

Run it with python main.py. You'll see the turn-by-turn delegation printed to the console as it happens, followed by the finished brief.
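The RateLimitError handler above simply exits. If you would rather wait and try again, a retry with exponential backoff is one common alternative. The sketch below is generic and does not touch the SDK: with_retries and TransientError are hypothetical names, and the sleep function is injectable so the logic can be exercised without real delays.

```python
import time


class TransientError(Exception):
    """Stand-in for a retryable failure such as a rate limit (hypothetical)."""


def with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep,
                 retry_on=(TransientError,)):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

To apply it here, you could wrap a single client.messages.create call and pass retry_on=(RateLimitError,). Wrapping orchestrator.run as a whole is less suitable, because each retry would append the goal to the conversation history again. The SDK client may also retry some failures on its own, so check its documentation before layering more retries on top.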

Here's what the output looks like when the system runs successfully:

example output (terminal)

🎯 Goal received: Research the current Naples, Florida real estate market,
analyze whether it's a good time for investors to buy, and write a concise
200-word investment brief for a potential buyer.

--- Turn 1 ---
Stop reason: tool_use
  🔧 Calling tool: researcher_agent
     Input: {
  "task": "Research the current Naples, Florida real estate market including
  median home prices, inventory levels, days on market, and recent trends."
}
     Output preview: ## Naples, FL Real Estate Market — Research Summary

**Median Home Price:** ~$725,000 (Q1 2026)
**Active Listings:** Up 18% YoY, signaling improved inventory...

--- Turn 2 ---
Stop reason: tool_use
  🔧 Calling tool: analyst_agent
     Input: {
  "task": "Analyze these Naples FL market findings and assess whether conditions
  favor real estate investors in 2026: [research findings from turn 1]"
}
     Output preview: ## Investment Analysis — Naples FL (2026)

**Key Trends:** Inventory normalization after 2021-2023 seller's market peak.
Luxury segment ($1M+) showing 22% more days on market...

--- Turn 3 ---
Stop reason: tool_use
  🔧 Calling tool: writer_agent
     Input: {
  "task": "Write a 200-word investment brief for a potential real estate buyer
  in Naples, FL using this research and analysis: [combined context]"
}
     Output preview: **Naples, FL Real Estate Investment Brief — May 2026**

If you've been waiting for a better entry point into the Naples market, 2026
may be your window...

--- Turn 4 ---
Stop reason: end_turn

✅ Orchestrator complete after 4 turn(s).

============================================================
FINAL OUTPUT
============================================================
**Naples, FL Real Estate Investment Brief — May 2026**

If you've been waiting for a better entry point into the Naples market, 2026
may be your window. After the frenzied seller's market of 2021–2023, inventory
has climbed 18% year-over-year, giving buyers more leverage than they've had
in years. Median prices have stabilized near $725,000, with luxury properties
above $1M sitting longer — an opening for negotiation-savvy investors.

The fundamentals still favor long-term ownership. Naples continues to attract
high-net-worth relocations from the Northeast and Midwest, demand for rental
properties in the $3,000–$5,000/month range remains strong, and the area's
limited buildable land keeps supply naturally constrained.

Short-term risk: if interest rates climb above 7.5%, buyer demand could soften
further. Long-term outlook: positive. For investors with a 5-to-10-year
horizon and cash reserves to weather short dips, Naples remains one of
Florida's most resilient markets.

Recommended action: Target mid-range single-family homes ($600K–$900K) in
established neighborhoods like Pelican Bay or East Naples, where rental
yield and appreciation potential are both strong.

How It Works

The Orchestrator sends the goal to Claude with a list of tools — one per sub-agent. Claude reads the tool descriptions and decides which one to call first, writing its delegation decision as a tool_use block in the response. We execute that tool (i.e., run the sub-agent), collect the result, and feed it back to Claude.

Claude then decides what to do next: call another tool, or wrap up with a final answer. This loop repeats until stop_reason is end_turn or the max_turns safety cap is reached. The entire conversation history travels with every API call, so Claude always knows what's already been done.

The key insight is that Claude isn't "aware" that it's talking to other AI agents — it just sees tool results coming back. This means you could replace any sub-agent with a database query, a web scraper, or a third-party API, and the Orchestrator wouldn't need to change at all.
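As a sketch of that swap, the class below has no model behind it at all, yet it satisfies the same run(task) contract that _dispatch_tool relies on. ListingLookupAgent and its lookup table are hypothetical, and the numbers are made-up placeholders rather than real market data.

```python
class ListingLookupAgent:
    """Hypothetical non-LLM 'agent': answers from a local table, not a model."""

    # Made-up placeholder values, for illustration only
    LISTINGS = {"north side": 12, "old town": 30}

    def run(self, task: str) -> str:
        task_lower = task.lower()
        # Report every known area that is mentioned in the task text
        hits = [
            f"{area.title()}: {count} active listings"
            for area, count in self.LISTINGS.items()
            if area in task_lower
        ]
        return "\n".join(hits) or "No matching areas found."


# Anything with a run(task) -> str method can sit behind a tool name
registry = {"researcher_agent": ListingLookupAgent()}
print(registry["researcher_agent"].run("How many listings are in Old Town?"))
# Old Town: 30 active listings
```

The tool schema and the Orchestrator loop stay exactly as they are; only the registry entry changes.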

Common Errors and Fixes

Error 1: anthropic.BadRequestError: messages: roles must alternate between "user" and "assistant"

Why it happens: You added two "user" or two "assistant" messages in a row to the conversation history.
Fix: After appending Claude's assistant response to self.conversation, always follow it immediately with a user message containing the tool results. Never batch two assistant turns together.

Error 2: anthropic.BadRequestError: tool_use id 'toolu_...' does not match any tool result

Why it happens: You returned tool results without correctly referencing the tool_use_id from Claude's request, or you skipped adding Claude's response to the conversation before sending results.
Fix: In _dispatch_tool, make sure the "tool_use_id" in your result dict exactly matches block.id from Claude's tool_use block. The IDs look like toolu_01XYZ... — copy them exactly, don't generate your own.

Error 3: KeyError: 'ANTHROPIC_API_KEY' or AuthenticationError