How to Build AI Agents with Claude API (Step-by-Step)

What You'll Build

If you've been searching for a practical tutorial on how to build AI agents, you're in the right place. By the end of this guide, you'll have a working AI agent in Python that can reason through problems, call tools, and loop until it has an answer, rather than a chatbot that only echoes responses.

We're building a research agent that can search for information, do math calculations, and check the weather — all driven by Claude's tool use API. It's the same core pattern we use at Naples AI when building custom agents for real estate, healthcare, and local business clients.

📦 Full Source Code Note
The complete, working code is broken into steps below so you understand exactly what each piece does. Copy each step in order and you'll have a running agent by the end. The search and weather tools return mock data, but there is no pseudocode: everything here executes.

Prerequisites

Python 3.9 or higher installed
An Anthropic API key (get one at console.anthropic.com)
Basic familiarity with Python classes and functions
pip installed and working
A terminal / command prompt you're comfortable using

Step 1: Set Up Your Python Environment and Anthropic SDK

First, create a fresh project folder and a virtual environment. This keeps your dependencies clean and avoids conflicts with other projects on your machine.

Run these commands in your terminal:

terminal

mkdir claude-agent
cd claude-agent
python -m venv venv

# Activate on Mac/Linux:
source venv/bin/activate

# Activate on Windows:
venv\Scripts\activate

pip install anthropic python-dotenv

Now create a .env file in your project root to store your API key safely. Never hardcode keys directly in your Python files.

.env

ANTHROPIC_API_KEY=sk-ant-your-key-here

That's all the setup you need. The anthropic package gives us the SDK, and python-dotenv loads our key from the file automatically.

Step 2: Create Your First AI Agent Class

The agent class is the backbone of everything. It holds the Anthropic client, tracks the conversation history, and knows which tools it has access to. Here's the starting structure:

agent.py

import os
import json
import anthropic
from dotenv import load_dotenv

load_dotenv()

class ClaudeAgent:
    def __init__(self):
        # Initialize the Anthropic client using the API key from .env
        self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-6"
        self.conversation_history = []
        self.tools = self._define_tools()
        self.max_iterations = 10  # Safety cap to prevent infinite loops

    def _define_tools(self):
        """Returns the list of tools Claude can call during a run."""
        return [
            {
                "name": "calculate",
                "description": "Performs basic arithmetic. Use this for any math operations.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "A math expression to evaluate, e.g. '(42 * 3) + 10'"
                        }
                    },
                    "required": ["expression"]
                }
            },
            {
                "name": "search_web",
                "description": "Simulates a web search and returns relevant information on a topic.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query to look up"
                        }
                    },
                    "required": ["query"]
                }
            },
            {
                "name": "get_weather",
                "description": "Returns the current weather for a given city.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city name to get weather for, e.g. 'Naples, FL'"
                        }
                    },
                    "required": ["city"]
                }
            }
        ]

The _define_tools method returns JSON schema definitions that tell Claude exactly what each tool does and what arguments it expects. Claude reads these descriptions and decides on its own when to call them — you don't have to tell it when to use a tool.
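
Because Claude only sees what these dictionaries say, a typo in a schema can be hard to spot. The sketch below is one way to sanity-check the list before sending it. The helper name validate_tool_definitions is hypothetical and not part of the Anthropic SDK, and it only checks the fields used in this tutorial.

```python
def validate_tool_definitions(tools):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    seen_names = set()
    for index, tool in enumerate(tools):
        label = tool.get("name", f"tool #{index}")
        # Every tool in this tutorial carries these three top-level keys.
        for key in ("name", "description", "input_schema"):
            if key not in tool:
                problems.append(f"{label}: missing '{key}'")
        if tool.get("name") in seen_names:
            problems.append(f"{label}: duplicate tool name")
        seen_names.add(tool.get("name"))
        schema = tool.get("input_schema", {})
        if schema.get("type") != "object":
            problems.append(f"{label}: input_schema 'type' should be 'object'")
        properties = schema.get("properties", {})
        # Each required argument must also be described under 'properties'.
        for required_name in schema.get("required", []):
            if required_name not in properties:
                problems.append(
                    f"{label}: required argument '{required_name}' is not in properties"
                )
    return problems
```

Calling validate_tool_definitions(agent.tools) once at startup and printing any problems is usually enough to catch a misspelled argument name.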

💡 Tip: Write good tool descriptions
Claude chooses which tool to call largely from the name and description fields, along with the input schema. Be specific about what the tool does and when to use it. Vague descriptions lead to wrong tool calls.

Step 3: Define Tool Functions for Agent Actions

Now we need the actual Python functions that run when Claude calls a tool. Add these methods inside the same ClaudeAgent class, right after _define_tools:

agent.py (continued — add these methods to ClaudeAgent)

    def _execute_tool(self, tool_name: str, tool_input: dict) -> str:
        """Routes a tool call to the correct function and returns the result as a string."""
        if tool_name == "calculate":
            return self._calculate(tool_input["expression"])
        elif tool_name == "search_web":
            return self._search_web(tool_input["query"])
        elif tool_name == "get_weather":
            return self._get_weather(tool_input["city"])
        else:
            return f"Error: Unknown tool '{tool_name}'"

    def _calculate(self, expression: str) -> str:
        """Evaluates a math expression using Python's eval with builtins removed."""
        try:
            # Removing builtins blocks names like __import__ and open, but it is
            # not a real sandbox. Do not expose this to untrusted input as-is.
            allowed_names = {"__builtins__": {}}
            result = eval(expression, allowed_names)
            return f"Result: {result}"
        except Exception as e:
            return f"Calculation error: {e}"

    def _search_web(self, query: str) -> str:
        """Simulates a web search. In production, replace this with a real search API."""
        mock_results = {
            "naples florida": "Naples, FL is a city on the Gulf Coast of Southwest Florida. Population ~22,000. Known for white sand beaches, upscale dining, and a thriving real estate market.",
            "anthropic claude": "Anthropic is an AI safety company founded in 2021. Claude is their AI assistant, known for being helpful, harmless, and honest. Claude supports tool use and multi-agent workflows.",
            "ai agents": "AI agents are systems that perceive their environment, make decisions, and take actions to achieve goals. They often use tool calling to interact with external systems and APIs.",
        }
        query_lower = query.lower()
        for key, result in mock_results.items():
            if any(word in query_lower for word in key.split()):
                return result
        return f"Search results for '{query}': Found general information. AI agents are increasingly used in business automation, customer service, and data analysis workflows."

    def _get_weather(self, city: str) -> str:
        """Returns mock weather data. Replace with a real weather API like OpenWeatherMap in production."""
        weather_data = {
            "naples": "Naples, FL: 84°F, Sunny, Humidity 72%, Wind 8 mph SE. Perfect beach weather.",
            "miami": "Miami, FL: 88°F, Partly Cloudy, Humidity 78%, Wind 12 mph E.",
            "new york": "New York, NY: 67°F, Overcast, Humidity 55%, Wind 15 mph NW.",
        }
        city_lower = city.lower()
        for key, data in weather_data.items():
            if key in city_lower:
                return data
        return f"Weather for {city}: 75°F, Clear skies, Humidity 60%, Wind 10 mph."

These are mock functions for the tutorial, but swapping them out for real APIs is straightforward. The interface stays the same — just replace the function body with a real HTTP call to a weather service or search API.
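
The calculate tool is the one function here that runs model-supplied input through eval. As one possible replacement body, the sketch below parses the expression with Python's ast module and evaluates only numeric literals and basic arithmetic operators. The names safe_calculate and _evaluate_node are hypothetical, and the exponent cap of 100 is an arbitrary assumption to keep results small.

```python
import ast
import operator

# Only these operators are evaluated; every other node type is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate_node(node):
    """Recursively evaluate a parsed arithmetic expression."""
    # Accept plain int and float literals only (this excludes True/False and strings).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _evaluate_node(node.left)
        right = _evaluate_node(node.right)
        # Assumed limit: refuse very large exponents.
        if isinstance(node.op, ast.Pow) and abs(right) > 100:
            raise ValueError("exponent too large")
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate_node(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str) -> str:
    """Return the result in the same string format the tutorial's tool uses."""
    try:
        tree = ast.parse(expression, mode="eval")
        return f"Result: {_evaluate_node(tree.body)}"
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as error:
        return f"Calculation error: {error}"
```

Function calls, attribute access, and names are never evaluated, so an input like __import__('os') comes back as a calculation error instead of running.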

Step 4: Implement the Agent Loop with Tool Use

This is where everything comes together. The agent loop sends a message to Claude, checks if Claude wants to use a tool, runs that tool, feeds the result back, and keeps going until Claude gives a final answer.

Add the run method to your ClaudeAgent class:

agent.py (continued — add this method to ClaudeAgent)

    def run(self, user_message: str) -> str:
        """
        Main agent loop. Sends user_message to Claude, handles tool calls,
        and returns Claude's final text response.
        """
        print(f"\n🧠 Agent received: {user_message}")

        # Add the user's message to the running conversation history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })

        iteration = 0

        while iteration < self.max_iterations:
            iteration += 1
            print(f"   [Iteration {iteration}] Calling Claude...")

            # Send the full conversation history plus tool definitions to Claude
            response = self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                tools=self.tools,
                messages=self.conversation_history
            )

            print(f"   [Iteration {iteration}] Stop reason: {response.stop_reason}")

            # If Claude is done reasoning and has a final answer, return it
            if response.stop_reason == "end_turn":
                final_text = ""
                for block in response.content:
                    if hasattr(block, "text"):
                        final_text += block.text
                # Add Claude's final response to history for future turns
                self.conversation_history.append({
                    "role": "assistant",
                    "content": response.content
                })
                return final_text

            # If Claude wants to use tools, handle all tool calls in this turn
            if response.stop_reason == "tool_use":
                # Add Claude's response (which includes the tool_use blocks) to history
                self.conversation_history.append({
                    "role": "assistant",
                    "content": response.content
                })

                # Collect results for all tool calls Claude made in this response
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        print(f"   🔧 Tool call: {block.name}({json.dumps(block.input)})")
                        result = self._execute_tool(block.name, block.input)
                        print(f"   ✅ Tool result: {result}")

                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,  # Must match the id Claude assigned
                            "content": result
                        })

                # Feed all tool results back to Claude as a single user message
                self.conversation_history.append({
                    "role": "user",
                    "content": tool_results
                })

            else:
                # Unexpected stop reason (for example "max_tokens"): stop and report it
                print(f"   ⚠️ Unexpected stop reason: {response.stop_reason}")
                return f"Agent stopped early (stop reason: {response.stop_reason})."

        return "Agent reached maximum iterations without completing the task."

The key thing to understand here is the tool_use_id field. Claude assigns a unique ID to every tool call, and you must echo that same ID back when you return the result. That's how Claude knows which result matches which call when it makes multiple tool calls at once.
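
To make the id matching concrete, here is a small sketch that uses plain dictionaries in place of the SDK's content block objects. The helper build_tool_result_message, the example ids, and example_executor are all made up for illustration; real ids are assigned by Claude.

```python
def build_tool_result_message(tool_use_blocks, execute_tool):
    """Build the single user message that answers every tool_use block."""
    results = []
    for block in tool_use_blocks:
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # echo the id from the matching call
            "content": execute_tool(block["name"], block["input"]),
        })
    return {"role": "user", "content": results}


# Two calls requested in one assistant response. The ids are made up.
example_blocks = [
    {"type": "tool_use", "id": "toolu_example_1", "name": "get_weather",
     "input": {"city": "Naples, FL"}},
    {"type": "tool_use", "id": "toolu_example_2", "name": "calculate",
     "input": {"expression": "47 * 0.7"}},
]


def example_executor(name, tool_input):
    """Stand-in for _execute_tool that only describes the call."""
    return f"{name} called with {sorted(tool_input)}"


example_message = build_tool_result_message(example_blocks, example_executor)
```

Each entry in example_message["content"] carries the id of the call it answers, which is what lets Claude pair results with requests.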

⚠️ Important: Always return tool results before the next API call
When stop_reason is "tool_use", Claude is waiting for results before it can continue. Append a user message containing a tool_result block for every tool_use block before you call the API again. Skipping this makes the API reject the request with a 400 invalid_request_error.

Step 5: Test Your Agent with Real Queries

Now let's wire everything up with a main block and actually run it. Create a separate file to keep things clean:

main.py

from agent import ClaudeAgent

def main():
    agent = ClaudeAgent()

    # Test 1: A question that requires the search tool
    print("=" * 60)
    result = agent.run("What can you tell me about Naples, Florida?")
    print(f"\n📋 Final Answer:\n{result}")

    # Reset history between separate tasks
    agent.conversation_history = []

    # Test 2: A multi-step query that needs both math and weather tools
    print("\n" + "=" * 60)
    result = agent.run(
        "What's the weather like in Naples FL? Also, if I have 47 beachfront listings "
        "and sell 30% of them this quarter, how many does that leave me with?"
    )
    print(f"\n📋 Final Answer:\n{result}")

    # Test 3: A pure math query
    agent.conversation_history = []
    print("\n" + "=" * 60)
    result = agent.run("Calculate (850 * 12) + (325 * 4) - 1500 for me.")
    print(f"\n📋 Final Answer:\n{result}")

if __name__ == "__main__":
    main()

Run it with python main.py and you should see the agent thinking through each step in real time.

Example Output

Here's what the terminal output looks like when you run the second test query:

example output

============================================================

🧠 Agent received: What's the weather like in Naples FL? Also, if I have 47 beachfront listings and sell 30% of them this quarter, how many does that leave me with?
   [Iteration 1] Calling Claude...
   [Iteration 1] Stop reason: tool_use
   🔧 Tool call: get_weather({"city": "Naples, FL"})
   ✅ Tool result: Naples, FL: 84°F, Sunny, Humidity 72%, Wind 8 mph SE. Perfect beach weather.
   🔧 Tool call: calculate({"expression": "47 - (47 * 0.30)"})
   ✅ Tool result: Result: 32.9
   [Iteration 2] Calling Claude...
   [Iteration 2] Stop reason: end_turn

📋 Final Answer:
Here's what you asked for:

**Weather in Naples, FL:** It's a beautiful day — 84°F and sunny with 72% humidity 
and a light 8 mph southeast breeze. Classic Southwest Florida beach weather.

**Listings calculation:** If you start with 47 beachfront listings and sell 30% 
(which is about 14 listings), you'd have roughly 33 listings remaining at the end 
of the quarter.

Let me know if you need anything else!

Notice that Claude requested both tools in a single iteration: it figured out it needed weather data and a calculation, asked for both in one response, and wrote the final answer after getting both results back. The loop itself still runs the two tool functions one after the other.
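
The loop in Step 4 executes the requested tools one at a time. If your real tools spend most of their time waiting on network calls, one option is to run them in a thread pool. This is a sketch under that assumption; run_tools_concurrently is a hypothetical helper, and tool_calls is assumed to be a list of (tool_use_id, name, input) tuples extracted from the response.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tools_concurrently(tool_calls, execute_tool, max_workers=4):
    """Run every requested tool in a thread pool and return tool_result blocks."""
    if not tool_calls:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so ids stay aligned with outputs.
        outputs = list(
            pool.map(lambda call: execute_tool(call[1], call[2]), tool_calls)
        )
    return [
        {"type": "tool_result", "tool_use_id": call[0], "content": output}
        for call, output in zip(tool_calls, outputs)
    ]
```

If a tool function raises, the exception surfaces when the results are collected, so tools should return error strings the way _execute_tool does rather than raise.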

How It Works

The core idea is a while loop that keeps talking to Claude until the work is done. Every time Claude responds, you check the stop_reason field — if it's "end_turn", Claude is finished and you return the text. If it's "tool_use", Claude needs to run one or more tools before it can continue.
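
The same decision can be written as a tiny function, which makes it easier to test the loop without calling the API. This is a sketch: next_action and its return labels are hypothetical. It also handles "max_tokens", which the API reports when a response is cut off at the max_tokens limit, and treats anything else as a reason to stop.

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason value to what the agent loop should do next."""
    if stop_reason == "end_turn":
        return "return_answer"      # Claude is finished; return its text
    if stop_reason == "tool_use":
        return "run_tools"          # execute the tools, append results, call again
    if stop_reason == "max_tokens":
        return "report_truncated"   # the reply was cut off before it finished
    return "abort"                  # anything unrecognized: stop rather than loop
```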

The conversation history is the agent's memory. Every message — user input, Claude responses, and tool results — gets appended to the same list. Claude sees the full history on every API call, which is how it knows what it's already tried and what the results were.

The max_iterations cap is your safety net. Without it, a confused agent could loop forever and burn through API credits. Ten iterations is a reasonable starting point for the short tasks in this tutorial; raise it if your agent legitimately needs longer tool chains.

Common Errors and Fixes

Error 1: Invalid API Key

anthropic.AuthenticationError: Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': 'invalid x-api-key'}}

Fix: Check that your .env file sits in the project root next to agent.py, that the line starts with ANTHROPIC_API_KEY= (no spaces around the equals sign), and that you called load_dotenv() before os.getenv(). Confirm that os.getenv("ANTHROPIC_API_KEY") returns a value, but avoid pasting the full key into logs or screenshots.
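
A quick way to check the key without exposing it is to print a masked version. This is a minimal sketch; describe_api_key is a hypothetical helper, and it only inspects whatever is already in the environment (call load_dotenv() first in your own script).

```python
import os


def describe_api_key(value):
    """Summarize the key's state without revealing the whole secret."""
    if not value:
        return "ANTHROPIC_API_KEY is not set"
    if value != value.strip():
        return "ANTHROPIC_API_KEY has leading or trailing whitespace"
    # Show only the first 7 and last 4 characters of longer keys.
    if len(value) > 12:
        masked = value[:7] + "..." + value[-4:]
    else:
        masked = "(too short to mask)"
    return f"ANTHROPIC_API_KEY loaded: {masked} ({len(value)} characters)"


print(describe_api_key(os.getenv("ANTHROPIC_API_KEY")))
```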

Error 2: Missing Tool Result in Conversation

anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages: tool_use block "toolu_01..." does not have a corresponding tool_result block'}}

Fix: This happens when you add Claude's tool_use response to history but forget to add the tool_result message back. Every tool_use block must have a matching tool_result with the same tool_use_id. Check that your loop adds the results list before calling the API again.
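
When this error shows up in a longer conversation, it helps to scan the history for the unanswered call. The sketch below does that, assuming the history holds plain dictionaries; the SDK's content block objects would need converting first. The name find_unanswered_tool_calls is hypothetical.

```python
def find_unanswered_tool_calls(history):
    """Return ids of tool_use blocks that have no tool_result in the next message."""
    missing = []
    for index, message in enumerate(history):
        # Only assistant messages with block lists can contain tool_use blocks.
        if message["role"] != "assistant" or isinstance(message["content"], str):
            continue
        requested = [
            block["id"] for block in message["content"]
            if block.get("type") == "tool_use"
        ]
        if not requested:
            continue
        answered = set()
        if index + 1 < len(history):
            following = history[index + 1]
            if following["role"] == "user" and not isinstance(following["content"], str):
                answered = {
                    block["tool_use_id"] for block in following["content"]
                    if block.get("type") == "tool_result"
                }
        missing.extend(tool_id for tool_id in requested if tool_id not in answered)
    return missing
```

An empty list means every tool call in the history has a matching result in the message that follows it.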

Error 3: eval() Security Risk / NameError in Calculate

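Calculation error: name '__import__' is not defined
# or
Calculation error: name 'os' is not defined

Fix: The eval call in the tutorial strips builtins, so names like __import__ and os are undefined and these errors are expected. Be aware that emptying __builtins__ is not a true sandbox: crafted expressions can still reach dangerous objects through attribute access, so don't pass untrusted input to it. If you need more advanced math, import the math module and add it to the allowed names dict: allowed_names = {"__builtins__": {}, "math": math}. For production systems, avoid eval entirely, for example by parsing the expression with Python's ast module and allowing only arithmetic nodes, or by using a vetted expression-parsing library.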

Next Steps

You've got a working agent — here's where to take it next:

Add real tools: Swap the mock search function for the Brave Search API or SerpAPI. Replace the weather mock with OpenWeatherMap. These services are free-tier friendly, and both swaps work with the exact same interface.
Build a multi-agent system: Create a second agent with a different specialty (a writer agent, a data analyst agent) and have your first agent delegate tasks to it by calling it as a tool. This is the Claude multi-agent systems pattern.
Add persistent memory: Right now the conversation history lives in RAM. Store it in a database (SQLite or PostgreSQL) so your agent remembers past sessions and builds context over time.
Deploy it as an API: Wrap the agent in a FastAPI endpoint so other applications can call it via HTTP. This is the first step toward integrating it into a real product or business workflow.
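
For the persistent memory idea above, a minimal sketch with the standard library's sqlite3 module might look like this. The table layout and the function names are assumptions, and it expects each message's content to be JSON-serializable (a string or a list of plain dictionaries), so SDK content block objects would need converting before saving.

```python
import json
import sqlite3


def open_history_store(path=":memory:"):
    """Open (or create) a SQLite database with a single messages table."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return connection


def save_message(connection, session_id, message):
    """Append one conversation message, storing its content as JSON text."""
    connection.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, message["role"], json.dumps(message["content"])),
    )
    connection.commit()


def load_history(connection, session_id):
    """Return the session's messages in the order they were saved."""
    rows = connection.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [{"role": role, "content": json.loads(content)} for role, content in rows]
```

Passing a file path such as "agent_history.db" instead of ":memory:" keeps the history across runs; the loaded list can then be assigned to agent.conversation_history.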

Frequently Asked Questions

How do I add more tools to a Claude AI agent?

Add a new dictionary to the list returned by _define_tools() with a name, description, and input_schema. Then add a matching elif branch in _execute_tool() that calls your new function. Claude will automatically learn to use it based on the description — you don't need to change the loop logic at all.
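
As an illustration, here is what a fourth tool could look like. The count_words tool is invented for this example, and the sketch uses a name-to-handler dictionary as one alternative to a growing elif chain; the tutorial's elif approach works just as well.

```python
# Tool definition in the same shape as the entries in _define_tools().
WORD_COUNT_TOOL = {
    "name": "count_words",
    "description": "Counts the words in a piece of text. Use this when the user asks how long a text is.",
    "input_schema": {
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "The text to count words in"}
        },
        "required": ["text"],
    },
}


def count_words(text: str) -> str:
    """Return the number of whitespace-separated words as a string result."""
    return f"Word count: {len(text.split())}"


# Maps each tool name to a function that takes the tool_input dict.
TOOL_HANDLERS = {
    "count_words": lambda tool_input: count_words(tool_input["text"]),
}


def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Look up the handler by name; unknown tools return an error string."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return f"Error: Unknown tool '{tool_name}'"
    return handler(tool_input)
```

With this layout, adding a tool means appending one definition to the tools list and one entry to TOOL_HANDLERS.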

What is the difference between Claude tool use and function calling in OpenAI?

They solve the same problem but use different terminology. OpenAI calls it "function calling" while Anthropic calls it "tool use." The concepts are nearly identical — the model returns a structured request to run a function, you run it, and you feed the result back. The Anthropic implementation tends to be more explicit about parallel tool calls, which Claude handles well out of the box.

How do I prevent an AI agent loop from running forever?

Set a max_iterations counter and break out of the while loop when you hit it. Return a clear error message so you know why it stopped. In production, also set a timeout on your API calls with the timeout parameter of the Anthropic client constructor. Something like anthropic.Anthropic(timeout=30.0) keeps runaway requests from hanging your server.
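
An iteration cap limits the number of API calls but not the total time spent. One way to bound both is sketched below. The run_with_budget helper and its step callback are hypothetical: step stands for one pass of the agent loop and returns None until there is a final answer.

```python
import time


def run_with_budget(step, max_iterations=10, max_seconds=60.0, clock=time.monotonic):
    """Call step() until it returns a non-None answer or a limit is hit."""
    deadline = clock() + max_seconds
    for iteration in range(1, max_iterations + 1):
        # Check the wall-clock budget before starting another iteration.
        if clock() >= deadline:
            return (
                f"Stopped: time budget of {max_seconds}s exceeded "
                f"after {iteration - 1} iterations."
            )
        answer = step()
        if answer is not None:
            return answer
    return f"Stopped: reached {max_iterations} iterations without a final answer."
```

The clock parameter exists only so the function can be tested with a fake clock. Note that the deadline is checked between iterations, so a per-request timeout on the client is still needed to interrupt a single slow call.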