How to Build AI Agents with Claude API (Step-by-Step)

What You'll Build

If you've been searching for a real, working example of how to build AI agents — not a toy demo, but something you could actually deploy — you're in the right place. By the end of this tutorial, you'll have a fully functional AI agent built with Python and the Anthropic Claude API that can reason through problems, call custom tools, and loop until it reaches a final answer. We're using claude-sonnet-4-6 and the official Anthropic SDK to keep this production-ready from the start.

Prerequisites

Python 3.9 or higher installed
An Anthropic API key (get one at console.anthropic.com)
Basic familiarity with Python classes and functions
anthropic Python SDK and python-dotenv installed (pip install anthropic python-dotenv)
A .env file or environment variable for your API key

📦 Full Source Code
The complete, working code for this agent is built up step by step in the sections below. Each step restates the full file as it stands so far, so by the end of Step 4 you'll have a single runnable file, and Step 5 runs it. Work through the sections in order and you'll have a working agent in under 15 minutes.

Step 1: Setting Up Your Claude API Environment

First, install the Anthropic SDK and set up your API key. I keep the key in an environment variable so it never touches the codebase directly.

terminal

pip install anthropic python-dotenv

Create a .env file in your project root with your key:

.env

ANTHROPIC_API_KEY=sk-ant-your-key-here

Now verify the SDK is working with a quick sanity check before we build anything bigger.

test_connection.py

import os
from dotenv import load_dotenv
import anthropic

load_dotenv()

client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello in one sentence."}]
)

print(message.content[0].text)

If you see a greeting printed to your terminal, you're connected and ready to build.
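
If the call fails instead, the usual culprit is a key that never made it into the environment. A minimal pre-flight check, sketched here with a hypothetical helper named require_api_key (it is not part of the SDK), fails fast with a readable message before any request is sent. The sk-ant- prefix test is an assumption based on the key format shown in the .env example above.

```python
import os


def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for this tutorial; not part of the anthropic SDK.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set. Check your .env file and load_dotenv().")
    if not key.startswith("sk-ant-"):
        # Assumption: keys follow the sk-ant- prefix shown in the .env example
        raise RuntimeError(f"{name} does not look like an Anthropic key.")
    return key
```

Call it right after load_dotenv() and pass the returned value to anthropic.Anthropic(api_key=...).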

Step 2: Creating Your First Agent Class

The agent lives in a class that holds the client, the model name, your tool definitions, and the conversation history. Keeping history inside the class is what lets the agent remember what it already did — that's what separates an agent from a single API call.

agent.py

import os
import json
from dotenv import load_dotenv
import anthropic

load_dotenv()

class ClaudeAgent:
    def __init__(self):
        self.client = anthropic.Anthropic(
            api_key=os.environ.get("ANTHROPIC_API_KEY")
        )
        self.model = "claude-sonnet-4-6"
        self.conversation_history = []
        self.tools = self._define_tools()
        self.system_prompt = (
            "You are a helpful research assistant. "
            "Use the tools available to answer questions accurately. "
            "Always show your reasoning before calling a tool."
        )

    def _define_tools(self):
        # Tool definitions are loaded here — we fill these in Step 3
        return []

    def run(self, user_message: str) -> str:
        # Agent loop lives here — we build this in Step 4
        pass

Nothing runs yet, but the skeleton is right. The conversation_history list will grow with each turn, giving Claude full context of everything that's happened in the session.

Step 3: Defining Agent Tools and Functions

Tools are what turn a chatbot into an agent. Claude reads the tool schema, decides when to use a tool, and returns a structured call you can execute on your end. Think of tools as the hands — Claude is the brain directing them.

For this tutorial, we're giving the agent three tools: a calculator, a web search simulator, and a text summarizer. These cover the most common patterns you'll replicate in real projects.

agent.py — updated _define_tools and tool handler

import os
import json
import math
from dotenv import load_dotenv
import anthropic

load_dotenv()

class ClaudeAgent:
    def __init__(self):
        self.client = anthropic.Anthropic(
            api_key=os.environ.get("ANTHROPIC_API_KEY")
        )
        self.model = "claude-sonnet-4-6"
        self.conversation_history = []
        self.tools = self._define_tools()
        self.system_prompt = (
            "You are a helpful research assistant. "
            "Use the tools available to answer questions accurately. "
            "Always show your reasoning before calling a tool."
        )

    def _define_tools(self):
        return [
            {
                "name": "calculator",
                "description": (
                    "Performs mathematical calculations. "
                    "Accepts a valid Python math expression as a string."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "A math expression, e.g. '(15 * 4) / 2 + math.sqrt(16)'"
                        }
                    },
                    "required": ["expression"]
                }
            },
            {
                "name": "search",
                "description": (
                    "Simulates a web search and returns a summary result. "
                    "Use this when the user asks for current facts or information."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query string"
                        }
                    },
                    "required": ["query"]
                }
            },
            {
                "name": "summarize",
                "description": "Returns a brief summary of a given block of text.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The text content to summarize"
                        }
                    },
                    "required": ["text"]
                }
            }
        ]

    def _execute_tool(self, tool_name: str, tool_input: dict) -> str:
        """Routes tool calls to the correct Python function."""
        if tool_name == "calculator":
            try:
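                # Restrict eval to names from the math module. The module itself is
                # exposed too, so both sqrt(16) and math.sqrt(16) resolve.
                # This limits the visible names; it is not a true sandbox.
                allowed = {k: v for k, v in math.__dict__.items() if not k.startswith("__")}
                allowed["math"] = math
                result = eval(tool_input["expression"], {"__builtins__": {}}, allowed)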
                return str(result)
            except Exception as e:
                return f"Calculator error: {str(e)}"

        elif tool_name == "search":
            # Simulated search — swap this for a real API like Brave or SerpAPI
            query = tool_input["query"].lower()
            simulated_results = {
                "naples florida": "Naples, FL is a city on Florida's Gulf Coast known for its beaches, high median income, and growing tech scene.",
                "claude api": "The Claude API by Anthropic gives developers access to Claude models for building AI-powered applications.",
                "ai agent": "An AI agent is a program that perceives its environment, makes decisions, and takes actions to achieve a goal.",
            }
            for key, value in simulated_results.items():
                if key in query:
                    return value
            return f"Search result for '{tool_input['query']}': No simulated result found. Integrate a real search API for live data."

        elif tool_name == "summarize":
            text = tool_input["text"]
            # Return first two sentences as a naive summary
            sentences = text.split(". ")
            summary = ". ".join(sentences[:2])
            return f"Summary: {summary}."

        else:
            return f"Unknown tool: {tool_name}"

    def run(self, user_message: str) -> str:
        # Agent loop — implemented in Step 4
        pass

⚠️ Security note on eval()
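The calculator uses eval() with a restricted namespace that hides the builtins and exposes only names from the math module. That narrows what an expression can reach, but it is not a real sandbox: attribute access on ordinary objects can still be used to break out of it. Treat this as tutorial code, and never call eval() on untrusted input in production. That's how you get remote code execution vulnerabilities.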
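
One way to avoid eval() entirely is to parse the expression and evaluate only the node types you explicitly allow. The sketch below is an illustration, not the tutorial's original calculator: the safe_eval name and the small whitelist of functions and constants are assumptions you would adjust to your needs.

```python
import ast
import math
import operator

# Binary and unary operators the calculator is willing to evaluate
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
# Whitelisted math functions and constants (an illustrative subset)
_FUNCS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}
_CONSTS = {"pi": math.pi, "e": math.e}


def _name_of(node):
    """Accept both `sqrt` and `math.sqrt` spellings; return the bare name."""
    if isinstance(node, ast.Name):
        return node.id
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == "math"):
        return node.attr
    raise ValueError("unsupported name")


def safe_eval(expression: str) -> float:
    """Evaluate arithmetic by walking the AST; anything off the whitelist raises."""
    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        if isinstance(node, ast.Call) and not node.keywords:
            name = _name_of(node.func)
            if name in _FUNCS:
                return _FUNCS[name](*[walk(arg) for arg in node.args])
        if isinstance(node, (ast.Name, ast.Attribute)):
            name = _name_of(node)
            if name in _CONSTS:
                return _CONSTS[name]
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval").body)
```

Inside _execute_tool you could call safe_eval(tool_input["expression"]) in place of eval(...). Exponentiation with very large operands can still be slow, so cap input sizes if that matters for your deployment.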

Step 4: Implementing the Agent Loop

This is the core of the whole thing. The agent loop sends a message to Claude, checks if Claude wants to use a tool, executes that tool, sends the result back, and repeats — until Claude returns a final text answer with no more tool calls.

Most beginners forget that Claude can call multiple tools in sequence before it's done. The loop handles all of that automatically.
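
To see the control flow without spending API calls, you can run the same loop against a scripted stand-in for the model. Everything here is hypothetical scaffolding: FakeResponse, scripted_model, and the add tool are made up for illustration, and plain dicts replace the SDK's content block objects.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an API response; not an SDK type."""
    stop_reason: str
    content: list


def scripted_model(history):
    """Pretend model: asks for one tool call, then answers once a result exists."""
    last = history[-1]["content"]
    has_result = isinstance(last, list) and any(
        block.get("type") == "tool_result" for block in last
    )
    if has_result:
        answer = f"The result is {last[0]['content']}."
        return FakeResponse("end_turn", [{"type": "text", "text": answer}])
    call = {"type": "tool_use", "id": "call_1", "name": "add", "input": {"a": 2, "b": 3}}
    return FakeResponse("tool_use", [call])


def run_loop(user_message, model, tools):
    history = [{"role": "user", "content": user_message}]
    while True:
        response = model(history)
        history.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            # No more tool calls: return the text of the final answer
            return "".join(b["text"] for b in response.content if b["type"] == "text")
        # Run every requested tool and send all results back in one user turn
        results = [
            {"type": "tool_result", "tool_use_id": b["id"],
             "content": str(tools[b["name"]](**b["input"]))}
            for b in response.content if b["type"] == "tool_use"
        ]
        history.append({"role": "user", "content": results})


print(run_loop("What is 2 + 3?", scripted_model, {"add": lambda a, b: a + b}))
# prints: The result is 5.
```

The real run method below follows the same shape, with the API call in place of scripted_model.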

agent.py — complete file with run loop

import os
import json
import math
from dotenv import load_dotenv
import anthropic

load_dotenv()

class ClaudeAgent:
    def __init__(self):
        self.client = anthropic.Anthropic(
            api_key=os.environ.get("ANTHROPIC_API_KEY")
        )
        self.model = "claude-sonnet-4-6"
        self.conversation_history = []
        self.tools = self._define_tools()
        self.system_prompt = (
            "You are a helpful research assistant. "
            "Use the tools available to answer questions accurately. "
            "Always show your reasoning before calling a tool."
        )

    def _define_tools(self):
        return [
            {
                "name": "calculator",
                "description": (
                    "Performs mathematical calculations. "
                    "Accepts a valid Python math expression as a string."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "A math expression, e.g. '(15 * 4) / 2 + math.sqrt(16)'"
                        }
                    },
                    "required": ["expression"]
                }
            },
            {
                "name": "search",
                "description": (
                    "Simulates a web search and returns a summary result. "
                    "Use this when the user asks for current facts or information."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query string"
                        }
                    },
                    "required": ["query"]
                }
            },
            {
                "name": "summarize",
                "description": "Returns a brief summary of a given block of text.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The text content to summarize"
                        }
                    },
                    "required": ["text"]
                }
            }
        ]

    def _execute_tool(self, tool_name: str, tool_input: dict) -> str:
        """Routes tool calls to the correct Python function."""
        if tool_name == "calculator":
            try:
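                # Expose the math module's names, plus the module itself so that
                # both sqrt(16) and math.sqrt(16) resolve. Not a true sandbox.
                allowed = {k: v for k, v in math.__dict__.items() if not k.startswith("__")}
                allowed["math"] = math
                result = eval(tool_input["expression"], {"__builtins__": {}}, allowed)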
                return str(result)
            except Exception as e:
                return f"Calculator error: {str(e)}"

        elif tool_name == "search":
            query = tool_input["query"].lower()
            simulated_results = {
                "naples florida": "Naples, FL is a city on Florida's Gulf Coast known for its beaches, high median income, and growing tech scene.",
                "claude api": "The Claude API by Anthropic gives developers access to Claude models for building AI-powered applications.",
                "ai agent": "An AI agent is a program that perceives its environment, makes decisions, and takes actions to achieve a goal.",
            }
            for key, value in simulated_results.items():
                if key in query:
                    return value
            return f"Search result for '{tool_input['query']}': No simulated result found. Integrate a real search API for live data."

        elif tool_name == "summarize":
            text = tool_input["text"]
            sentences = text.split(". ")
            summary = ". ".join(sentences[:2])
            return f"Summary: {summary}."

        else:
            return f"Unknown tool: {tool_name}"

    def run(self, user_message: str) -> str:
        """
        Main agent loop. Sends the user message, handles tool calls,
        and returns the final text response from Claude.
        """
        # Add the user message to conversation history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })

        print(f"\n{'='*50}")
        print(f"User: {user_message}")
        print(f"{'='*50}")

        # Loop until Claude stops calling tools
        while True:
            response = self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                system=self.system_prompt,
                tools=self.tools,
                messages=self.conversation_history
            )

            # Collect all content blocks from this response
            assistant_content = response.content

            # Always add the assistant's full response to history
            self.conversation_history.append({
                "role": "assistant",
                "content": assistant_content
            })

            # If Claude is done (no tool calls), return the final text
            if response.stop_reason == "end_turn":
                # A response can hold several text blocks, so join them all
                final_text = "".join(
                    block.text for block in assistant_content if block.type == "text"
                )
                print(f"\nAgent: {final_text}")
                return final_text

            # If Claude wants to use tools, execute each one
            if response.stop_reason == "tool_use":
                tool_results = []

                for block in assistant_content:
                    if block.type == "tool_use":
                        tool_name = block.name
                        tool_input = block.input
                        tool_use_id = block.id

                        print(f"\n[Tool Call] {tool_name}({json.dumps(tool_input)})")

                        # Execute the tool and capture the result
                        result = self._execute_tool(tool_name, tool_input)
                        print(f"[Tool Result] {result}")

                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": tool_use_id,
                            "content": result
                        })

                # Send all tool results back to Claude in one user turn
                self.conversation_history.append({
                    "role": "user",
                    "content": tool_results
                })

            else:
                # Unexpected stop reason — break to avoid infinite loop
                print(f"Unexpected stop reason: {response.stop_reason}")
                break

        return "Agent loop ended unexpectedly."


if __name__ == "__main__":
    agent = ClaudeAgent()

    # Test 1: Math calculation
    agent.run("What is 15% of 2,340 plus the square root of 144?")

    # Reset history for a fresh conversation
    agent.conversation_history = []

    # Test 2: Search + summarize chain
    agent.run("Search for what an AI agent is, then summarize the result.")

Step 5: Testing and Iterating Your Agent

Run the file directly and watch the agent think out loud. You'll see each tool call printed in real time, which makes debugging a lot easier than staring at a silent response object.

terminal

python agent.py

Here's what realistic output looks like for the two test cases:

example output

==================================================
User: What is 15% of 2,340 plus the square root of 144?
==================================================

[Tool Call] calculator({"expression": "2340 * 0.15 + math.sqrt(144)"})
[Tool Result] 363.0

Agent: 15% of 2,340 is 351, and the square root of 144 is 12.
Adding those together gives you 363.

==================================================
User: Search for what an AI agent is, then summarize the result.
==================================================

[Tool Call] search({"query": "ai agent"})
[Tool Result] An AI agent is a program that perceives its environment, makes decisions, and takes actions to achieve a goal.

[Tool Call] summarize({"text": "An AI agent is a program that perceives its environment, makes decisions, and takes actions to achieve a goal."})
[Tool Result] Summary: An AI agent is a program that perceives its environment, makes decisions, and takes actions to achieve a goal..

Agent: Here's what I found: An AI agent is a program that perceives its
environment, makes decisions, and takes actions to achieve a goal.
That's the core idea — perception, decision-making, and action in a loop.

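If both queries produce output like this, your agent is working correctly. The doubled period at the end of the summarize result is expected: the naive summarizer appends a period even when the text already ends with one. Try swapping in your own tool functions (database queries, real API calls, file readers) and the loop handles them the same way.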
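
As the number of tools grows, a dispatch table can be easier to extend than an if/elif chain. This is one possible arrangement rather than the tutorial's own design; the tool decorator, TOOL_HANDLERS, and the two example tools (word_count and read_file) are hypothetical.

```python
TOOL_HANDLERS = {}


def tool(name):
    """Register a function as the handler for the tool called `name`."""
    def register(func):
        TOOL_HANDLERS[name] = func
        return func
    return register


@tool("word_count")
def word_count(tool_input: dict) -> str:
    return str(len(tool_input["text"].split()))


@tool("read_file")
def read_file(tool_input: dict) -> str:
    # Hypothetical file-reader tool; cap the size so results stay small
    with open(tool_input["path"], encoding="utf-8") as handle:
        return handle.read(2000)


def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Possible replacement for the if/elif chain in _execute_tool."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return f"Unknown tool: {tool_name}"
    try:
        return handler(tool_input)
    except Exception as exc:
        # Return errors as text so the model can see what went wrong
        return f"{tool_name} error: {exc}"
```

Each registered handler still needs a matching schema entry in _define_tools so Claude knows the tool exists.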

How It Works

Here's the plain-English version of what's happening under the hood. You send Claude a message along with a list of tool schemas. Claude reads the schemas, decides if it needs a tool, and if so returns a tool_use block, sometimes preceded by a short text block explaining its reasoning, with stop_reason set to "tool_use" instead of a final answer.

Your code catches that block, runs the actual Python function, and sends the result back to Claude as a tool_result. Claude then decides whether to call another tool or write its final answer. This loop keeps going until Claude returns stop_reason: "end_turn".

The conversation history is the secret ingredient. Every turn — user messages, assistant responses, tool results — gets appended to the same list. That's how Claude knows what it already tried and what it learned from each tool call.
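
Written out as plain dicts, the history after one tool round trip looks roughly like the list below. The ids and text are made up for illustration, and in the real agent the assistant turns hold SDK block objects rather than dicts. The unanswered_tool_calls helper is a hypothetical debugging aid that checks every tool_use has a matching tool_result.

```python
history = [
    {"role": "user", "content": "What is 12 squared?"},
    {"role": "assistant", "content": [
        {"type": "text", "text": "I'll use the calculator."},
        {"type": "tool_use", "id": "toolu_example_1", "name": "calculator",
         "input": {"expression": "12 ** 2"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_example_1", "content": "144"},
    ]},
    {"role": "assistant", "content": [{"type": "text", "text": "12 squared is 144."}]},
]


def unanswered_tool_calls(messages):
    """Return ids of tool_use blocks that never received a tool_result."""
    requested, answered = set(), set()
    for message in messages:
        if isinstance(message["content"], str):
            continue  # plain-text turns carry no tool blocks
        for block in message["content"]:
            if block["type"] == "tool_use":
                requested.add(block["id"])
            elif block["type"] == "tool_result":
                answered.add(block["tool_use_id"])
    return requested - answered


print(unanswered_tool_calls(history))  # set()
```

Note that tool results travel in a user-role message, even though your code produced them, not the person typing.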

Common Errors and Fixes

Error 1: AuthenticationError — Invalid API Key

anthropic.AuthenticationError: 401 {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}

Fix: Your .env file isn't loading, or the key itself is wrong. Make sure load_dotenv() is called before os.environ.get(), and check the key value in the .env file for stray spaces, missing characters, or mismatched quotes. The simplest form is ANTHROPIC_API_KEY=sk-ant-abc123 with no quotes.

Error 2: Infinite Loop — Agent Never Returns

# No error message — the script just runs forever and never prints "Agent:"

Fix: This usually means tool results aren't being added to conversation_history correctly. Claude keeps calling tools because it never sees the results. Check that your tool_results list is appended as a "user" role message with "type": "tool_result" and the correct tool_use_id matching what Claude sent.
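
Whatever the cause, a hard cap on turns keeps a misbehaving loop from running (and billing) forever. The sketch below shows the idea in isolation; run_with_cap, AgentLoopError, and the step callback are hypothetical names, with step standing in for one pass of the agent loop.

```python
class AgentLoopError(RuntimeError):
    """Raised when the loop exceeds its turn budget (hypothetical helper type)."""


def run_with_cap(step, max_turns=10):
    """Call `step()` until it returns a non-None answer or the budget runs out.

    `step` stands in for one pass of the agent loop: one API call plus any
    tool executions. It returns the final text, or None to keep going.
    """
    for _ in range(max_turns):
        answer = step()
        if answer is not None:
            return answer
    raise AgentLoopError(f"No final answer after {max_turns} turns")
```

In ClaudeAgent.run, the equivalent change is to replace while True: with a for loop over range(max_turns) and raise or return an error message when the loop finishes without an answer.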

Error 3: BadRequestError — Wrong Tool Schema Format

anthropic.BadRequestError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"tools.0.input_schema: value is not a valid dict"}}

Fix: Your tool's input_schema must be a dict with "type": "object" at the top level and a "properties" key; even if the tool takes no inputs, include an empty properties dict. The schema follows the JSON Schema spec. Give every property a "type", and add a "description" as well: it is optional in JSON Schema, but it is what Claude reads when deciding how to fill in the field.
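
A quick local check can catch these mistakes before the request goes out. The helper below is a rough sketch that mirrors the rules described above; check_tool_schema is a hypothetical name, and the API itself remains the final authority on what it accepts.

```python
def check_tool_schema(tool: dict) -> list:
    """Return a list of problems found in one tool definition (empty if none)."""
    problems = []
    if not isinstance(tool.get("name"), str) or not tool.get("name"):
        problems.append("name must be a non-empty string")
    schema = tool.get("input_schema")
    if not isinstance(schema, dict):
        problems.append("input_schema must be a dict")
        return problems
    if schema.get("type") != "object":
        problems.append("input_schema['type'] must be 'object'")
    properties = schema.get("properties")
    if not isinstance(properties, dict):
        problems.append("input_schema['properties'] must be a dict")
        properties = {}
    # Every name listed under "required" should be declared in "properties"
    for required_name in schema.get("required", []):
        if required_name not in properties:
            problems.append(f"required field '{required_name}' is not in properties")
    return problems
```

Running it over the list returned by _define_tools and printing any non-empty results is usually enough to spot a misplaced key.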

Next Steps

Connect a real search API. Replace the simulated search with Brave Search API or SerpAPI to give your agent live internet access. The tool interface stays identical — you just change what's inside _execute_tool.
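Add memory persistence. Save conversation_history to a JSON file or a database between sessions so the agent keeps its context across restarts. Note that assistant turns store the SDK's content block objects, so convert them to plain dicts (for example with each block's model_dump() method) before serializing.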
Build a multi-agent system. Create two ClaudeAgent instances — one as a planner and one as an executor — and have them pass messages to each other. This is how more sophisticated pipelines like research