How to Build an AI Agent with Claude API (Step-by-Step)

What You'll Build

You're going to build a fully working AI agent in Python using the Claude API — one that can reason through a problem, decide which tools to use, call those tools, and loop until it has a real answer. This isn't a chatbot that responds to a single message. It's an autonomous agent that can chain multiple steps together without you holding its hand.

By the end of this tutorial, you'll have a working agent that can look up weather data and do math — two simple tools that demonstrate the full pattern you'd use for anything more complex, like querying a database or calling a third-party API.

📦 Full Source Code
The complete, working code for this tutorial is broken into steps below. Each snippet builds on the last. By Step 4, you'll have everything you need to run the agent end-to-end. No placeholder functions, no pseudocode — just code that actually runs.

Prerequisites

Python 3.9 or higher installed
An Anthropic API key (get one at console.anthropic.com)
Basic familiarity with Python classes and functions
anthropic package installed (pip install anthropic)
A terminal and a code editor you're comfortable with

Step 1: Set Up the Claude SDK and Authentication

First, install the Anthropic SDK if you haven't already. Open your terminal and run pip install anthropic. Then set your API key as an environment variable so you're not hardcoding secrets into your source files.

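On Mac or Linux, run export ANTHROPIC_API_KEY="your-key-here" in your terminal. On Windows, use set ANTHROPIC_API_KEY=your-key-here in Command Prompt, or $env:ANTHROPIC_API_KEY="your-key-here" in PowerShell. The variable only lasts for that terminal session, and the SDK picks it up automatically.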

setup_check.py

import os
import anthropic

# Verify the SDK is installed and the key is loaded
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    raise EnvironmentError("ANTHROPIC_API_KEY environment variable is not set.")

client = anthropic.Anthropic(api_key=api_key)

# Quick sanity check — send a minimal message
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=64,
    messages=[{"role": "user", "content": "Say hello in one word."}]
)

print(response.content[0].text)
# Example output: Hello (the exact wording can vary)

If that prints something without throwing an error, you're good to go. The anthropic.Anthropic() client is what you'll use for every API call throughout this tutorial.

Step 2: Define Tools and Tool Schemas

Tools are how Claude reaches outside its own knowledge. You define a tool by giving it a name, a description, and a JSON schema that describes its input parameters. Claude reads those descriptions and decides when and how to use each tool.

The description is not a formality — it's the main signal Claude uses to pick the right tool. Write it like you're explaining the function to a smart colleague who can't see your code.

tools.py

import json
import math

# Tool schemas tell Claude what each tool does and what inputs it expects.
# These get passed directly into the API call.
TOOL_DEFINITIONS = [
    {
        "name": "get_weather",
        "description": (
            "Returns the current weather for a given city. "
            "Use this when the user asks about weather conditions in a specific location."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The name of the city to get weather for, e.g. 'Naples, FL'"
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "calculate",
        "description": (
            "Evaluates a mathematical expression and returns the numeric result. "
            "Use this for any arithmetic, percentages, or unit conversions."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "A valid Python math expression, e.g. '(98.6 - 32) * 5/9'"
                }
            },
            "required": ["expression"]
        }
    }
]


def get_weather(city: str) -> str:
    """
    Simulated weather lookup. In production, replace this with a
    real API call to OpenWeatherMap, WeatherAPI, or similar.
    """
    mock_data = {
        "naples, fl": {"temp_f": 84, "condition": "Sunny", "humidity": 72},
        "miami, fl": {"temp_f": 88, "condition": "Partly Cloudy", "humidity": 78},
        "new york, ny": {"temp_f": 55, "condition": "Overcast", "humidity": 60},
    }
    key = city.lower().strip()
    weather = mock_data.get(key, {"temp_f": 70, "condition": "Clear", "humidity": 50})
    return json.dumps({
        "city": city,
        "temperature_f": weather["temp_f"],
        "condition": weather["condition"],
        "humidity_percent": weather["humidity"]
    })


def calculate(expression: str) -> str:
    """
    Evaluates a math expression with eval(), exposing only the math module.
    Removing builtins blocks casual misuse but is not a true sandbox, so
    avoid passing untrusted input to this function in production.
    """
    try:
        # Expose only the public names from the math module to the expression
        allowed_names = {k: v for k, v in math.__dict__.items() if not k.startswith("_")}
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return json.dumps({"result": result, "expression": expression})
    except Exception as e:
        return json.dumps({"error": str(e), "expression": expression})


def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Routes a tool call from Claude to the correct Python function."""
    if tool_name == "get_weather":
        return get_weather(tool_input["city"])
    elif tool_name == "calculate":
        return calculate(tool_input["expression"])
    else:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})

Notice that execute_tool is the single routing function — Claude tells you which tool it wants, and this function dispatches it. That keeps your agentic loop clean and easy to extend.
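
Because calculate receives expressions written by the model, eval() deserves some caution even with builtins removed. One possible alternative, sketched here on the assumption that only basic arithmetic and a handful of math functions are needed, walks the expression's syntax tree and rejects anything outside a whitelist. The names safe_calculate and _NAMES are illustrative and not part of the tutorial's code; exponentiation is left out on purpose so a huge power can't stall the process.

```python
import ast
import json
import math
import operator

# Whitelisted operators; any other syntax is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

# Hypothetical whitelist of functions and constants usable in expressions.
_NAMES = {"sqrt": math.sqrt, "pi": math.pi, "e": math.e}


def _eval_node(node):
    """Recursively evaluates a whitelisted subset of Python expression nodes."""
    if isinstance(node, ast.Constant):
        # Accept plain numbers only (bool is a subclass of int, so exclude it)
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    elif isinstance(node, ast.Name):
        value = _NAMES.get(node.id)
        if value is not None and not callable(value):
            return value
    elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        func = _NAMES.get(node.func.id)
        if callable(func) and not node.keywords:
            return func(*[_eval_node(arg) for arg in node.args])
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str) -> str:
    """Same JSON contract as calculate(), but without calling eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        result = _eval_node(tree.body)
        return json.dumps({"result": result, "expression": expression})
    except Exception as exc:
        return json.dumps({"error": str(exc), "expression": expression})
```

If you try it, swapping it in should only require changing the calculate branch of execute_tool, since the return format is the same.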

Step 3: Implement the Agentic Loop

Here's the core concept: Claude responds to your message, but instead of giving you a final answer right away, it might say "I need to use a tool first." Your job is to catch that, run the tool, feed the result back, and let Claude continue. That cycle — send, receive, act, repeat — is the agentic loop.

The loop keeps running until Claude returns a stop_reason of "end_turn", which means it's done reasoning and has a final answer for you. You set a max iteration limit as a safety net so it can't run forever.

agent.py

import os
import anthropic
from tools import TOOL_DEFINITIONS, execute_tool

class ClaudeAgent:
    """
    A simple autonomous agent built on top of the Claude API.
    Supports multi-step tool use via an agentic loop.
    """

    def __init__(self, max_iterations: int = 10):
        self.client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-5"
        self.max_iterations = max_iterations
        self.conversation_history = []

    def run(self, user_message: str) -> str:
        """
        Entry point. Takes a user message and returns the agent's final answer
        after completing any necessary tool calls.
        """
        print(f"\n{'='*60}")
        print(f"USER: {user_message}")
        print(f"{'='*60}")

        # Add the user's message to the conversation history
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })

        # Run the agentic loop
        for iteration in range(self.max_iterations):
            print(f"\n[Iteration {iteration + 1}]")

            response = self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                tools=TOOL_DEFINITIONS,
                messages=self.conversation_history
            )

            print(f"Stop reason: {response.stop_reason}")

            # Append Claude's response to history so the loop stays coherent
            self.conversation_history.append({
                "role": "assistant",
                "content": response.content
            })

            # If Claude is done, extract and return the final text response
            if response.stop_reason == "end_turn":
                final_text = self._extract_text(response.content)
                print(f"\nAGENT FINAL ANSWER:\n{final_text}")
                return final_text

            # If Claude wants to use tools, handle each tool call
            if response.stop_reason == "tool_use":
                tool_results = self._handle_tool_calls(response.content)

                # Feed the tool results back into the conversation
                self.conversation_history.append({
                    "role": "user",
                    "content": tool_results
                })
            else:
                # Any other stop reason (e.g. "max_tokens") ends the run early
                print(f"Unexpected stop reason: {response.stop_reason}")
                return f"Agent stopped early (stop_reason: {response.stop_reason})."

        return "Agent reached maximum iterations without completing the task."

    def _handle_tool_calls(self, content_blocks: list) -> list:
        """
        Finds all tool_use blocks in Claude's response, runs each tool,
        and returns a list of tool_result blocks to send back.
        """
        tool_results = []

        for block in content_blocks:
            if block.type == "tool_use":
                print(f"\n  TOOL CALL: {block.name}")
                print(f"  INPUT: {block.input}")

                # Execute the tool and get the result
                result = execute_tool(block.name, block.input)
                print(f"  RESULT: {result}")

                # Format the result as Claude expects it
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        return tool_results

    def _extract_text(self, content_blocks: list) -> str:
        """Pulls the text content out of Claude's final response."""
        texts = []
        for block in content_blocks:
            if hasattr(block, "text"):
                texts.append(block.text)
        return "\n".join(texts)

Step 4: Handle Tool Calls and Run the Agent

Now you wire everything together with a main entry point. This is where you create the agent, give it a task, and let it run. The example below asks a question that requires both tools — getting weather data and then doing a unit conversion — so you can see the full loop in action.

main.py

from agent import ClaudeAgent

def main():
    # Create the agent with a safety limit of 10 loop iterations
    agent = ClaudeAgent(max_iterations=10)

    # This question requires two tool calls: weather lookup + unit conversion
    result = agent.run(
        "What's the weather like in Naples, FL right now? "
        "Also convert the temperature to Celsius for me."
    )

    print("\n" + "="*60)
    print("DONE")
    print("="*60)

if __name__ == "__main__":
    main()

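Here's roughly what the console output looks like when you run this. Claude's exact wording, and whether it spreads the tool calls across separate iterations, can vary from run to run: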

example_output.txt

============================================================
USER: What's the weather like in Naples, FL right now? Also convert the temperature to Celsius for me.
============================================================

[Iteration 1]
Stop reason: tool_use

  TOOL CALL: get_weather
  INPUT: {'city': 'Naples, FL'}
  RESULT: {"city": "Naples, FL", "temperature_f": 84, "condition": "Sunny", "humidity_percent": 72}

[Iteration 2]
Stop reason: tool_use

  TOOL CALL: calculate
  INPUT: {'expression': '(84 - 32) * 5/9'}
  RESULT: {"result": 28.88888888888889, "expression": "(84 - 32) * 5/9"}

[Iteration 3]
Stop reason: end_turn

AGENT FINAL ANSWER:
The current weather in Naples, FL is sunny with a temperature of 84°F (approximately 28.9°C)
and humidity at 72%. It's a warm, beautiful day on the Gulf Coast!

============================================================
DONE
============================================================

Claude made two tool calls on its own — it figured out it needed the weather first, then used that result to do the Celsius conversion. You didn't have to orchestrate any of that. That's the whole point of an agentic loop.

How It Works: Agent Decision Flow

Here's the plain-English version of what's happening under the hood. When you send Claude a message with tool definitions attached, it reads those definitions and decides whether it can answer directly or needs more information. If it needs a tool, it returns a response with stop_reason: "tool_use" instead of "end_turn".

Your loop catches that, runs the tool locally on your machine, and sends the result back to Claude as a tool_result block inside a user message. Claude then continues reasoning with that new information. It might call another tool, or it might have everything it needs to answer.

The conversation history is the key piece. Every message — user, assistant, and tool result — gets appended to self.conversation_history. That's how Claude maintains context across multiple iterations. Without it, each API call would be stateless and Claude would have no memory of what tools it already called.
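
To make that concrete, here is roughly what the history list looks like after one tool round trip, written as plain dictionaries. The tool_use ID and the weather values are invented for illustration, and in the tutorial's agent the assistant content holds SDK block objects rather than dicts.

```python
# Illustrative conversation history after a single tool call.
# The id "toolu_example_123" and the weather values are made up for this sketch.
history = [
    {"role": "user", "content": "What's the weather in Naples, FL?"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_example_123",
                "name": "get_weather",
                "input": {"city": "Naples, FL"},
            }
        ],
    },
    {
        "role": "user",  # tool results travel back inside a user message
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_example_123",
                "content": '{"temperature_f": 84, "condition": "Sunny"}',
            }
        ],
    },
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "It's sunny and 84°F in Naples."}],
    },
]


def roles(messages: list) -> list:
    """Returns the sequence of roles, handy for eyeballing the turn order."""
    return [message["role"] for message in messages]


print(roles(history))
```

The roles alternate between user and assistant, and the tool_result carries the same ID as the tool_use block it answers.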

💡 Why this pattern scales
This same agentic loop works whether you have 2 tools or 20. Add a database query tool, a web scraper, a CRM lookup — just drop them into TOOL_DEFINITIONS and add a branch in execute_tool(). The loop logic doesn't change at all.
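
Once the tool list grows, the if/elif chain in execute_tool() can get long. One possible variation, shown here as a sketch rather than the tutorial's approach, keeps a dictionary of handlers so that adding a tool means registering a function. TOOL_REGISTRY and register_tool are hypothetical names, and the get_weather body is a stand-in for the real one in tools.py.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping tool names to handler functions.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {}


def register_tool(name: str):
    """Decorator that records a function as the handler for a tool name."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        TOOL_REGISTRY[name] = func
        return func
    return decorator


@register_tool("get_weather")
def get_weather(city: str) -> str:
    # Stand-in body; the tutorial's version looks the city up in mock_data
    return json.dumps({"city": city, "temperature_f": 70})


def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Looks the handler up by name instead of branching on each tool."""
    handler = TOOL_REGISTRY.get(tool_name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    # Input keys are passed as keyword arguments, so they must match
    # the handler's parameter names (which mirror the input_schema)
    return handler(**tool_input)
```

The trade-off is that handler parameter names have to stay in sync with the property names in each tool's input_schema.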

Common Errors and Fixes

Error 1: AuthenticationError — Invalid API Key

Exact error: anthropic.AuthenticationError: 401 {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}

This means your API key isn't being read correctly. Double-check that you set the environment variable in the same terminal session where you're running Python. If you set it in a new tab, it won't carry over.

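Fix: Run echo $ANTHROPIC_API_KEY (Mac/Linux), echo %ANTHROPIC_API_KEY% (Windows Command Prompt), or echo $env:ANTHROPIC_API_KEY (PowerShell) to confirm it's actually set. If the output is blank, or Command Prompt just prints the literal %ANTHROPIC_API_KEY%, set it again before running your script.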

Error 2: tool_use_id Mismatch Causes API Error

Typical error (the message text may differ between API versions): anthropic.BadRequestError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"tool_use_id is invalid"}}

This happens when the tool_use_id in your tool_result message doesn't match the ID Claude sent in its tool_use block. Usually it's a copy-paste bug or a manual ID string instead of using block.id.

Fix: Always use block.id directly from the response block. Never hardcode or modify the ID. Check that you're passing "tool_use_id": block.id in your tool result, exactly as shown in the _handle_tool_calls method above.
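
If you want to catch this before the request goes out, a small check along these lines might help. It is a sketch that assumes the blocks are plain dictionaries (with SDK objects you would read block.id and block.type instead), and unmatched_tool_ids is a hypothetical helper name.

```python
def unmatched_tool_ids(assistant_content: list, tool_results: list) -> set:
    """
    Returns IDs that appear on only one side: tool_use blocks with no
    tool_result, or tool_result blocks that answer no tool_use.
    An empty set means every call has exactly one matching result ID.
    """
    requested = {
        block["id"] for block in assistant_content if block.get("type") == "tool_use"
    }
    answered = {
        block["tool_use_id"] for block in tool_results if block.get("type") == "tool_result"
    }
    # Symmetric difference: everything that is not present on both sides
    return requested ^ answered


# Usage example with invented IDs
calls = [{"type": "tool_use", "id": "toolu_a", "name": "calculate", "input": {}}]
good = [{"type": "tool_result", "tool_use_id": "toolu_a", "content": "{}"}]
bad = [{"type": "tool_result", "tool_use_id": "toolu_typo", "content": "{}"}]

print(unmatched_tool_ids(calls, good))
print(unmatched_tool_ids(calls, bad))
```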

Error 3: Agent Loops Forever and Hits Max Iterations

Symptom: Your agent prints iteration after iteration and never returns end_turn. Eventually it hits your max_iterations cap and returns the fallback message.

This usually means your tool is returning an error response and Claude keeps retrying, or your tool descriptions are ambiguous and Claude isn't sure when it's done.

Fix: Add print statements in execute_tool to see what's actually being returned. Make sure tool failures come back as valid JSON strings rather than uncaught Python exceptions that abort the loop. Also review your tool descriptions: if Claude can't tell when a tool has given it enough information, it'll keep calling it.
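
One way to guarantee that a failing tool still produces a well-formed result is to wrap the call. The sketch below assumes an execute function with the same signature as the tutorial's execute_tool; run_tool_safely is a hypothetical name. It sets the is_error flag on the tool_result block so Claude can tell a failure from a normal result.

```python
import json
from typing import Callable


def run_tool_safely(
    execute: Callable[[str, dict], str],
    block_id: str,
    name: str,
    tool_input: dict,
) -> dict:
    """Runs a tool and always returns a tool_result block, even on failure."""
    try:
        content = execute(name, tool_input)
        failed = False
    except Exception as exc:
        # Turn the exception into a JSON string the model can read
        content = json.dumps({"error": f"{type(exc).__name__}: {exc}"})
        failed = True

    result = {"type": "tool_result", "tool_use_id": block_id, "content": content}
    if failed:
        result["is_error"] = True
    return result


def flaky_tool(name: str, tool_input: dict) -> str:
    # Stand-in for execute_tool: raises KeyError when "city" is missing
    return json.dumps({"city": tool_input["city"]})


print(run_tool_safely(flaky_tool, "toolu_example", "get_weather", {}))
```

Inside _handle_tool_calls, a wrapper like this could stand in for the direct execute_tool call and the hand-built result dictionary.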

Next Steps

Now that you have a working agent, here are four natural ways to push it further:

Add a real web search tool — connect a Serper or Brave Search API call so your agent can look up live information, not just mock data.
Persist conversation history to a database — right now, history lives in memory and resets when the script ends. Store it in SQLite or PostgreSQL to build multi-session agents.
Add a system prompt — pass a system parameter to client.messages.create() to give your agent a persona, constraints, or domain-specific instructions that shape every response.
Build a tool that queries your own business data — connect your agent to a CRM, inventory system, or internal database and you've just built the foundation of an intelligent process automation system.
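
For the system prompt idea above, the change is small: system is a top-level argument to client.messages.create(), not a message in the history. A minimal sketch follows; build_request is a hypothetical helper and the prompt text is only an example.

```python
def build_request(
    history: list,
    tools: list,
    system_prompt: str,
    model: str = "claude-sonnet-4-5",
) -> dict:
    """Assembles keyword arguments for client.messages.create()."""
    return {
        "model": model,
        "max_tokens": 4096,
        "system": system_prompt,  # sits beside messages, not inside them
        "tools": tools,
        "messages": history,
    }


# Usage example with an invented prompt and an empty tool list
kwargs = build_request(
    history=[{"role": "user", "content": "Is it beach weather in Naples, FL?"}],
    tools=[],
    system_prompt="You are a concise travel assistant. Answer in two sentences.",
)
print(sorted(kwargs))
# In the agent this would become: self.client.messages.create(**kwargs)
```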

Frequently Asked Questions

How do AI agents with Claude API differ from regular Claude chat?

A regular Claude chat call is a single round-trip: you send a message, you get a response. An AI agent using an agentic loop keeps calling the API in a cycle — sending tool results back and letting Claude decide what to do next — until the task is fully complete. It's the difference between asking someone a question and hiring them to actually solve a problem.

What is an agentic loop in the Claude API?

An agentic loop is a while-style cycle where you repeatedly send messages to Claude, check if it wants to use a tool, execute that tool yourself, and feed the result back. The loop continues until Claude returns stop_reason: "end_turn", meaning it has enough information to give you a final answer without any more tool calls.

How many tools can I give Claude at once?

Anthropic doesn't publish a hard cap on tool count, but practically speaking, you'll want to keep your tool list focused. More tools mean more tokens spent on the tool definitions in every API call, and Claude can get confused if too many tools have overlapping purposes. Start with the tools you actually need and add more as the use case demands it.

Can Claude agents run in production without human oversight?

Yes, but you need guardrails. Set a max_iterations limit, validate all tool inputs before executing them, log every tool call, and wrap your loop in a try/except so one bad API response doesn't crash your whole system. For anything touching real data — databases, email, payments — build a confirmation step or an approval queue before the tool fires.
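
As one example of validating tool inputs, the check below compares an input dictionary against the input_schema you already wrote for the tool. It is a deliberately minimal sketch that covers only required fields, unknown fields, and a few primitive types; validate_tool_input is a hypothetical name, and a full JSON Schema validator such as the jsonschema package would be more thorough.

```python
def validate_tool_input(schema: dict, tool_input: dict) -> list:
    """Returns a list of problems; an empty list means the input passed."""
    problems = []
    properties = schema.get("properties", {})

    # Every required field must be present
    for field in schema.get("required", []):
        if field not in tool_input:
            problems.append(f"missing required field: {field}")

    # Only a few JSON Schema types are checked in this sketch
    type_map = {"string": str, "number": (int, float), "boolean": bool}
    for field, value in tool_input.items():
        if field not in properties:
            problems.append(f"unexpected field: {field}")
            continue
        declared = properties[field].get("type")
        expected = type_map.get(declared)
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{field} should be of type {declared}")
    return problems


# Usage example with the same shape as the get_weather schema
weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
print(validate_tool_input(weather_schema, {"city": "Naples, FL"}))
print(validate_tool_input(weather_schema, {"town": "Naples"}))
```

When the list is non-empty, returning the problems to Claude as an error result gives it a chance to correct the call instead of crashing the loop.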

How do I add memory to a Claude AI agent?

The simplest approach is to serialize self.conversation_history to a JSON file or database row after each session, keyed to a user ID or session ID. When a new session starts, load that history back and prepend it to the messages list. For longer-running agents, you'll want a vector database like Pinecone or pgvector to store and retrieve semantically relevant memories rather than the full raw history.
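
One wrinkle with the JSON approach: the agent stores response.content as SDK objects, which json.dumps can't handle directly. The sketch below assumes those objects expose a Pydantic-style model_dump() method, which is worth verifying against your installed SDK version. The helper names and file layout are illustrative, and the saved dictionaries may need pruning before being sent back to the API.

```python
import json
from pathlib import Path


def to_plain(block):
    """Converts a content block to a plain dict (assumes model_dump() exists)."""
    if isinstance(block, dict):
        return block
    if hasattr(block, "model_dump"):
        return block.model_dump(exclude_none=True)
    raise TypeError(f"Cannot serialize block of type {type(block).__name__}")


def save_history(history: list, path: Path) -> None:
    """Writes the conversation history to a JSON file."""
    serializable = []
    for message in history:
        content = message["content"]
        # Content is either a plain string or a list of blocks
        if not isinstance(content, str):
            content = [to_plain(block) for block in content]
        serializable.append({"role": message["role"], "content": content})
    path.write_text(json.dumps(serializable, indent=2), encoding="utf-8")


def load_history(path: Path) -> list:
    """Returns the saved history, or an empty list for a new session."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

A session could then start with something like agent.conversation_history = load_history(Path("session_123.json")) and call save_history after each run; the file name here is just an example.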

Conclusion

You now have a fully working AI agent built on Claude's API: one that can reason across multiple steps, call tools, and loop until it gets to a real answer. That's the core architecture behind everything from intelligent chatbots to automated research workflows to multi-step business process automation.

If you're a business in Southwest Florida trying to figure out how to put something like this to work, whether that's automating repetitive tasks, building a smarter customer-facing assistant, or connecting AI to your existing data, that's exactly what we do at Naples AI. We build custom AI solutions for real businesses with real workflows, and we'd love to talk through what's possible for yours. Book a free 30-minute call with Chris here and let's figure out what makes sense for your operation.