How to Build a Customer Support AI Agent with Claude API (Step-by-Step)

What You'll Build

By the end of this tutorial, you'll have a working customer support AI agent built on the Claude API that can create support tickets, search a knowledge base, and escalate issues to a human agent, all from a single conversation loop. This isn't a toy demo. The ticket store and knowledge base are in-memory stand-ins, but the agent loop itself is a pattern you can carry into a real business once you wire in your own systems.

We're using Python and the official Anthropic SDK, and every line of code here runs. I built this same architecture for clients here in Southwest Florida, and I'm sharing exactly what worked.

📦 Full Source Code
The complete, working source code for this agent is built out step by step in the sections below. Each snippet builds on the last — by Step 5, you'll have the entire agent assembled and tested with real support scenarios.

Prerequisites

Python 3.9 or higher installed
An Anthropic API key (get one at console.anthropic.com)
Basic familiarity with Python classes and functions
anthropic Python SDK installed (pip install anthropic)
A .env file or environment variable set for ANTHROPIC_API_KEY

Step 1: Set Up Your Claude API Environment

First, install the dependencies and get your client initialized. I keep the client setup in one place (it moves into the agent class in Step 3) so it's easy to swap or mock in tests later.

Create a new file called support_agent.py and start with this:

support_agent.py

import os
import json
import uuid
from datetime import datetime
from anthropic import Anthropic

# Fail fast if the key is missing, before building the client
if not os.environ.get("ANTHROPIC_API_KEY"):
    raise EnvironmentError(
        "ANTHROPIC_API_KEY is not set. Export it before running: "
        "export ANTHROPIC_API_KEY='your-key-here'"
    )

# Initialize the Anthropic client using your API key from the environment
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

MODEL = "claude-sonnet-4-6"

print("✓ Claude client initialized successfully")

Run this file on its own first. If you see the checkmark, the key is loaded and the client object was constructed. No request has been sent yet; the first real API call happens in Step 3.

💡 Tip: Never hardcode your API key in source files. Use python-dotenv and a .env file locally, or inject it via environment variables in production. One accidental git push and that key is compromised.
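
If you'd rather not add a dependency yet, the idea behind a .env loader is small enough to sketch with the standard library. The helper below, load_env_file, is a hypothetical stand-in that only handles plain KEY=VALUE lines; python-dotenv covers quoting, interpolation, and other edge cases this sketch ignores.

```python
import os
from pathlib import Path


def load_env_file(path=".env", environ=None):
    """Copy KEY=VALUE pairs from a file into environ without overwriting.

    Hypothetical stand-in for a real loader such as python-dotenv.
    Returns the names that were newly set.
    """
    environ = os.environ if environ is None else environ
    env_path = Path(path)
    if not env_path.is_file():
        return []

    loaded = []
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        # Values already exported in the shell take precedence
        if key and key not in environ:
            environ[key] = value
            loaded.append(key)
    return loaded
```

Call load_env_file() once at the top of support_agent.py, before the API key check, and keep .env listed in .gitignore.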

Step 2: Define Support Agent Tools

This is where it gets interesting. Claude's tool use feature lets the model call functions in your code — it decides when to use them based on the conversation. We're defining three tools: create_ticket, search_knowledge_base, and escalate_to_human.

Add this to your support_agent.py file:

support_agent.py (tool definitions)

TOOLS = [
    {
        "name": "create_ticket",
        "description": (
            "Creates a new support ticket when a customer reports an issue "
            "that requires tracking or follow-up. Use this when the issue "
            "cannot be resolved immediately in the conversation."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "subject": {
                    "type": "string",
                    "description": "A short, descriptive title for the ticket"
                },
                "description": {
                    "type": "string",
                    "description": "Full description of the customer's issue"
                },
                "priority": {
                    "type": "string",
                    "enum": ["low", "medium", "high", "urgent"],
                    "description": "Priority level based on impact and urgency"
                },
                "category": {
                    "type": "string",
                    "enum": ["billing", "technical", "shipping", "account", "general"],
                    "description": "The category that best matches the issue"
                }
            },
            "required": ["subject", "description", "priority", "category"]
        }
    },
    {
        "name": "search_knowledge_base",
        "description": (
            "Searches the internal knowledge base for articles, FAQs, and "
            "documented solutions. Use this before creating a ticket — the "
            "answer might already exist."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query to look up in the knowledge base"
                },
                "category": {
                    "type": "string",
                    "enum": ["billing", "technical", "shipping", "account", "general", "all"],
                    "description": "Filter results by category, or use 'all' for broad search"
                }
            },
            "required": ["query", "category"]
        }
    },
    {
        "name": "escalate_to_human",
        "description": (
            "Escalates the conversation to a human support agent. Use this when "
            "the customer is frustrated, the issue is complex, involves legal or "
            "financial risk, or the customer explicitly requests a human."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "Why this is being escalated — be specific"
                },
                "urgency": {
                    "type": "string",
                    "enum": ["normal", "high", "immediate"],
                    "description": "How quickly a human needs to respond"
                },
                "summary": {
                    "type": "string",
                    "description": "Brief summary of the conversation so far for the human agent"
                }
            },
            "required": ["reason", "urgency", "summary"]
        }
    }
]

The input_schema block is what Claude uses to understand exactly what parameters each tool expects. The more precise your descriptions, the better Claude picks the right tool at the right time.
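
Claude usually follows the schema, but nothing stops you from checking the input yourself before a handler runs. The sketch below is a hand-rolled check for required fields and enum values only. validate_tool_input is a hypothetical helper, not part of the Anthropic SDK, and a full JSON Schema validator would cover far more.

```python
def validate_tool_input(input_schema, tool_input):
    """Return a list of problems; an empty list means the input looks usable.

    Only checks 'required' and 'enum', the two constraints the tools
    in this tutorial rely on.
    """
    problems = []
    properties = input_schema.get("properties", {})

    for name in input_schema.get("required", []):
        if name not in tool_input:
            problems.append(f"missing required field: {name}")

    for name, value in tool_input.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}, got {value!r}")
    return problems


# Trimmed copy of the escalate_to_human schema, for illustration
escalate_schema = {
    "type": "object",
    "properties": {
        "reason": {"type": "string"},
        "urgency": {"type": "string", "enum": ["normal", "high", "immediate"]},
        "summary": {"type": "string"},
    },
    "required": ["reason", "urgency", "summary"],
}

print(validate_tool_input(escalate_schema, {"reason": "asked for a human", "urgency": "asap"}))
```

The call at the bottom reports the missing summary and the invalid urgency. One option is to send those problems back to Claude as the tool result so it can correct the call, instead of letting a KeyError surface inside the handler.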

Step 3: Build the Main Agent Loop with Tool Use

Now we build the core of the agent — the class that wraps the conversation loop, handles tool calls, and routes them to real Python functions. This is the pattern I use in production.

support_agent.py (agent class and tool handlers)

class CustomerSupportAgent:
    def __init__(self):
        self.client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
        self.conversation_history = []
        self.tickets = []  # In production, this would be your ticketing system
        self.model = MODEL  # model ID constant defined in Step 1
        self.system_prompt = """You are a helpful customer support agent for Acme Corp.
Your job is to resolve customer issues efficiently and empathetically.

Guidelines:
- Always search the knowledge base before creating a ticket
- Create tickets for issues that need follow-up or can't be resolved now
- Escalate when a customer is upset, requests a human, or the issue is high-risk
- Be concise — customers want answers, not paragraphs
- Confirm ticket and escalation details with the customer"""

    def _execute_tool(self, tool_name: str, tool_input: dict) -> str:
        """Routes tool calls to their implementations and returns a result string."""
        if tool_name == "create_ticket":
            return self._create_ticket(tool_input)
        elif tool_name == "search_knowledge_base":
            return self._search_knowledge_base(tool_input)
        elif tool_name == "escalate_to_human":
            return self._escalate_to_human(tool_input)
        else:
            return json.dumps({"error": f"Unknown tool: {tool_name}"})

    def _create_ticket(self, params: dict) -> str:
        ticket_id = f"TKT-{str(uuid.uuid4())[:8].upper()}"
        ticket = {
            "id": ticket_id,
            "subject": params["subject"],
            "description": params["description"],
            "priority": params["priority"],
            "category": params["category"],
            "status": "open",
            "created_at": datetime.now().isoformat()
        }
        self.tickets.append(ticket)
        return json.dumps({
            "success": True,
            "ticket_id": ticket_id,
            "message": f"Ticket {ticket_id} created successfully",
            "estimated_response": "2-4 business hours for medium priority"
        })

    def _search_knowledge_base(self, params: dict) -> str:
        # Simulated KB — replace with your actual vector search or CMS lookup
        knowledge_base = {
            "password reset": {
                "title": "How to Reset Your Password",
                "content": "Visit /account/reset, enter your email, and follow the link sent. Links expire in 15 minutes.",
                "category": "account"
            },
            "refund policy": {
                "title": "Refund and Return Policy",
                "content": "Full refunds available within 30 days of purchase. Contact support for exceptions.",
                "category": "billing"
            },
            "shipping time": {
                "title": "Standard Shipping Estimates",
                "content": "Standard shipping is 5-7 business days. Expedited is 2-3 days. International is 10-14 days.",
                "category": "shipping"
            },
            "cancel subscription": {
                "title": "How to Cancel Your Subscription",
                "content": "Go to Settings > Billing > Cancel Plan. Cancellation takes effect at end of the billing period.",
                "category": "billing"
            }
        }

        query_lower = params["query"].lower()
        results = []

        for key, article in knowledge_base.items():
            # Simple keyword matching — swap for semantic search in production
            if any(word in query_lower for word in key.split()):
                if params["category"] == "all" or article["category"] == params["category"]:
                    results.append(article)

        if results:
            return json.dumps({
                "found": True,
                "results": results,
                "count": len(results)
            })
        return json.dumps({
            "found": False,
            "results": [],
            "message": "No matching articles found. Consider creating a ticket."
        })

    def _escalate_to_human(self, params: dict) -> str:
        escalation_id = f"ESC-{str(uuid.uuid4())[:8].upper()}"
        return json.dumps({
            "success": True,
            "escalation_id": escalation_id,
            "reason": params["reason"],
            "urgency": params["urgency"],
            "summary": params["summary"],
            "message": "A human agent has been notified and will respond shortly.",
            "estimated_wait": "Under 5 minutes for high urgency"
        })

    def chat(self, user_message: str) -> str:
        """Processes a user message through the full agent loop."""
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })

        while True:
            response = self.client.messages.create(
                model=self.model,
                max_tokens=1024,
                system=self.system_prompt,
                tools=TOOLS,
                messages=self.conversation_history
            )

            # If Claude is done and has a final text response
            if response.stop_reason == "end_turn":
                # Join every text block; content[0] is not guaranteed to be text
                assistant_message = "".join(
                    block.text for block in response.content if block.type == "text"
                )
                self.conversation_history.append({
                    "role": "assistant",
                    "content": response.content
                })
                return assistant_message

            # If Claude wants to use a tool, process all tool calls in this response
            if response.stop_reason == "tool_use":
                self.conversation_history.append({
                    "role": "assistant",
                    "content": response.content
                })

                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        print(f"  🔧 Tool called: {block.name} | Input: {json.dumps(block.input, indent=2)}")
                        result = self._execute_tool(block.name, block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })

                # Feed all tool results back into the conversation
                self.conversation_history.append({
                    "role": "user",
                    "content": tool_results
                })
                # Loop again — Claude will now generate a response using the tool results

            else:
                # Unexpected stop reason — break to avoid infinite loop
                break

        return "I'm sorry, something went wrong. Please try again."

The while True loop is the key pattern here. Claude might call multiple tools before it has enough information to respond. The loop keeps running until stop_reason is end_turn, meaning Claude is ready to reply, or until an unexpected stop reason (such as max_tokens) breaks out and returns the fallback apology.
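
One thing while True does not give you is an upper bound: if the model keeps requesting tools, the loop keeps paying for API calls. A minimal sketch of a capped loop follows. run_tool_loop, call_model, and max_turns are names I'm assuming for illustration, and the responses are faked with SimpleNamespace so the example runs without an API key.

```python
from types import SimpleNamespace


def run_tool_loop(call_model, execute_tool, messages, max_turns=5):
    """Run the tool-use loop, giving up after max_turns model calls."""
    for _ in range(max_turns):
        response = call_model(messages)
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # Final answer (or an unexpected stop): return whatever text exists
            return "".join(b.text for b in response.content if b.type == "text")

        # Answer every tool_use block in one user message
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})

    return "I couldn't finish that request. Let me connect you with a person."


# Fake model: asks for one tool, then answers
scripted = iter([
    SimpleNamespace(
        stop_reason="tool_use",
        content=[SimpleNamespace(type="tool_use", id="toolu_1",
                                 name="search_knowledge_base",
                                 input={"query": "refund", "category": "all"})],
    ),
    SimpleNamespace(
        stop_reason="end_turn",
        content=[SimpleNamespace(type="text", text="Refunds are available for 30 days.")],
    ),
])

history = [{"role": "user", "content": "Can I get a refund?"}]
reply = run_tool_loop(lambda msgs: next(scripted), lambda name, args: '{"found": true}', history)
print(reply)
```

When the cap is hit, the history still ends on a tool_result message, so every tool call has been answered and the conversation stays valid for the next turn.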

Step 4: Implement Error Handling and Conversation Context

Production agents need to survive bad inputs, API timeouts, and weird edge cases. Here's how I wrap the agent with proper error handling and add a context-aware runner:

support_agent.py (error handling and runner)

# In the assembled file, move this import to the top with the others
import anthropic

def safe_chat(agent: CustomerSupportAgent, user_message: str) -> str:
    """Wraps the agent's chat method with error handling for production use."""
    try:
        return agent.chat(user_message)

    except anthropic.AuthenticationError:
        return "Error: Invalid API key. Check your ANTHROPIC_API_KEY environment variable."

    except anthropic.RateLimitError:
        return "Error: Rate limit hit. Back off and retry in a few seconds."

    except anthropic.APIConnectionError:
        return "Error: Could not connect to Claude API. Check your internet connection."

    except anthropic.BadRequestError as e:
        # A 400 can mean the history outgrew the context window, or that the
        # message list is malformed (e.g. a tool_use with no matching tool_result)
        return f"Error: Bad request ({e}). Consider clearing conversation history."

    except Exception as e:
        # Catch-all for anything unexpected
        return f"Unexpected error: {str(e)}"


def run_support_session():
    """Runs an interactive support session in the terminal."""
    agent = CustomerSupportAgent()
    print("=" * 60)
    print("  Acme Corp Customer Support — Powered by Claude")
    print("  Type 'quit' to exit | Type 'history' to see ticket log")
    print("=" * 60)
    print()

    while True:
        user_input = input("You: ").strip()

        if not user_input:
            continue

        if user_input.lower() == "quit":
            print("Session ended. Goodbye!")
            break

        if user_input.lower() == "history":
            if agent.tickets:
                print("\n📋 Tickets created this session:")
                for t in agent.tickets:
                    print(f"  [{t['id']}] {t['subject']} — {t['priority']} priority")
            else:
                print("No tickets created yet.")
            print()
            continue

        print("Agent: ", end="", flush=True)
        response = safe_chat(agent, user_input)
        print(response)
        print()


if __name__ == "__main__":
    run_support_session()
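
The rate-limit branch above tells the caller to back off and retry, so here is one way that could look. The Anthropic SDK already retries some failures on its own (configurable through max_retries on the client), so treat this as a sketch for wrapping your own call sites. retry_with_backoff and its parameters are hypothetical names, and the demo uses a stand-in exception class instead of anthropic.RateLimitError so it runs offline.

```python
import random
import time


def retry_with_backoff(fn, retry_on, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exception types with exponential backoff.

    Delays grow as base_delay * 2**attempt plus a little random jitter.
    The final failure is re-raised so the caller can still handle it.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
            sleep(delay)


class FakeRateLimitError(Exception):
    """Stand-in for anthropic.RateLimitError in this offline demo."""


attempts = {"count": 0}


def flaky_call():
    # Fails twice, then succeeds
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("429")
    return "ok"


waits = []
result = retry_with_backoff(flaky_call, FakeRateLimitError, sleep=waits.append)
print(result, len(waits))
```

If you adopt something like this, wrap the messages.create call inside chat and not chat itself: chat appends the user message to conversation_history before calling the API, so retrying the whole method would add the same message twice.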

Step 5: Test with Real Support Scenarios

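Here's how to run automated tests against the agent so you can verify all three tools work before you ship anything. These mirror the kinds of conversations real customers have. They call the live API, so they need ANTHROPIC_API_KEY set, and because model output varies between runs the assertions are deliberately loose.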

test_support_agent.py

import os
from support_agent import CustomerSupportAgent, safe_chat

def test_knowledge_base_lookup():
    """Test that the agent finds KB articles before creating a ticket."""
    agent = CustomerSupportAgent()
    response = safe_chat(agent, "How do I reset my password?")
    print("\n[TEST 1] Knowledge Base Lookup")
    print(f"Response: {response}")
    assert "reset" in response.lower() or "password" in response.lower(), \
        "Expected password reset info in response"
    print("✓ PASSED\n")

def test_ticket_creation():
    """Test that the agent creates a ticket for unresolved issues."""
    agent = CustomerSupportAgent()
    response = safe_chat(
        agent,
        "My order #77341 has been stuck in processing for 3 weeks "
        "and nobody is responding to my emails."
    )
    print("[TEST 2] Ticket Creation")
    print(f"Response: {response}")
    # Check that a ticket was actually created in the agent's ticket log
    assert len(agent.tickets) > 0, "Expected at least one ticket to be created"
    print(f"Ticket created: {agent.tickets[0]['id']}")
    print("✓ PASSED\n")

def test_escalation():
    """Test that the agent escalates when the customer requests a human."""
    agent = CustomerSupportAgent()
    response = safe_chat(
        agent,
        "I've been dealing with this billing issue for a month. "
        "I'm extremely frustrated and I want to speak to a real person right now."
    )
    print("[TEST 3] Escalation")
    print(f"Response: {response}")
    assert any(
        word in response.lower()
        for word in ["human", "agent", "escalat", "transfer", "notified"]
    ), "Expected escalation confirmation in response"
    print("✓ PASSED\n")

def test_multi_turn_context():
    """Test that the agent remembers context across multiple turns."""
    agent = CustomerSupportAgent()

    # Turn 1: set up context
    safe_chat(agent, "Hi, I'm having trouble with my subscription cancellation.")

    # Turn 2: follow-up that requires memory of turn 1
    response = safe_chat(agent, "I tried what you suggested but it still shows as active.")
    print("[TEST 4] Multi-Turn Context")
    print(f"Response: {response}")
    assert len(agent.conversation_history) >= 4, \
        "Expected conversation history to have at least 4 entries"
    print("✓ PASSED\n")


if __name__ == "__main__":
    print("Running support agent tests...\n")
    test_knowledge_base_lookup()
    test_ticket_creation()
    test_escalation()
    test_multi_turn_context()
    print("All tests passed ✓")

Here's roughly what the output looks like when you run this (exact wording and IDs will differ from run to run):

Sample Output

Running support agent tests...

[TEST 1] Knowledge Base Lookup
  🔧 Tool called: search_knowledge_base | Input: {
    "query": "reset password",
    "category": "account"
  }
Response: I found the answer in our knowledge base! To reset your password,
visit /account/reset, enter your email address, and click the link we send you.
Note that the reset link expires after 15 minutes, so use it promptly.
✓ PASSED

[TEST 2] Ticket Creation
  🔧 Tool called: search_knowledge_base | Input: {
    "query": "order stuck processing",
    "category": "shipping"
  }
  🔧 Tool called: create_ticket | Input: {
    "subject": "Order #77341 stuck in processing for 3 weeks",
    "description": "Customer reports order #77341 has been in processing status...",
    "priority": "high",
    "category": "shipping"
  }
Response: I'm sorry for the frustration with order #77341. I've created a high-priority
ticket (TKT-A3F9B2C1) for your issue. Our team will follow up within 2-4 hours.
Ticket created: TKT-A3F9B2C1
✓ PASSED

[TEST 3] Escalation
  🔧 Tool called: escalate_to_human | Input: {
    "reason": "Customer explicitly requested human agent, high frustration level",
    "urgency": "high",
    "summary": "Customer has had unresolved billing issue for one month..."
  }
Response: I completely understand your frustration, and I've immediately connected
you with a human support agent. They've been notified and will reach out within
5 minutes. Escalation ID: ESC-D71F3A9B.
✓ PASSED

[TEST 4] Multi-Turn Context
Response: I understand — let's dig into this further. Since your subscription
is still showing as active after following the cancellation steps, there may be
a sync issue on our end. Let me create a ticket so our billing team can
manually review and confirm your cancellation.
✓ PASSED

All tests passed ✓
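
Because those tests depend on live model output, it can help to pair them with checks that never touch the API. The sketch below assumes the keyword matching from _search_knowledge_base has been pulled out into a standalone function; search_articles and the two-entry KB dict are hypothetical, trimmed copies for illustration.

```python
def search_articles(knowledge_base, query, category="all"):
    """Same keyword matching as the tutorial's KB search, minus the JSON wrapper."""
    query_lower = query.lower()
    results = []
    for key, article in knowledge_base.items():
        # A hit is any word of the article key appearing in the query
        if any(word in query_lower for word in key.split()):
            if category == "all" or article["category"] == category:
                results.append(article)
    return results


KB = {
    "password reset": {"title": "How to Reset Your Password", "category": "account"},
    "refund policy": {"title": "Refund and Return Policy", "category": "billing"},
}


def test_finds_article_by_keyword():
    hits = search_articles(KB, "I need to reset my password")
    assert [a["title"] for a in hits] == ["How to Reset Your Password"]


def test_category_filter_excludes_other_categories():
    assert search_articles(KB, "refund please", category="account") == []


def test_substring_matching_is_loose():
    # "reset" also matches inside "presets": a known limit of keyword matching
    assert len(search_articles(KB, "where are my presets")) == 1


test_finds_article_by_keyword()
test_category_filter_excludes_other_categories()
test_substring_matching_is_loose()
print("offline checks passed")
```

The last check documents a weakness instead of hiding it: substring matching returns false positives, which is one reason to swap in semantic search before relying on the results.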

How It Works

Here's the plain-English version of what's happening under the hood. When you send a message, it gets added to conversation_history and sent to Claude along with the tool definitions. Claude reads the message, decides whether it needs more information (a KB lookup) or needs to take action (create a ticket), and if so returns a response containing a tool_use block, with stop_reason set to tool_use, rather than a final answer.
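
To make the data shapes concrete, here is roughly what conversation_history holds at that moment, written as plain dicts. The agent above stores the SDK's block objects, not dicts, so in real code you would read block.type and block.id as attributes. pending_tool_calls and the toolu_01 ID are hypothetical, for illustration only.

```python
def pending_tool_calls(history):
    """Return IDs of tool_use blocks that have no tool_result yet."""
    requested, answered = [], set()
    for message in history:
        content = message["content"]
        if isinstance(content, str):
            continue  # plain text messages carry no tool blocks
        for block in content:
            if block["type"] == "tool_use":
                requested.append(block["id"])
            elif block["type"] == "tool_result":
                answered.add(block["tool_use_id"])
    return [tool_id for tool_id in requested if tool_id not in answered]


history = [
    {"role": "user", "content": "How do I reset my password?"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_01",
                "name": "search_knowledge_base",
                "input": {"query": "reset password", "category": "account"},
            }
        ],
    },
]

print(pending_tool_calls(history))  # ['toolu_01']
```

Every ID in that list has to be answered in the next user message. If a handler raises before its result is appended, the history is left with an unanswered tool call and later requests will be rejected.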

Your code executes that tool, gets a result, and feeds it back to Claude as a tool_result message. Claude then uses