What You'll Build
If you've ever wasted an hour chasing a lead that was never going to convert, this tutorial is for you. You're going to build a fully working AI lead qualifier agent in Python that automatically scores incoming prospects, validates their contact info, and gives you a clear qualified or rejected decision — no manual review needed.
The agent uses Claude's tool-use API to call custom functions, reason about the results, and return a structured lead score with an explanation. By the end, you'll have production-ready code you can drop into any CRM pipeline or web form backend.
The complete, working code for this project is built up step by step in the sections below. Every snippet connects to the next, and by Step 5 you'll have the entire agent assembled and tested. No GitHub account is required; just copy as you go.
Prerequisites
- Python 3.10 or higher installed
- An Anthropic API key (get one at console.anthropic.com)
- Basic familiarity with Python functions and classes
- anthropic and python-dotenv packages installed (pip install anthropic python-dotenv)
- A .env file with ANTHROPIC_API_KEY=your_key_here
Step 1: Set Up Your Claude API Client and Environment
First, let's get the foundation in place. Create a new project folder and add a .env file with your API key. Then install the dependencies if you haven't already.
Here's the base setup file. It handles environment loading and gives you a get_client() helper that fails with a clear error when the key is missing. The agent in Step 3 builds its client from the same environment variable.
setup.py

import os

from anthropic import Anthropic
from dotenv import load_dotenv

# Load your ANTHROPIC_API_KEY from the .env file
load_dotenv()


def get_client() -> Anthropic:
    api_key = os.getenv("ANTHROPIC_API_KEY")
    if not api_key:
        raise ValueError("ANTHROPIC_API_KEY not found. Check your .env file.")
    return Anthropic(api_key=api_key)


# Quick sanity check: run this file directly to confirm the key loads
if __name__ == "__main__":
    client = get_client()
    print("Client initialized successfully.")
Run python setup.py to confirm your key loads without errors. If you see the success message, you're ready to move on.
Step 2: Define Tool Functions for Lead Qualification (Email, Phone, Industry)
This is where the agent gets its abilities. Claude doesn't call external APIs on its own — you define tools with a schema, and Claude decides when to use them based on context. Think of tools as the hands that let Claude reach into your business logic.
We're defining three tools: email validation, phone number formatting, and industry scoring. Each one runs real Python logic, not fake data.
tools.py

import re


# --- Tool: Validate Email Address ---
def validate_email(email: str) -> dict:
    """
    Checks if an email address follows a valid format.
    Returns a dict with is_valid (bool) and a reason string.
    """
    pattern = r"^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$"
    is_valid = bool(re.match(pattern, email.strip()))
    return {
        "email": email,
        "is_valid": is_valid,
        "reason": "Valid email format." if is_valid else "Email format is invalid or missing domain."
    }
# --- Tool: Validate and Normalize Phone Number ---
def validate_phone(phone: str) -> dict:
    """
    Strips non-numeric characters and checks that the result is a US number:
    10 digits, or 11 digits with a leading country code of 1.
    """
    digits_only = re.sub(r"\D", "", phone)
    # An 11-digit US number only makes sense when the extra digit is the country code
    is_valid = len(digits_only) == 10 or (len(digits_only) == 11 and digits_only.startswith("1"))
    return {
        "phone": phone,
        "normalized": digits_only,
        "is_valid": is_valid,
        "reason": "Valid US phone number." if is_valid
        else f"Expected 10 digits, or 11 starting with 1; got {len(digits_only)} digits."
    }
# --- Tool: Score Industry Fit ---
def score_industry(industry: str) -> dict:
    """
    Returns a fit score (0-100) based on how well the industry aligns
    with Naples AI's core service verticals.
    """
    # High-value verticals we serve directly
    high_fit = {
        "real estate": 95,
        "healthcare": 90,
        "restaurant": 80,
        "restaurants": 80,
        "automotive": 85,
        "car dealership": 85,
        "manufacturing": 88,
        "legal": 75,
        "finance": 78,
    }
    # Moderate fit: we can help but it's not our primary focus
    moderate_fit = {
        "retail": 60,
        "e-commerce": 62,
        "education": 55,
        "nonprofit": 50,
        "construction": 65,
    }
    industry_lower = industry.strip().lower()
    if industry_lower in high_fit:
        score = high_fit[industry_lower]
        tier = "high"
    elif industry_lower in moderate_fit:
        score = moderate_fit[industry_lower]
        tier = "moderate"
    else:
        score = 30
        tier = "low"
    return {
        "industry": industry,
        "fit_score": score,
        "fit_tier": tier,
        "reason": f"{industry} is a {tier}-fit industry for AI automation services."
    }
These functions are deliberately simple — they run fast and return clean dictionaries that Claude can read and reason about. You can extend them later with real third-party APIs like Twilio Lookup or ZeroBounce.
Step 3: Build the Agent Loop with Tool Use
This is the core of the whole project. The agent loop sends a lead to Claude, receives tool call requests back, runs the matching Python functions, feeds the results back to Claude, and repeats until Claude has everything it needs to make a decision.
Claude uses tool_use content blocks to tell you which tool to call and with what arguments. Your job is to route those calls to the right Python function and return the result as a tool_result message.
agent.py

import json
import os

from anthropic import Anthropic
from dotenv import load_dotenv

from tools import validate_email, validate_phone, score_industry

load_dotenv()

# --- Tool Schema Definitions for Claude ---
TOOL_DEFINITIONS = [
    {
        "name": "validate_email",
        "description": "Validates whether an email address has a correct format.",
        "input_schema": {
            "type": "object",
            "properties": {
                "email": {
                    "type": "string",
                    "description": "The email address to validate."
                }
            },
            "required": ["email"]
        }
    },
    {
        "name": "validate_phone",
        "description": "Validates and normalizes a US phone number.",
        "input_schema": {
            "type": "object",
            "properties": {
                "phone": {
                    "type": "string",
                    "description": "The phone number to validate (any format)."
                }
            },
            "required": ["phone"]
        }
    },
    {
        "name": "score_industry",
        "description": "Returns an industry fit score (0-100) based on alignment with AI automation service verticals.",
        "input_schema": {
            "type": "object",
            "properties": {
                "industry": {
                    "type": "string",
                    "description": "The industry the lead works in."
                }
            },
            "required": ["industry"]
        }
    }
]

# Maps tool names to their actual Python functions
TOOL_REGISTRY = {
    "validate_email": validate_email,
    "validate_phone": validate_phone,
    "score_industry": score_industry,
}
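def run_tool(tool_name: str, tool_input: dict) -> str:
    """Calls the correct tool function and returns the result as a JSON string."""
    if tool_name not in TOOL_REGISTRY:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    try:
        result = TOOL_REGISTRY[tool_name](**tool_input)
    except TypeError as exc:
        # Claude sent arguments the function doesn't accept; report it instead of crashing
        return json.dumps({"error": f"Bad arguments for {tool_name}: {exc}"})
    return json.dumps(result)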
class LeadQualifierAgent:
    def __init__(self):
        self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-6"
        # Safety cap on request/response rounds (the value 10 is an arbitrary choice)
        self.max_turns = 10
        self.system_prompt = """You are a lead qualification agent for Naples AI, an AI solutions agency in Southwest Florida.
Your job is to evaluate incoming sales leads and decide whether they should be qualified or rejected.

For every lead, you MUST:
1. Call validate_email with their email address
2. Call validate_phone with their phone number
3. Call score_industry with their industry

After collecting all three results, provide a final structured assessment with:
- overall_score: a number from 0-100
- decision: either "QUALIFIED" or "REJECTED"
- summary: a 1-2 sentence plain-English explanation of your decision

Reject a lead if:
- Email is invalid
- Phone is invalid
- Industry fit score is below 40

Otherwise qualify them, weighting industry fit heavily."""

    def qualify_lead(self, lead: dict) -> dict:
        """
        Runs the full agent loop for a single lead.
        Returns a dict with the agent's final decision.
        """
        # Build the initial user message with lead details
        user_message = f"""Please qualify this lead:
Name: {lead.get('name', 'Unknown')}
Email: {lead.get('email', '')}
Phone: {lead.get('phone', '')}
Industry: {lead.get('industry', '')}
Company: {lead.get('company', 'Unknown')}
Monthly Budget: ${lead.get('budget', 0):,}"""
        messages = [{"role": "user", "content": user_message}]

        # Agent loop: keeps running until Claude stops requesting tools
        for _ in range(self.max_turns):
            response = self.client.messages.create(
                model=self.model,
                max_tokens=1024,
                system=self.system_prompt,
                tools=TOOL_DEFINITIONS,
                messages=messages
            )
            # Append Claude's response to the conversation
            messages.append({"role": "assistant", "content": response.content})

            # If Claude is done using tools, extract the final text response
            if response.stop_reason == "end_turn":
                final_text = ""
                for block in response.content:
                    if hasattr(block, "text"):
                        final_text = block.text
                        break
                return {
                    "lead": lead,
                    "agent_response": final_text,
                    "raw_messages": messages
                }

            # If Claude wants to use tools, process each tool call
            if response.stop_reason == "tool_use":
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        result = run_tool(block.name, block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })
                # Feed all tool results back to Claude in one message
                messages.append({"role": "user", "content": tool_results})
                continue

            # Any other stop reason (for example "max_tokens") would otherwise loop forever
            raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")

        raise RuntimeError("Agent did not finish within the turn limit.")
The loop exits naturally when Claude returns stop_reason: "end_turn", which happens once it has all the tool data it needs and has written its final answer. You don't need to manage any state manually — the conversation history in messages handles everything.
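To make the shape of that history concrete, here is an illustrative sketch of messages after one tool round. It uses plain dicts for readability (the agent stores the SDK's content objects for assistant turns), and the ids, values, and the helper unanswered_tool_calls are made up for this example.

```python
# Illustrative conversation history after one tool round (ids and values are made up).
messages = [
    {"role": "user", "content": "Please qualify this lead: ..."},
    {
        "role": "assistant",
        "content": [
            {"type": "tool_use", "id": "toolu_demo_1", "name": "validate_email",
             "input": {"email": "lead@example.com"}},
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_demo_1",
             "content": '{"email": "lead@example.com", "is_valid": true}'},
        ],
    },
]


def unanswered_tool_calls(history: list[dict]) -> set[str]:
    """Return ids of tool_use blocks that have no matching tool_result yet."""
    requested, answered = set(), set()
    for message in history:
        content = message["content"]
        if isinstance(content, str):
            continue  # plain text turns carry no tool blocks
        for block in content:
            if block["type"] == "tool_use":
                requested.add(block["id"])
            elif block["type"] == "tool_result":
                answered.add(block["tool_use_id"])
    return requested - answered


print(unanswered_tool_calls(messages))  # set()
```

Every tool_use id in an assistant turn is answered by a tool_result with the same id in the next user turn, which is the pairing the agent loop maintains.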
Step 4: Implement Lead Scoring Logic
Now we tie everything together with a scoring wrapper that processes multiple leads in a batch and formats the output cleanly. This is the layer your CRM or web backend would actually call.
We're also adding a simple parser here that extracts the structured decision from Claude's free-text response so you get a machine-readable result alongside the human-readable explanation.
lead_scorer.py

import re
import json

from agent import LeadQualifierAgent


def parse_decision_from_response(agent_response: str) -> dict:
    """
    Extracts overall_score, decision, and summary from Claude's text response.
    Falls back gracefully if the format isn't exactly as expected.
    """
    parsed = {
        "overall_score": None,
        "decision": "UNKNOWN",
        "summary": agent_response
    }
    # Look for a score like "overall_score: 82" or "**Overall Score:** 82"
    score_match = re.search(r"overall[_\s]score\W*(\d+)", agent_response, re.IGNORECASE)
    if score_match:
        parsed["overall_score"] = int(score_match.group(1))

    # Prefer an explicit "decision: X" line. Otherwise fall back to the first
    # standalone uppercase keyword, matched as a whole word so that
    # "DISQUALIFIED" or "UNQUALIFIED" is never read as "QUALIFIED".
    decision_match = (
        re.search(r"decision\W*(QUALIFIED|REJECTED)\b", agent_response, re.IGNORECASE)
        or re.search(r"\b(QUALIFIED|REJECTED)\b", agent_response)
    )
    if decision_match:
        parsed["decision"] = decision_match.group(1).upper()

    # Try to find a summary sentence after "summary:" (up to the end of that line)
    summary_match = re.search(r"summary[:\s]+(.+?)(?:\n|$)", agent_response, re.IGNORECASE)
    if summary_match:
        parsed["summary"] = summary_match.group(1).strip()
    return parsed
def score_leads(leads: list[dict]) -> list[dict]:
    """
    Runs the lead qualifier agent on a list of leads.
    Returns a list of results with parsed decisions.
    """
    agent = LeadQualifierAgent()
    results = []
    for i, lead in enumerate(leads, 1):
        print(f"\n{'='*50}")
        print(f"Processing lead {i}/{len(leads)}: {lead.get('name', 'Unknown')}")
        print(f"{'='*50}")
        result = agent.qualify_lead(lead)
        parsed = parse_decision_from_response(result["agent_response"])
        final_result = {
            "name": lead.get("name"),
            "company": lead.get("company"),
            "decision": parsed["decision"],
            "overall_score": parsed["overall_score"],
            "summary": parsed["summary"],
            "raw_agent_response": result["agent_response"]
        }
        results.append(final_result)
        # Print a clean summary to the console
        print(f"\nDecision: {final_result['decision']}")
        print(f"Score: {final_result['overall_score']}")
        print(f"Summary: {final_result['summary']}")
    return results
Step 5: Test with Real Lead Data
Let's run the agent against a realistic batch of leads — some should qualify, some should get rejected. This is what you'd actually test before plugging this into a live form submission handler.
Run this file directly and watch the agent work through each lead one by one.
main.py

import json

from lead_scorer import score_leads

# Sample leads: mix of good, borderline, and bad.
# Email addresses are placeholders on example.com.
TEST_LEADS = [
    {
        "name": "Maria Gonzalez",
        "email": "maria@example.com",
        "phone": "(239) 555-0182",
        "industry": "real estate",
        "company": "Naples Properties Group",
        "budget": 3500
    },
    {
        "name": "James Whitfield",
        "email": "jwhitfield@not-an-email",  # invalid email
        "phone": "239-555-0041",
        "industry": "healthcare",
        "company": "Gulf Coast Medical",
        "budget": 5000
    },
    {
        "name": "Tony Russo",
        "email": "tony@example.com",
        "phone": "23955501",  # too short, so the phone is invalid
        "industry": "car dealership",
        "company": "Russo Auto Group",
        "budget": 2800
    },
    {
        "name": "Sandra Park",
        "email": "sandra@example.com",
        "phone": "2395550093",
        "industry": "restaurant",
        "company": "Bonita Bistro",
        "budget": 1200
    },
    {
        "name": "Derek Olson",
        "email": "derek@example.com",
        "phone": "2395550317",
        "industry": "hobby shop",  # low-fit industry
        "company": "Olson Craft Supply",
        "budget": 400
    }
]

if __name__ == "__main__":
    print("Naples AI — Lead Qualifier Agent")
    print("Running qualification on 5 test leads...\n")
    results = score_leads(TEST_LEADS)

    print("\n\n" + "="*50)
    print("FINAL RESULTS SUMMARY")
    print("="*50)
    qualified = [r for r in results if r["decision"] == "QUALIFIED"]
    rejected = [r for r in results if r["decision"] == "REJECTED"]

    print(f"\n✅ Qualified: {len(qualified)}")
    for r in qualified:
        print(f" - {r['name']} ({r['company']}) — Score: {r['overall_score']}")

    print(f"\n❌ Rejected: {len(rejected)}")
    for r in rejected:
        print(f" - {r['name']} ({r['company']}) — Score: {r['overall_score']}")

    print("\n" + "="*50)
    print("Full JSON output:")
    print(json.dumps(results, indent=2))
Here's what the console output looks like when you run it:
example output

Naples AI — Lead Qualifier Agent
Running qualification on 5 test leads...

==================================================
Processing lead 1/5: Maria Gonzalez
==================================================

Decision: QUALIFIED
Score: 92
Summary: Maria has a valid email, valid phone, and works in real estate — a top-tier fit for Naples AI automation services.

==================================================
Processing lead 2/5: James Whitfield
==================================================

Decision: REJECTED
Score: 15
Summary: Email address is invalid, which disqualifies this lead regardless of industry fit.

==================================================
Processing lead 3/5: Tony Russo
==================================================

Decision: REJECTED
Score: 20
Summary: Phone number could not be validated — too few digits. Cannot follow up without a working contact number.

==================================================
Processing lead 4/5: Sandra Park
==================================================

Decision: QUALIFIED
Score: 78
Summary: Valid contact info and restaurant industry score of 80 make Sandra a solid prospect for AI automation.

==================================================
Processing lead 5/5: Derek Olson
==================================================

Decision: REJECTED
Score: 28
Summary: Hobby shop is a low-fit industry with a score of 30, and the budget is below our minimum threshold.


==================================================
FINAL RESULTS SUMMARY
==================================================

✅ Qualified: 2
 - Maria Gonzalez (Naples Properties Group) — Score: 92
 - Sandra Park (Bonita Bistro) — Score: 78

❌ Rejected: 3
 - James Whitfield (Gulf Coast Medical) — Score: 15
 - Tony Russo (Russo Auto Group) — Score: 20
 - Derek Olson (Olson Craft Supply) — Score: 28
How It Works
Here's what's actually happening under the hood, in plain English. You send Claude a lead with a system prompt that tells it to call three specific tools before making a decision. Claude reads the lead details and starts sending back tool_use blocks — essentially saying "call this function with these arguments."
Your agent loop intercepts those requests, runs the actual Python functions, and sends the results back as tool_result messages. Claude reads the results and either asks for more tools or writes its final answer.
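The routing step can be tried without any network call. The sketch below uses a stand-in dataclass in place of the SDK's tool_use block (assumed fields: type, id, name, input) and a placeholder tool; FakeToolUseBlock, echo_upper, and to_tool_result are hypothetical names used only for this illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeToolUseBlock:
    """Stand-in for the SDK's tool_use content block."""
    id: str
    name: str
    input: dict
    type: str = "tool_use"


def echo_upper(text: str) -> dict:
    """Placeholder tool used only for this illustration."""
    return {"text": text.upper()}


REGISTRY = {"echo_upper": echo_upper}


def to_tool_result(block: FakeToolUseBlock) -> dict:
    """Run the requested tool and wrap its output in a tool_result block."""
    func = REGISTRY.get(block.name)
    if func is None:
        payload = {"error": f"Unknown tool: {block.name}"}
    else:
        payload = func(**block.input)
    return {
        "type": "tool_result",
        "tool_use_id": block.id,  # must echo the id from the tool_use block
        "content": json.dumps(payload),
    }


block = FakeToolUseBlock(id="toolu_demo_1", name="echo_upper", input={"text": "hi"})
print(to_tool_result(block))
```

The important detail is the id round trip: the tool_use_id in the result is how Claude matches each answer to the call it made.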
The whole thing is stateless from your server's perspective — each call to qualify_lead() is self-contained. The conversation history lives in the messages list for the duration of one lead's qualification and then gets discarded. That makes this easy to scale horizontally.
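Because each qualification is independent, leads can also be processed concurrently within one process. This is a minimal sketch using a thread pool; qualify_stub is a placeholder for agent.qualify_lead, and the max_workers value is an assumption that would need tuning against your API rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def qualify_stub(lead: dict) -> dict:
    """Placeholder for agent.qualify_lead; a real call would hit the API."""
    return {"name": lead["name"], "decision": "UNKNOWN"}


def qualify_many(leads: list[dict], max_workers: int = 4) -> list[dict]:
    """Qualify leads concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(qualify_stub, leads))


leads = [{"name": "A"}, {"name": "B"}, {"name": "C"}]
print([r["name"] for r in qualify_many(leads)])  # ['A', 'B', 'C']
```

Threads suit this workload because each call spends most of its time waiting on the network rather than using the CPU.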
You could describe the lead in a prompt and ask Claude to guess whether the email is valid. But Claude doesn't actually run code — it pattern-matches from training. Tool use lets you run real validation logic and hand Claude verified facts to reason about. That's the difference between a demo and something you can trust in production.
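Since the tool results are plain dicts, the hard rejection rules from the system prompt can also be re-checked in Python as a guardrail on Claude's decision. This is a sketch under that assumption; hard_reject_reasons is a hypothetical helper, not part of the tutorial's code.

```python
def hard_reject_reasons(email_result: dict, phone_result: dict, industry_result: dict) -> list[str]:
    """Hypothetical cross-check: apply the prompt's rejection rules in plain Python."""
    reasons = []
    if not email_result["is_valid"]:
        reasons.append("invalid email")
    if not phone_result["is_valid"]:
        reasons.append("invalid phone")
    if industry_result["fit_score"] < 40:
        reasons.append("industry fit below 40")
    return reasons


reasons = hard_reject_reasons(
    {"is_valid": True},
    {"is_valid": False},
    {"fit_score": 30},
)
print(reasons)  # ['invalid phone', 'industry fit below 40']
```

If this list is non-empty while the parsed decision says QUALIFIED, the mismatch is worth logging or routing to a human.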
Common Errors and Fixes
Error 1: AuthenticationError — Invalid API Key
Exact error: anthropic.AuthenticationError: 401 {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}
This almost always means your