If you're trying to build AI agents that do real work — not just answer questions — lead scoring is one of the best places to start. Real estate agents and brokerages spend hours every week manually reviewing leads, guessing who's serious and who's just browsing. This tutorial shows you exactly how to automate that with a Python agent powered by the Claude API.
By the end, you'll have a working agent that takes raw lead data, reasons through qualification criteria using tools, and returns a structured score with written reasoning. This is the same pattern we use at Naples AI when building custom AI solutions for real estate clients in Southwest Florida.
What You'll Build
You'll build a Python-based AI lead scoring agent that accepts real estate lead information — things like budget, timeline, pre-approval status, and engagement history — and returns a 1–10 score with a written explanation. The agent uses Claude's tool use feature to call structured scoring functions rather than just generating free-form text.
The result is a production-ready script you can plug into a CRM webhook, a Zapier pipeline, or a backend API. Setup takes under 30 minutes.
Prerequisites
- Python 3.9 or higher installed
- An Anthropic API key (get one at console.anthropic.com)
- Basic familiarity with Python functions and dictionaries
- anthropic Python SDK installed (pip install anthropic)
- A terminal or code editor like VS Code
The examples use claude-sonnet-4-6 as the model. You can copy each block in order and assemble the final script, or jump to Step 3 for the complete agentic loop that ties everything together.
Step 1: Set Up Your Claude API Environment and Dependencies
First things first — let's get the environment wired up. You need the Anthropic SDK installed and your API key loaded from an environment variable so you're not hardcoding credentials anywhere.
Create a file called lead_scoring_agent.py and start with this initialization block. This is everything you need before writing a single line of agent logic.
import os
import json
from anthropic import Anthropic

# Load the API key from your environment
# Run: export ANTHROPIC_API_KEY="your-key-here" before executing
client = Anthropic()
MODEL = "claude-sonnet-4-6"

# System prompt gives the agent its persona and task context
SYSTEM_PROMPT = """You are a real estate lead qualification specialist.
Your job is to evaluate incoming leads and score them based on buying intent,
financial readiness, and timeline. Always use the available tools to record
your scoring decisions. Be precise and back every score with specific reasoning."""
The Anthropic() client automatically picks up your ANTHROPIC_API_KEY environment variable, so you don't need to pass it explicitly. If the key isn't set, the first API request fails with an authentication-related error instead of returning a result, which is good fail-fast behavior.
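To surface a missing key before any request is sent, a small pre-flight check can run at the top of the script. This is a minimal sketch: `require_api_key` is a hypothetical helper name for this tutorial, not part of the Anthropic SDK.

```python
import os


def require_api_key(environ=os.environ) -> str:
    """Return the Anthropic API key, or raise a clear error if it is missing.

    Hypothetical helper for this tutorial; not an Anthropic SDK function.
    """
    key = environ.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. "
            'Run: export ANTHROPIC_API_KEY="your-key-here"'
        )
    return key
```

Calling `require_api_key()` just before `client = Anthropic()` turns a missing key into an error message that names the exact variable to set.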
The system prompt matters more than most tutorials admit. Telling Claude it's a "lead qualification specialist" shapes how it reasons through ambiguous cases. Generic prompts produce generic results.
Step 2: Define Lead Scoring Tool Schemas and Functions
This is where Claude's tool use feature does the heavy lifting. Instead of asking Claude to write a score in plain text, you define a structured tool with a JSON schema. When Claude calls the tool, it fills in the specific fields you defined (score, tier, reasoning) instead of producing free-form text.
This is what makes the output machine-readable and reliable. You can parse it, store it in a database, or feed it into another system without wrestling with unstructured text.
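By default, Claude decides for itself whether to call a tool. The Messages API also accepts a `tool_choice` parameter, and a value of `{"type": "tool", "name": ...}` requires a call to the named tool. The sketch below only builds the request arguments, so it runs without an API key; `build_scoring_request` and its `force_tool` flag are hypothetical names, and the one-entry tool list is a stand-in for the full TOOLS definition in this step.

```python
def build_scoring_request(lead_message: str, tools: list, force_tool: bool = True) -> dict:
    """Build keyword arguments for client.messages.create().

    Hypothetical helper: the function name and force_tool flag are
    assumptions for this sketch. The model name matches the tutorial.
    """
    request = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "tools": tools,
        "messages": [{"role": "user", "content": lead_message}],
    }
    if force_tool:
        # Require a call to score_lead instead of leaving it to the model
        request["tool_choice"] = {"type": "tool", "name": "score_lead"}
    return request


# Minimal stand-in for the TOOLS list defined in this step
tools = [{"name": "score_lead", "description": "Score a lead.",
          "input_schema": {"type": "object", "properties": {}}}]
request = build_scoring_request("Please score this lead: ...", tools)
```

You would pass the result as `client.messages.create(**request)`. One caveat: if the tool is forced on every request, each response ends in another tool call, so read the `tool_use` block from the first response instead of looping until `end_turn`.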
lead_scoring_agent.py: Tool Definitions
import os
import json
from anthropic import Anthropic

client = Anthropic()
MODEL = "claude-sonnet-4-6"
SYSTEM_PROMPT = """You are a real estate lead qualification specialist.
Your job is to evaluate incoming leads and score them based on buying intent,
financial readiness, and timeline. Always use the available tools to record
your scoring decisions. Be precise and back every score with specific reasoning."""

# Tool schema tells Claude exactly what fields to populate
TOOLS = [
    {
        "name": "score_lead",
        "description": (
            "Score a real estate lead based on their profile information. "
            "Call this tool once after analyzing the lead. "
            "Return a numeric score, a qualification tier, and detailed reasoning."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "lead_name": {
                    "type": "string",
                    "description": "Full name of the lead"
                },
                "score": {
                    "type": "integer",
                    "description": "Lead quality score from 1 (cold) to 10 (hot)",
                    "minimum": 1,
                    "maximum": 10
                },
                "tier": {
                    "type": "string",
                    "enum": ["hot", "warm", "cold"],
                    "description": "Qualification tier based on the score"
                },
                "reasoning": {
                    "type": "string",
                    "description": "Detailed explanation of why this score was assigned"
                },
                "recommended_action": {
                    "type": "string",
                    "description": "Next step the agent should take with this lead"
                },
                "disqualifying_factors": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Any red flags or reasons the lead might not convert"
                }
            },
            "required": ["lead_name", "score", "tier", "reasoning", "recommended_action"]
        }
    }
]


def execute_score_lead(tool_input: dict) -> dict:
    """
    Execute the lead scoring tool and return a structured result.
    In production, you'd write this to a CRM or database here.
    """
    result = {
        "status": "scored",
        "lead_name": tool_input["lead_name"],
        "score": tool_input["score"],
        "tier": tool_input["tier"],
        "reasoning": tool_input["reasoning"],
        "recommended_action": tool_input["recommended_action"],
        "disqualifying_factors": tool_input.get("disqualifying_factors", [])
    }
    return result
The enum constraint on the tier field is important. Without it, Claude might return "high priority" one time and "warm prospect" the next. Enums lock in consistent values so your downstream systems don't break.
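The schema steers the model, but it is still worth checking the tool input before it reaches downstream systems. The sketch below repeats the schema's constraints in plain Python, on the assumption that you would rather catch a malformed value locally than trust every response. `validate_score_input` is a hypothetical helper, not part of the Anthropic SDK.

```python
ALLOWED_TIERS = ("hot", "warm", "cold")
REQUIRED_FIELDS = ("lead_name", "score", "tier", "reasoning", "recommended_action")


def validate_score_input(tool_input: dict) -> list:
    """Return a list of problems found in a score_lead tool input.

    An empty list means the input matches the schema's constraints.
    Hypothetical helper for this tutorial.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in tool_input:
            problems.append(f"missing field: {field}")
    score = tool_input.get("score")
    if score is not None:
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(score, int) or isinstance(score, bool) or not 1 <= score <= 10:
            problems.append(f"score out of range: {score!r}")
    tier = tool_input.get("tier")
    if tier is not None and tier not in ALLOWED_TIERS:
        problems.append(f"unexpected tier: {tier!r}")
    return problems
```

A natural place to call it is at the top of `execute_score_lead`, returning the problems to Claude as an error result so it can correct the call on the next turn.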
The execute_score_lead function is where you'd plug in a real CRM write, a webhook call, or a database insert. Right now it just returns the structured result — which is all we need for testing and demonstration.
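As one possible shape for that database insert, here is a minimal sketch using Python's built-in sqlite3 module. The `lead_scores` table, its columns, and the `save_score` helper are assumptions for illustration; a real CRM integration would look different.

```python
import json
import sqlite3


def save_score(conn: sqlite3.Connection, result: dict) -> None:
    """Insert one scoring result into a lead_scores table (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS lead_scores (
               lead_name TEXT, score INTEGER, tier TEXT,
               reasoning TEXT, recommended_action TEXT,
               disqualifying_factors TEXT)"""
    )
    conn.execute(
        "INSERT INTO lead_scores VALUES (?, ?, ?, ?, ?, ?)",
        (
            result["lead_name"],
            result["score"],
            result["tier"],
            result["reasoning"],
            result["recommended_action"],
            # SQLite has no list type, so store the list as JSON text
            json.dumps(result.get("disqualifying_factors", [])),
        ),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
save_score(conn, {
    "lead_name": "Maria Gonzalez", "score": 9, "tier": "hot",
    "reasoning": "Pre-approved with a 60-day timeline.",
    "recommended_action": "Schedule a showing.",
})
rows = conn.execute("SELECT lead_name, score, tier FROM lead_scores").fetchall()
```

Parameterized queries (the `?` placeholders) matter here because the reasoning text is model-generated and should never be spliced into SQL directly.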
Step 3: Build the Agent with Tool Use and Response Handling
Now we write the agentic loop. This is the part that actually drives the agent — sending messages, detecting when Claude calls a tool, executing that tool, and feeding the result back into the conversation until we get a final response.
The loop below handles the full Claude tool use lifecycle. It's the same pattern for any agent you build, not just this one.
lead_scoring_agent.py: Complete Agent with Agentic Loop
import os
import json
from anthropic import Anthropic

client = Anthropic()
MODEL = "claude-sonnet-4-6"
SYSTEM_PROMPT = """You are a real estate lead qualification specialist.
Your job is to evaluate incoming leads and score them based on buying intent,
financial readiness, and timeline. Always use the available tools to record
your scoring decisions. Be precise and back every score with specific reasoning."""

TOOLS = [
    {
        "name": "score_lead",
        "description": (
            "Score a real estate lead based on their profile information. "
            "Call this tool once after analyzing the lead. "
            "Return a numeric score, a qualification tier, and detailed reasoning."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "lead_name": {
                    "type": "string",
                    "description": "Full name of the lead"
                },
                "score": {
                    "type": "integer",
                    "description": "Lead quality score from 1 (cold) to 10 (hot)",
                    "minimum": 1,
                    "maximum": 10
                },
                "tier": {
                    "type": "string",
                    "enum": ["hot", "warm", "cold"],
                    "description": "Qualification tier based on the score"
                },
                "reasoning": {
                    "type": "string",
                    "description": "Detailed explanation of why this score was assigned"
                },
                "recommended_action": {
                    "type": "string",
                    "description": "Next step the agent should take with this lead"
                },
                "disqualifying_factors": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Any red flags or reasons the lead might not convert"
                }
            },
            "required": ["lead_name", "score", "tier", "reasoning", "recommended_action"]
        }
    }
]


def execute_score_lead(tool_input: dict) -> dict:
    """Execute the lead scoring tool and return a structured result."""
    result = {
        "status": "scored",
        "lead_name": tool_input["lead_name"],
        "score": tool_input["score"],
        "tier": tool_input["tier"],
        "reasoning": tool_input["reasoning"],
        "recommended_action": tool_input["recommended_action"],
        "disqualifying_factors": tool_input.get("disqualifying_factors", [])
    }
    return result


def run_lead_scoring_agent(lead_data: dict) -> dict:
    """
    Run the full agentic loop to score a single lead.
    Returns the structured scoring result from the tool call.
    """
    # Format lead data into a natural language message for the agent
    lead_message = f"""
Please evaluate and score this real estate lead:
Name: {lead_data.get('name', 'Unknown')}
Budget: {lead_data.get('budget', 'Not provided')}
Timeline: {lead_data.get('timeline', 'Not provided')}
Pre-approved: {lead_data.get('pre_approved', 'Unknown')}
Property type: {lead_data.get('property_type', 'Not specified')}
Location interest: {lead_data.get('location', 'Not specified')}
Contact method: {lead_data.get('contact_method', 'Not specified')}
Engagement notes: {lead_data.get('engagement_notes', 'None')}
Use the score_lead tool to record your qualification decision.
"""
    messages = [{"role": "user", "content": lead_message}]
    final_result = None

    # Agentic loop: runs until Claude stops calling tools
    while True:
        response = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages
        )
        # Append Claude's response to the conversation history
        messages.append({"role": "assistant", "content": response.content})

        # Check if Claude is done (no more tool calls)
        if response.stop_reason == "end_turn":
            break

        # Handle tool use blocks in the response
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    tool_name = block.name
                    tool_input = block.input
                    tool_use_id = block.id
                    # Route to the correct tool function
                    if tool_name == "score_lead":
                        result = execute_score_lead(tool_input)
                        final_result = result  # Capture the scoring result
                    else:
                        result = {"error": f"Unknown tool: {tool_name}"}
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_use_id,
                        "content": json.dumps(result)
                    })
            # Feed tool results back to Claude to continue the conversation
            messages.append({"role": "user", "content": tool_results})
        else:
            # Unexpected stop reason: exit the loop safely
            break

    return final_result if final_result else {"error": "Agent did not produce a score"}


def score_multiple_leads(leads: list) -> list:
    """Score a batch of leads and return all results."""
    results = []
    for lead in leads:
        print(f"Scoring lead: {lead.get('name', 'Unknown')}...")
        result = run_lead_scoring_agent(lead)
        results.append(result)
    return results
The while True loop handles multi-step tool use automatically: it keeps sending tool results back to Claude until Claude stops calling tools. If Claude responds with stop_reason "end_turn" without calling the tool, the loop exits cleanly and the error fallback catches it.
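The loop has no upper bound on the number of turns, so a model that keeps requesting tools would keep it running. Below is a sketch of the same pattern with a turn limit. To make it runnable without an API key, it takes a `create` callable in place of `client.messages.create` and is exercised with a fake model. `run_tool_loop`, `max_turns`, and `fake_create` are hypothetical names, and the fake responses mimic only the attributes the tutorial's loop reads (`stop_reason`, `content`, and each block's `type`, `name`, `input`, `id`).

```python
import json
from types import SimpleNamespace


def run_tool_loop(create, messages, handlers, max_turns=5):
    """Run a tool-use loop with a turn limit.

    `create` is any callable that takes the message list and returns an
    object with `stop_reason` and `content`, like client.messages.create.
    `handlers` maps tool names to Python functions.
    """
    last_result = None
    for _ in range(max_turns):
        response = create(messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            break  # end_turn or anything unexpected: stop looping
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            handler = handlers.get(block.name)
            if handler:
                result = handler(block.input)
                last_result = result
            else:
                result = {"error": f"Unknown tool: {block.name}"}
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result),
            })
        messages.append({"role": "user", "content": tool_results})
    return last_result


# Fake model: calls score_lead on the first turn, then finishes
def fake_create(messages):
    if len(messages) == 1:
        block = SimpleNamespace(type="tool_use", name="score_lead",
                                input={"score": 7}, id="toolu_demo")
        return SimpleNamespace(stop_reason="tool_use", content=[block])
    return SimpleNamespace(stop_reason="end_turn", content=[])


history = [{"role": "user", "content": "Score this lead."}]
outcome = run_tool_loop(fake_create, history,
                        {"score_lead": lambda data: {"status": "scored", **data}})
```

Passing the model call in as a parameter also makes the loop easy to unit test, since no network access is needed to check the message bookkeeping.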
Step 4: Test the Agent with Sample Real Estate Leads
Let's run the agent against three realistic leads — a hot buyer, a warm prospect, and a cold tire-kicker. This shows you what the output actually looks like and confirms everything is wired up correctly.
Add this block to the bottom of your script and run it with python lead_scoring_agent.py.
# --- Sample leads for testing ---
SAMPLE_LEADS = [
    {
        "name": "Maria Gonzalez",
        "budget": "$850,000 - $1,200,000",
        "timeline": "Wants to close within 60 days",
        "pre_approved": "Yes, pre-approved for $1.1M",
        "property_type": "Single-family home, 4BR+",
        "location": "Naples, FL — Port Royal or Aqualane Shores preferred",
        "contact_method": "Called twice, responded to all emails within the hour",
        "engagement_notes": "Attended two open houses, asked detailed questions about HOA and flood zones"
    },
    {
        "name": "Derek Thompson",
        "budget": "$400,000 - $550,000",
        "timeline": "Maybe in the next 6-12 months",
        "pre_approved": "Not yet, planning to start the process soon",
        "property_type": "Condo or townhouse",
        "location": "Naples or Bonita Springs",
        "contact_method": "Submitted a contact form on the website",
        "engagement_notes": "Downloaded a buyer's guide, opened 3 listing emails but hasn't clicked"
    },
    {
        "name": "Steve Kaufman",
        "budget": "Not sure yet",
        "timeline": "Just looking for now, no rush",
        "pre_approved": "No",
        "property_type": "Anything in a good neighborhood",
        "location": "Somewhere in Florida",
        "contact_method": "Chatted on the website for 2 minutes",
        "engagement_notes": "Said he's exploring options, not ready to talk to an agent"
    }
]

if __name__ == "__main__":
    print("=" * 60)
    print("Naples AI — Real Estate Lead Scoring Agent")
    print("=" * 60)
    results = score_multiple_leads(SAMPLE_LEADS)
    print("\n--- SCORING RESULTS ---\n")
    for result in results:
        print(json.dumps(result, indent=2))
        print("-" * 40)
Here's representative output from running this script. Exact scores and wording will vary between runs because the model generates them:
Terminal Output: Scored Leads with Reasoning
============================================================
Naples AI — Real Estate Lead Scoring Agent
============================================================
Scoring lead: Maria Gonzalez...
Scoring lead: Derek Thompson...
Scoring lead: Steve Kaufman...
--- SCORING RESULTS ---
{
  "status": "scored",
  "lead_name": "Maria Gonzalez",
  "score": 9,
  "tier": "hot",
  "reasoning": "Maria is an exceptionally strong lead. She is pre-approved for $1.1M which aligns with her stated budget, has a firm 60-day close timeline, has attended two open houses, and is highly responsive across all contact channels. Her specific neighborhood preferences (Port Royal, Aqualane Shores) indicate serious buyer intent rather than casual browsing. The only reason this is a 9 rather than a 10 is that no offer has been made yet.",
  "recommended_action": "Call within the next 2 hours to schedule a private showing. Prepare a curated list of available listings in Port Royal and Aqualane Shores under $1.2