CAP_01
Customer-support agent
Tool-calling agent over docs + ticket history.
Grounded retrieval-augmented agent that resolves tier-1 and tier-2 support tickets, hands off cleanly when uncertain, and writes back to your CRM. Built around a strict refusal policy and a continuously-updated eval set.
Stack
Claude 3.5 SonnetGPT-4o (fallback)pgvectorLangGraphLangfuseTemporal
Architecture sketch
UserInbound message via Slack / web / email
RouterClassifier picks specialist + retrieval mode
RetrieverHybrid BM25 + dense, with metadata filters
PlannerDecomposes into tool calls, max depth 5
ToolsCRM lookup · order status · refund · escalate
VerifierChecks grounding, then commits action
Production metrics
Eval accuracy
94.6%
on golden set, 240 fixtures
p95 latency
812ms
end-to-end
Cost / req
$0.014
incl. retrieval
Auto-resolve
67%
no human required
Escalation
96%
correctly routed
Refusal rate
2.1%
when confidence low
Numbers reflect production runs from founding-team prior work, replicated in our reference stack. Replicable in your account under the audit.
// stack
Technology we ship on.
Models
→Claude (Anthropic)
→GPT-4o / GPT-5
→Llama 3
→Fine-tunes
Retrieval
→pgvector
→Qdrant
→Postgres FTS
→BM25
Orchestration
→LangGraph
→Temporal
→Inngest
→Trigger.dev
Observability
→Langfuse
→OpenTelemetry
→Sentry
→Grafana
Frontend
→Next.js 14+
→React Native
→Tailwind
→shadcn/ui
Infra
→AWS
→Cloudflare
→Vercel
→Modal
→Fly.io
Want one of these in your stack?
Every reference implementation can be adapted, extended, or replaced with something better — with your data and your constraints.