How We Built a Voice AI System That Handles Real Healthcare Calls
Handling real healthcare calls with voice AI is a different problem from building a chatbot: the conversation happens live, so every delay or misunderstanding is felt immediately. When a patient calls to reschedule an appointment, they expect a natural conversation, not a phone tree.
The Problem We Solved
Healthcare offices lose over 30% of incoming calls to hold times. Every missed call is a missed appointment, which means lost revenue and frustrated patients. Staff spend hours on repetitive scheduling tasks instead of patient care.
MDFit Nova-Sonic changes that equation. It answers every call, understands natural speech, and manages appointments in real time.
Architecture Overview
The system is built on Amazon Nova-Sonic for real-time voice processing, with Twilio handling the telephony layer. Here is the high-level flow:
- Patient calls the office number
- Twilio routes the call to our WebSocket endpoint
- Amazon Nova-Sonic processes speech in real time
- Our agent system determines intent and takes action
- The patient hears a natural response within 2 seconds
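To make the middle of that flow concrete, here is a minimal sketch of how a WebSocket endpoint might interpret incoming telephony messages. It assumes the audio arrives as Twilio Media Streams JSON messages (with "start", "media", and "stop" events and a base64-encoded audio payload); `CallSession` and `handle_stream_message` are hypothetical names for illustration, not the production code.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CallSession:
    """Per-call state. The fields are illustrative, not a production schema."""
    stream_sid: Optional[str] = None
    audio_chunks: list = field(default_factory=list)
    active: bool = True


def handle_stream_message(session: CallSession, raw: str) -> Optional[bytes]:
    """Apply one JSON message from the telephony WebSocket to the session.

    Returns the decoded audio bytes for "media" events, otherwise None.
    """
    message = json.loads(raw)
    event = message.get("event")
    if event == "start":
        # The start event carries the stream identifier for this call.
        session.stream_sid = message["start"]["streamSid"]
    elif event == "media":
        audio = base64.b64decode(message["media"]["payload"])
        # Assumption: a real system would forward this chunk to the speech
        # model immediately instead of buffering it in memory.
        session.audio_chunks.append(audio)
        return audio
    elif event == "stop":
        session.active = False
    return None
```

Keeping the message handling as a plain function, separate from the WebSocket server itself, makes it easy to unit test with recorded messages.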
The 5-Agent System
We use five specialized AI agents, each handling a specific domain:
- Scheduling Agent: Books new appointments based on provider availability
- Rescheduling Agent: Handles date/time changes with conflict detection
- Cancellation Agent: Processes cancellations with confirmation
- Information Agent: Answers questions about office hours, locations, and providers
- Escalation Agent: Routes complex cases to human staff
Each agent has its own prompt, tool set, and memory context. The orchestrator routes conversations to the right agent based on intent recognition.
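The routing idea can be sketched in a few lines. This is an illustration under stated assumptions, not the production orchestrator: the keyword matcher stands in for the real intent recognizer, and `Agent`, `classify_intent`, `Orchestrator`, and `build_default_orchestrator` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """One specialized agent with its own prompt, tools, and memory."""
    name: str
    system_prompt: str
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)

    def handle(self, utterance: str) -> str:
        # Each agent keeps its own conversation memory.
        self.memory.append(utterance)
        return f"[{self.name}] handling: {utterance}"


# Assumption: keyword matching is a placeholder for real intent recognition.
# "rescheduling" is listed first because "reschedule" contains "schedule".
INTENT_KEYWORDS = {
    "rescheduling": ("reschedule", "move my appointment", "change my appointment"),
    "cancellation": ("cancel",),
    "scheduling": ("book", "schedule", "new appointment"),
    "information": ("hours", "location", "address", "which doctor"),
}


def classify_intent(utterance: str) -> str:
    """Return the first matching intent, or "escalation" when nothing matches."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "escalation"


class Orchestrator:
    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents

    def route(self, utterance: str) -> str:
        """Send the utterance to the agent that owns the detected intent."""
        return self.agents[classify_intent(utterance)].handle(utterance)


def build_default_orchestrator() -> Orchestrator:
    names = ("scheduling", "rescheduling", "cancellation", "information", "escalation")
    return Orchestrator(
        {n: Agent(name=n, system_prompt=f"You handle {n} requests.") for n in names}
    )
```

With this sketch, `classify_intent("I need to reschedule my appointment")` returns `"rescheduling"`, and anything the matcher does not recognize falls through to the escalation agent, which mirrors the role of human handoff described above.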
Key Technical Decisions
Real-Time Streaming
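We chose WebSocket streaming over REST for audio because latency matters: a persistent connection lets audio flow in both directions as it is produced, instead of waiting for a complete request and response per utterance. A 5-second delay in a phone conversation feels broken. Our target was under 2 seconds end-to-end, and we consistently hit it.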
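A latency target like this is easier to hold when each conversational turn is timed by stage. The following is a minimal sketch of such a timer, assuming a 2-second budget as stated above; `TurnTimer` and the stage names in the usage example are hypothetical and not taken from the production system.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator

BUDGET_SECONDS = 2.0  # end-to-end target described in the text


class TurnTimer:
    """Accumulates per-stage durations for one conversational turn."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock  # injectable so tests can use a fake clock
        self.stages: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        started = self._clock()
        try:
            yield
        finally:
            elapsed = self._clock() - started
            self.stages[name] = self.stages.get(name, 0.0) + elapsed

    @property
    def total(self) -> float:
        return sum(self.stages.values())

    def within_budget(self, budget: float = BUDGET_SECONDS) -> bool:
        return self.total < budget


if __name__ == "__main__":
    timer = TurnTimer()
    with timer.stage("speech_in"):
        time.sleep(0.01)  # stand-in for real work
    with timer.stage("agent"):
        time.sleep(0.01)
    print(timer.stages, timer.within_budget())
```

Logging the per-stage breakdown, not just the total, shows which part of the pipeline to optimize when a turn exceeds the budget.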
HIPAA-Aware Architecture
All AWS services we use are HIPAA-eligible. PHI is encrypted at rest and in transit. We maintain audit logs for every interaction. A Business Associate Agreement (BAA) is available for healthcare clients.
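As an illustration of per-interaction audit logging, here is a minimal sketch that writes one structured record per agent action while keeping the caller's raw phone number out of the log. Everything here is an assumption for illustration: the field names, the `audit_record` and `pseudonymize` helpers, and the choice of a keyed hash (HMAC-SHA256) are not taken from the production system, and the sketch makes no claim about what any specific regulation requires.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Optional


def pseudonymize(value: str, secret: bytes) -> str:
    """Keyed hash so the raw identifier never appears in the log."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_record(
    call_id: str,
    caller_number: str,
    agent: str,
    action: str,
    secret: bytes,
    now: Optional[datetime] = None,
) -> str:
    """Build one JSON audit line for a single agent action."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "timestamp": timestamp,
        "call_id": call_id,
        "caller": pseudonymize(caller_number, secret),
        "agent": agent,
        "action": action,
    }
    # sort_keys keeps the line format stable for downstream parsing.
    return json.dumps(record, sort_keys=True)
```

In a real deployment the secret would come from a managed secret store, and the resulting lines would go to an append-only, encrypted log destination.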
Multi-Tenant Design
The system supports multiple healthcare practices, each with its own providers, schedules, and configurations. This is enterprise software, deployed at Rothman Orthopaedic with real patients calling (844) 699-2336.
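One common way to implement this kind of tenancy is to resolve the practice from the number the patient dialed and load that practice's configuration before any agent runs. The sketch below assumes that approach; `PracticeConfig`, `resolve_practice`, the placeholder practices, and the fictional 555 numbers are all hypothetical, and a real system would read this data from a database instead of a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PracticeConfig:
    """Per-practice settings. The fields are illustrative."""
    practice_id: str
    display_name: str
    timezone: str
    providers: Tuple[str, ...]


# Hypothetical tenants keyed by dialed number in E.164 format.
PRACTICES_BY_NUMBER: Dict[str, PracticeConfig] = {
    "+15555550101": PracticeConfig(
        "practice-a", "Practice A", "America/New_York", ("Dr. Adams", "Dr. Baker")
    ),
    "+15555550102": PracticeConfig(
        "practice-b", "Practice B", "America/Chicago", ("Dr. Clark",)
    ),
}


def resolve_practice(dialed_number: str) -> PracticeConfig:
    """Look up the practice that owns the dialed number."""
    try:
        return PRACTICES_BY_NUMBER[dialed_number]
    except KeyError:
        raise LookupError(f"No practice configured for {dialed_number}") from None
```

Resolving the tenant once at the start of the call, and passing the resulting configuration to every agent, keeps one practice's providers and schedules from leaking into another practice's conversations.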
Results
- Under 2-second response latency
- 95%+ intent recognition accuracy
- Handles 100+ concurrent calls
- Production-deployed and handling real patient calls
Voice AI in healthcare is not a research project. It is production software serving real patients today.