AI Voice Booking Agent
Major Private Hospital Group (Middle East)
An autonomous voice AI handling end-to-end inbound appointment booking — patient identification, symptom triage, doctor and slot selection, confirmation, and execution — 24/7, in Arabic and English, with no human in the loop.

- Availability: 24/7
- Languages: AR · EN
- Workflow: 7 stages
- Hold time: 0 sec
The problem
A major private hospital group in the Middle East faced a critical operational bottleneck: its appointment booking relied heavily on human call-center agents handling thousands of phone calls a day. The work was high-volume, multilingual (Arabic and English), restricted to business hours, and inconsistent, because every patient's experience depended on which agent picked up.
Patients didn't always know which department they needed. They'd describe symptoms ("I have chest pain") and rely on a receptionist's judgment. Manual data entry introduced errors. Out-of-hours callers had to leave messages or retry.
The approach
We built an AI agent that handles the full appointment booking workflow over a phone call — autonomously, 24/7. Three design principles drove the architecture:
- Conversational, not robotic — natural dialogue, no menu trees.
- State-machine precision — the underlying booking workflow is governed by a strict, deterministic state machine. Every booking follows the correct sequence; no steps get skipped.
- Fail-safe over fast — guardrails on every stage. The system never gives medical advice. It always confirms details before executing a booking.
The conversation flows through two phases:
Phase 1 — Patient data collection. Greet, identify by phone number, look up in the hospital system. Returning patients are greeted by name; new patients are asked.
Phase 2 — Appointment booking. A 7-stage workflow: department selection → date/time → doctor → review → explicit verbal confirmation → booking execution via the hospital API → completion with appointment ID.
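The seven-stage sequence can be sketched as a small deterministic transition function. This is a minimal illustration rather than the production LangGraph graph: the stage identifiers, the `BookingState` shape, and `nextStage` are hypothetical names chosen for the example.

```typescript
// Hypothetical stage identifiers, mirroring the seven stages listed above.
const STAGES = [
  "department",
  "dateTime",
  "doctor",
  "review",
  "confirmation",
  "execution",
  "completion",
] as const;

type Stage = (typeof STAGES)[number];

interface BookingState {
  stage: Stage;
  // Assumed flag: set only after the caller gives an explicit verbal "yes".
  confirmed: boolean;
}

// Advance by exactly one stage. Stages cannot be skipped, and the
// booking cannot reach execution until the caller has confirmed.
function nextStage(state: BookingState): BookingState {
  const index = STAGES.indexOf(state.stage);
  if (index === STAGES.length - 1) {
    return state; // "completion" is terminal
  }
  if (state.stage === "confirmation" && !state.confirmed) {
    return state; // guardrail: wait for explicit confirmation
  }
  return { ...state, stage: STAGES[index + 1] };
}
```

Because the only way forward is the next entry in `STAGES`, a conversation that wanders off topic can stall at a stage but cannot jump past one.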
Symptom-to-department routing is handled by the conversational LLM rather than a menu: "chest pain" routes to cardiology, "skin rash" to dermatology, and a child's fever to pediatrics.
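One way to keep free-form LLM routing inside the "fail-safe" principle is to accept the model's answer only when it names a known department. The sketch below assumes such a validation step; the case study does not describe one, and the department list and `parseDepartment` are hypothetical.

```typescript
// Hypothetical closed set of departments; a real deployment would load
// this from the hospital system instead of hard-coding it.
const DEPARTMENTS = ["cardiology", "dermatology", "pediatrics"] as const;

type Department = (typeof DEPARTMENTS)[number];

// Normalize the model's raw text and accept it only if it matches a
// known department. Returning null lets the agent ask a follow-up
// question instead of booking into a department that does not exist.
function parseDepartment(modelOutput: string): Department | null {
  const cleaned = modelOutput.trim().toLowerCase();
  const match = DEPARTMENTS.find((d) => d === cleaned);
  return match ?? null;
}
```

For example, `parseDepartment(" Cardiology ")` yields `"cardiology"`, while any reply outside the list yields `null`.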
Business impact
- 24/7 availability — patients book at any hour without staffing constraints.
- Zero hold time — the AI answers immediately and starts the conversation.
- Consistent quality — every caller gets the same accurate, professional experience regardless of time or volume.
- Intelligent triage — symptom-based routing eliminates the guesswork for patients who don't know which specialist they need.
- Concurrent scale — adding capacity is scaling servers, not hiring and training.
A post-call analytics pipeline runs automatically on every conversation: sentiment analysis, category classification, problematic-call flagging, and PostgreSQL-backed dashboarding for hospital management.
Architecture highlights
| Layer | Technology |
|---|---|
| Application framework | NestJS 11 + TypeScript 5.7 |
| Conversation state machine | LangGraph (PostgreSQL checkpointing via PostgresSaver) |
| Speech-to-text | OpenAI Whisper (via Hugging Face Transformers.js) |
| Text-to-speech | XTTS-v2 (production) / Piper (dev) |
| Language model | Qwen3 14B via Ollama (local) or OpenRouter (cloud) |
| Phone integration | Asterisk PBX with the ARI client |
| Browser voice | Socket.IO /voice namespace, FFmpeg audio decode, Hark.js VAD |
| Persistence | PostgreSQL — both LangGraph checkpoints and call analytics |
| Configuration | Zod-validated environment schemas |
The conversation graph splits into two LangGraph subgraphs: a patient-data collection subgraph and an appointment booking subgraph. State transitions are computed by a rule-based engine after each tool call. Every graph invocation is checkpointed — if the server restarts mid-call, state is preserved.
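The checkpoint-after-every-invocation idea can be illustrated without LangGraph. The sketch below uses an in-memory `Map` as a stand-in for the PostgreSQL-backed store; `CallState`, `CheckpointStore`, and `runStep` are hypothetical names, and this is not the `PostgresSaver` API.

```typescript
interface CallState {
  stage: string;
  collected: Record<string, string>;
}

// In-memory stand-in for a durable checkpoint store, keyed by call ID.
class CheckpointStore {
  private readonly saved = new Map<string, string>();

  save(callId: string, state: CallState): void {
    // Serialize so later mutation of `state` cannot alter the checkpoint.
    this.saved.set(callId, JSON.stringify(state));
  }

  load(callId: string): CallState | undefined {
    const raw = this.saved.get(callId);
    return raw === undefined ? undefined : (JSON.parse(raw) as CallState);
  }
}

// Run one step: resume from the last checkpoint if one exists,
// apply the step, and persist the result before returning it.
function runStep(
  store: CheckpointStore,
  callId: string,
  step: (state: CallState) => CallState,
  initial: CallState,
): CallState {
  const current = store.load(callId) ?? initial;
  const next = step(current);
  store.save(callId, next);
  return next;
}
```

With a durable store behind the same interface, a process that restarts mid-call would resume from `load(callId)` rather than from `initial`.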
Engineering decisions worth highlighting
- LangGraph over a custom state machine — gave us checkpointing, subgraphs, conditional routing, and tool integration out of the box.
- Externalized prompts — every prompt lives in YAML files with template variables. Hospital staff can adjust conversation tone or department mappings without a code deployment.
- Mock-first development — a complete mock layer of the hospital API let us build and test the system end-to-end without waiting for real API access.
- Streaming TTS — audio streams in chunks rather than waiting for full synthesis. The AI starts speaking almost immediately.
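As an illustration of the externalized-prompt idea, the function below fills template variables in a prompt string that has already been loaded from a file. The `{{name}}` placeholder syntax and the `renderPrompt` helper are assumptions made for this sketch; the case study does not specify the template format.

```typescript
// Replace {{name}} placeholders with values from `vars`.
// Throws on a missing variable so a typo in a prompt file fails loudly
// instead of sending a half-filled prompt to the model.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    const value = vars[name];
    if (value === undefined) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return value;
  });
}
```

Keeping the wording in a template and only the variable values in code is what lets non-developers adjust tone without a deployment.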
Why it matters
This project demonstrates that voice-enabled AI can fully automate complex, multi-step healthcare workflows — not just answer simple FAQs. By combining LangGraph's deterministic state management with LLM-powered natural conversation, the system delivers an experience that feels human while operating with machine precision and availability.
The architecture is modular and extensible. New booking flows (lab tests, follow-ups, multi-appointment sequences), additional languages, and new voice channels can be added without restructuring the core system.