SwiftX
HealthTech · AI Agents · Voice AI · Healthcare

AI Voice Booking Agent

Major Private Hospital Group (Middle East)

An autonomous voice AI handling end-to-end inbound appointment booking — patient identification, symptom triage, doctor and slot selection, confirmation, and execution — 24/7, in Arabic and English, with no human in the loop.

Availability: 24/7
Languages: AR · EN
Workflow: 7 stages
Hold time: 0 sec

The problem

A major private hospital group in the Middle East faced a critical operational bottleneck. Its appointment booking relied heavily on human call-center agents handling thousands of phone calls a day. The work was high-volume, multilingual (Arabic and English), restricted to business hours, and inconsistent, since every patient's experience depended on which agent picked up.

Patients didn't always know which department they needed. They'd describe symptoms ("I have chest pain") and rely on a receptionist's judgment. Manual data entry introduced errors. Out-of-hours callers had to leave messages or retry.

The approach

We built an AI agent that handles the full appointment booking workflow over a phone call — autonomously, 24/7. Three design principles drove the architecture:

  1. Conversational, not robotic — natural dialogue, no menu trees.
  2. State-machine precision — the underlying booking workflow is governed by a strict, deterministic state machine. Every booking follows the correct sequence; no steps get skipped.
  3. Fail-safe over fast — guardrails on every stage. The system never gives medical advice. It always confirms details before executing a booking.

The conversation flows through two phases:

Phase 1: Patient data collection. Greet the caller, identify them by phone number, and look them up in the hospital system. Returning patients are greeted by name; new patients are asked for their details.

Phase 2 — Appointment booking. A 7-stage workflow: department selection → date/time → doctor → review → explicit verbal confirmation → booking execution via the hospital API → completion with appointment ID.
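
The seven stages above can be sketched as a small deterministic state machine. This is a minimal illustration, not the production LangGraph graph: the stage identifiers, the `BookingState` fields, and the `advance` function are hypothetical names chosen for this example. It shows how a stage can only be left once its requirement is met, so execution is unreachable without explicit confirmation.

```typescript
// Hypothetical stage identifiers, mirroring the seven stages listed above.
const STAGES = [
  "department",
  "dateTime",
  "doctor",
  "review",
  "confirmation",
  "execution",
  "completed",
] as const;

type Stage = (typeof STAGES)[number];

// Illustrative state shape; the real system's fields are not documented here.
interface BookingState {
  stage: Stage;
  department?: string;
  dateTime?: string;
  doctor?: string;
  confirmed: boolean;
  appointmentId?: string;
}

// Each stage declares the condition that must hold before the call may move on.
const canLeave: Record<Stage, (s: BookingState) => boolean> = {
  department: (s) => s.department !== undefined,
  dateTime: (s) => s.dateTime !== undefined,
  doctor: (s) => s.doctor !== undefined,
  review: () => true, // the summary is read back, then confirmation is requested
  confirmation: (s) => s.confirmed, // explicit verbal "yes" required
  execution: (s) => s.appointmentId !== undefined, // set once the booking call succeeds
  completed: () => false, // terminal stage
};

// Moves exactly one stage forward, or returns the same state if the
// current stage's requirement is not yet satisfied.
function advance(state: BookingState): BookingState {
  if (!canLeave[state.stage](state)) return state;
  const next = STAGES[STAGES.indexOf(state.stage) + 1];
  return next === undefined ? state : { ...state, stage: next };
}
```

Because `advance` only ever steps to the next entry in `STAGES`, a stage cannot be skipped. A caller who never confirms stays in the `confirmation` stage no matter how many times the function runs.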

Symptom-to-department routing happens via the conversational LLM rather than a menu: "chest pain" routes to cardiology, "skin rash" to dermatology, and a child's fever to pediatrics.
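
One way to keep LLM-driven routing inside the guardrails is to accept the model's answer only when it names a known department. The sketch below assumes the model is prompted to reply with a single department name. The department list, `RoutingResult`, and `resolveDepartment` are hypothetical and not taken from the production system.

```typescript
// Illustrative allow-list; a real deployment would load the hospital's departments.
const DEPARTMENTS = ["cardiology", "dermatology", "pediatrics"] as const;
type Department = (typeof DEPARTMENTS)[number];

type RoutingResult =
  | { kind: "routed"; department: Department }
  | { kind: "clarify" }; // ask the caller a follow-up question instead of guessing

// Accepts the raw model output and returns a department only on an exact match
// (after trimming and lowercasing). Anything else falls back to clarification.
function resolveDepartment(modelOutput: string): RoutingResult {
  const normalized = modelOutput.trim().toLowerCase();
  const match = DEPARTMENTS.find((d) => d === normalized);
  return match ? { kind: "routed", department: match } : { kind: "clarify" };
}
```

With this shape, free-form model output such as advice or an unknown specialty never reaches the booking stages. It yields `{ kind: "clarify" }` and the conversation continues.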

Business impact

  • 24/7 availability — patients book at any hour without staffing constraints.
  • Zero hold time — the AI answers immediately and starts the conversation.
  • Consistent quality — every caller gets the same accurate, professional experience regardless of time or volume.
  • Intelligent triage — symptom-based routing eliminates the guesswork for patients who don't know which specialist they need.
  • Concurrent scale — adding capacity is scaling servers, not hiring and training.

A post-call analytics pipeline runs automatically on every conversation: sentiment analysis, category classification, problematic-call flagging, and PostgreSQL-backed dashboarding for hospital management.

Architecture highlights

| Layer | Technology |
|---|---|
| Application framework | NestJS 11 + TypeScript 5.7 |
| Conversation state machine | LangGraph (PostgreSQL checkpointing via PostgresSaver) |
| Speech-to-text | OpenAI Whisper (via Hugging Face Transformers.js) |
| Text-to-speech | XTTS-v2 (production) / Piper (dev) |
| Language model | Qwen3 14B via Ollama (local) or OpenRouter (cloud) |
| Phone integration | Asterisk PBX with the ARI client |
| Browser voice | Socket.IO /voice namespace, FFmpeg audio decode, Hark.js VAD |
| Persistence | PostgreSQL (both LangGraph checkpoints and call analytics) |
| Configuration | Zod-validated environment schemas |

The conversation graph splits into two LangGraph subgraphs: a patient-data collection subgraph and an appointment booking subgraph. State transitions are computed by a rule-based engine after each tool call. Every graph invocation is checkpointed — if the server restarts mid-call, state is preserved.
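
The checkpoint-per-invocation idea can be illustrated without LangGraph. The sketch below is not the PostgresSaver API: `CheckpointStore`, `CallState`, and `runStep` are hypothetical, and the in-memory `Map` stands in for the PostgreSQL table (only a durable store would actually survive a restart). It shows the pattern of loading the last snapshot by call ID, applying one step, and saving the result before returning.

```typescript
// Illustrative conversation state; the real graph state is richer.
interface CallState {
  stage: string;
  collected: Record<string, string>;
}

// In-memory stand-in for a durable checkpoint table keyed by call ID.
class CheckpointStore {
  private readonly snapshots = new Map<string, string>();

  save(callId: string, state: CallState): void {
    // Serialize so later mutations of `state` cannot alter the snapshot.
    this.snapshots.set(callId, JSON.stringify(state));
  }

  load(callId: string): CallState | undefined {
    const raw = this.snapshots.get(callId);
    return raw === undefined ? undefined : (JSON.parse(raw) as CallState);
  }
}

// Runs one step of the conversation: resume from the last checkpoint
// (or start fresh), apply the update, and persist before returning.
function runStep(
  store: CheckpointStore,
  callId: string,
  update: (state: CallState) => CallState,
): CallState {
  const current = store.load(callId) ?? { stage: "greeting", collected: {} };
  const next = update(current);
  store.save(callId, next);
  return next;
}
```

Because every step begins with `load`, a process that picks up the call after a restart continues from the last saved stage rather than greeting the patient again.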

Engineering decisions worth highlighting

  • LangGraph over a custom state machine — gave us checkpointing, subgraphs, conditional routing, and tool integration out of the box.
  • Externalized prompts — every prompt lives in YAML files with template variables. Hospital staff can adjust conversation tone or department mappings without a code deployment.
  • Mock-first development — a complete mock layer of the hospital API let us build and test the system end-to-end without waiting for real API access.
  • Streaming TTS — audio streams in chunks rather than waiting for full synthesis. The AI starts speaking almost immediately.
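
To make the externalized-prompts decision concrete, here is a minimal sketch of template-variable substitution. The `{{name}}` placeholder syntax and the `renderPrompt` function are assumptions for illustration; the source does not specify the template format, and YAML loading is omitted (the template is shown as an already-loaded string).

```typescript
// Replaces {{variable}} placeholders with values from `vars`.
// Throws on a missing variable so a bad prompt edit fails loudly
// instead of sending a half-rendered prompt to the model.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return vars[name];
  });
}

// Hypothetical template text, as it might appear after loading a YAML file.
const greetingTemplate =
  "Welcome back, {{ patientName }}. You are speaking with {{hospitalName}}.";

const greeting = renderPrompt(greetingTemplate, {
  patientName: "Sara",
  hospitalName: "City Hospital",
});
```

Keeping substitution this strict means staff can edit wording freely, while a typo in a variable name surfaces as an error at render time.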

Why it matters

This project demonstrates that voice-enabled AI can fully automate complex, multi-step healthcare workflows — not just answer simple FAQs. By combining LangGraph's deterministic state management with LLM-powered natural conversation, the system delivers an experience that feels human while operating with machine precision and availability.

The architecture is modular and extensible. New booking flows (lab tests, follow-ups, multi-appointment sequences), additional languages, and new voice channels can be added without restructuring the core system.
