Healthcare AI has unique constraints. Patient data is sacred: it is regulated by HIPAA, and for this project it could not be sent to any external cloud API. When healthcare providers asked for an AI assistant to help clinicians review patient records, we had to build something that runs entirely within hospital walls.
The Challenge
The requirements were clear but demanding:
- Zero external API calls — No OpenAI, no Anthropic, no cloud LLMs
- Multimodal capabilities — Process both text records and medical images
- HIPAA compliance — Encryption at rest, access controls, full audit logging
- Real-time performance — Clinicians can't wait 30 seconds for a response
- Easy to use — Doctors aren't engineers
The Architecture
Frontend: Next.js 15
We chose Next.js for its server-side rendering capabilities and the ability to deploy as a self-contained application within the hospital's private VPC. The UI needed to be intuitive — clinicians would use it between patient visits.
Backend: FastAPI
FastAPI handles the inference pipeline:
- Receives patient record queries from the frontend
- Orchestrates the local LLM calls
- Manages the vector search pipeline
- Enforces access controls and audit logging
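The four responsibilities above can be sketched as a single request handler. This is a minimal, framework-agnostic sketch rather than the actual service code: `handle_query`, `ROLE_PERMISSIONS`, the role names, and the injected `retrieve`, `generate`, and `audit` callables are all hypothetical names introduced for illustration. In a FastAPI app, a function like this would be called from a route handler.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical role-to-permission map; real roles would come from hospital SSO.
ROLE_PERMISSIONS = {
    "physician": {"query_records"},
    "nurse": {"query_records"},
    "billing": set(),
}


@dataclass
class QueryResult:
    answer: str
    sources: List[str]


def handle_query(
    user_id: str,
    role: str,
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    audit: Callable[[str, str, str], None],
) -> QueryResult:
    """Check access, retrieve context, call the local model, and log the outcome."""
    if "query_records" not in ROLE_PERMISSIONS.get(role, set()):
        # Denied requests are logged too, so the audit trail covers every attempt.
        audit(user_id, "query_denied", question)
        raise PermissionError(f"role {role!r} may not query records")

    passages = retrieve(question)          # vector search step
    context = "\n".join(passages)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    answer = generate(prompt)              # local LLM call
    audit(user_id, "query_answered", question)
    return QueryResult(answer=answer, sources=passages)
```

Passing the retrieval, generation, and audit steps in as callables keeps the handler testable without a GPU or a database, which is one reasonable way to structure it, not necessarily how the production service is organized.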
Local LLMs: Ollama
Instead of external APIs, we host models locally via Ollama:
- Gemma 3 for text understanding and clinical note summarization
- LLaVA for medical image analysis (X-rays, charts, handwritten notes)
- Both models run on hospital-grade GPU hardware within the private network
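As an illustration of what a local model call can look like, here is a minimal sketch that talks to Ollama's `/api/generate` endpoint using only the Python standard library. The URL assumes Ollama's default port on the same host, and the model tags used in the usage note (`gemma3`, `llava`) are assumptions about how the models were pulled; `build_payload` and `generate` are hypothetical helper names, not part of any library.

```python
import base64
import json
import urllib.request
from typing import Optional

# Assumption: Ollama is listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, image_bytes: Optional[bytes] = None) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Multimodal models accept a list of base64-encoded images.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(model: str, prompt: str, image_bytes: Optional[bytes] = None,
             timeout: float = 30.0) -> str:
    """Send the request to the local Ollama server and return the generated text."""
    body = json.dumps(build_payload(model, prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

A text call would then look like `generate("gemma3", "Summarize this clinical note: <note text>")`, and an image call like `generate("llava", "Describe this image.", image_bytes=data)`. Because the request never leaves `localhost`, no patient data crosses the network boundary in this sketch.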
Vector Search: ChromaDB
De-identified medical documents are indexed in ChromaDB for semantic search:
- Documents are chunked and embedded locally
- Retrieval augments LLM context for accurate, grounded responses
- No patient-identifying information in the vector store
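The chunk, embed, and retrieve steps can be sketched as follows. This is an illustration under stated assumptions, not the project's indexing code: the word-based chunk sizes, the collection name `clinical_docs`, the storage path, and the helper names are all hypothetical, and `embed` stands in for whatever local embedding function is used. Embeddings are passed to Chroma explicitly so that nothing depends on a downloaded default embedding model.

```python
from typing import Callable, List, Sequence

Embedder = Callable[[Sequence[str]], List[List[float]]]


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def index_document(collection, doc_id: str, text: str, embed: Embedder) -> int:
    """Chunk a de-identified document, embed it locally, and add it to the collection."""
    chunks = chunk_text(text)
    if not chunks:
        return 0
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embed(chunks),
        metadatas=[{"doc_id": doc_id, "chunk": i} for i in range(len(chunks))],
    )
    return len(chunks)


def retrieve(collection, question: str, embed: Embedder, k: int = 4) -> List[str]:
    """Return the k stored chunks closest to the question."""
    result = collection.query(query_embeddings=embed([question]), n_results=k)
    return result["documents"][0]


def open_collection(path: str = "./chroma", name: str = "clinical_docs"):
    """Open an on-disk Chroma collection (path and name are placeholders)."""
    import chromadb  # imported lazily so the helpers above run without it installed

    client = chromadb.PersistentClient(path=path)
    return client.get_or_create_collection(name=name)
```

The chunks returned by `retrieve` are what gets prepended to the prompt as context. De-identification has to happen before `index_document` is called; this sketch does not attempt it.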
Data Security: SQLCipher
Patient notes and session data are stored in SQLCipher-encrypted SQLite:
- AES-256 encryption at rest
- Key rotation managed by hospital IT
- Full CRUD audit logging for compliance
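To make the audit-logging idea concrete, here is a minimal sketch in which every note operation writes an audit row in the same transaction. The table layout and function names are hypothetical. The example uses Python's standard `sqlite3` module so it runs anywhere; with a SQLCipher binding (for example the `sqlcipher3` package, an assumption about the driver), the connection would be opened through that module and `PRAGMA key` issued before any other statement.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Hypothetical schema: one table for notes, one append-only table for audit rows.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    patient_ref TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS audit_log (
    id INTEGER PRIMARY KEY,
    at TEXT NOT NULL,
    user_id TEXT NOT NULL,
    action TEXT NOT NULL,
    note_id INTEGER
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def _audit(conn: sqlite3.Connection, user_id: str, action: str, note_id: Optional[int]) -> None:
    conn.execute(
        "INSERT INTO audit_log (at, user_id, action, note_id) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), user_id, action, note_id),
    )


def create_note(conn: sqlite3.Connection, user_id: str, patient_ref: str, body: str) -> int:
    with conn:  # the write and its audit row commit (or roll back) together
        cur = conn.execute(
            "INSERT INTO notes (patient_ref, body) VALUES (?, ?)", (patient_ref, body)
        )
        note_id = cur.lastrowid
        _audit(conn, user_id, "create", note_id)
    return note_id


def read_note(conn: sqlite3.Connection, user_id: str, note_id: int) -> Optional[str]:
    with conn:  # reads are audited as well, including reads of missing notes
        row = conn.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()
        _audit(conn, user_id, "read", note_id)
    return row[0] if row else None
```

Update and delete would follow the same pattern. Putting the audit insert inside the same transaction as the data change means a note can never be modified without a matching log entry.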
The HIPAA Risk Assessment
Building the technology was half the battle. We also conducted a full HIPAA risk assessment:
- Encryption at rest — All data encrypted with AES-256 via SQLCipher
- Encryption in transit — TLS 1.3 within the private VPC
- Access controls — Role-based access with hospital SSO integration
- Audit logging — Every query, every response, every access logged
- Data minimization — Only de-identified data in the vector store
- Breach notification — Automated alerting pipeline for anomalous access patterns
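One simple form that alerting on anomalous access patterns can take is a sliding-window rate check per user. The sketch below is illustrative only: `AccessMonitor`, the threshold of 50 accesses, and the 10-minute window are hypothetical, and a real pipeline would likely combine several signals rather than rely on a single counter.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class AccessMonitor:
    """Flag users whose record accesses exceed a limit within a sliding time window."""

    def __init__(self, max_accesses: int = 50, window_seconds: float = 600.0):
        self.max_accesses = max_accesses
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, user_id: str, timestamp: float) -> bool:
        """Record one access; return True if this user is now over the limit."""
        events = self._events[user_id]
        events.append(timestamp)
        # Drop accesses that have fallen out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_accesses
```

A caller would feed `record` from the audit log stream and raise an alert to the security team whenever it returns `True`.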
Results
- Zero patient data exposure — All inference runs locally, verified by third-party audit
- 30% reduction in chart-review time for physicians in the pilot study
- Real-time clinical suggestions with sub-3-second response times
- Featured in the company's internal Innovation Spotlight newsletter
Key Takeaways
- Local-first AI is viable — Modern SLMs run well on hospital-grade hardware
- Compliance is a feature — Document it, audit it, prove it
- Multimodal matters — Healthcare involves images, handwriting, and structured data
- User experience trumps everything — A clinician won't adopt a tool that slows them down
- Start with the constraints — HIPAA shaped every architectural decision for the better