Implementation Details
This section documents the implementation choices that affect perceived responsiveness.
Simulated Streaming
Making batch models feel real-time.
Silence Injection
Prepending silence to prevent hallucinations.
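As an illustration of the idea, here is a minimal sketch of prepending silence to a buffer before it reaches the model. The function name `prepend_silence`, the mono `f32` sample format, and the sample rate and padding duration passed in are all assumptions made for this example, not values taken from the project.

```rust
/// Hypothetical helper: returns a new buffer with `silence_ms` milliseconds
/// of zero-valued samples placed in front of the original mono audio.
fn prepend_silence(samples: &[f32], sample_rate_hz: u32, silence_ms: u32) -> Vec<f32> {
    // Number of zero samples that correspond to the requested duration.
    let pad_len = (sample_rate_hz as usize * silence_ms as usize) / 1000;

    let mut padded = Vec::with_capacity(pad_len + samples.len());
    padded.resize(pad_len, 0.0); // leading silence
    padded.extend_from_slice(samples); // original audio, unchanged
    padded
}
```

Calling `prepend_silence(&audio, 16_000, 100)` would add 1,600 zero samples ahead of the audio. The right duration depends on the model and should be tuned empirically.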
Lock-Free Metrics
Using atomics for metrics instead of mutexes.
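The general pattern can be sketched as follows. The struct `Metrics` and its field names are hypothetical and chosen only for illustration; the point is that each counter is an `AtomicU64` updated with `fetch_add`, so recording a sample never blocks on a lock.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical metrics struct: every field is an atomic counter,
/// so it can be shared by reference across threads without a mutex.
#[derive(Default)]
struct Metrics {
    chunks_processed: AtomicU64,
    total_latency_us: AtomicU64,
}

impl Metrics {
    /// Record one processed chunk. Relaxed ordering is enough here because
    /// the counters are statistics and do not guard other memory.
    fn record_chunk(&self, latency_us: u64) {
        self.chunks_processed.fetch_add(1, Ordering::Relaxed);
        self.total_latency_us.fetch_add(latency_us, Ordering::Relaxed);
    }

    /// Mean latency in microseconds, or `None` if nothing was recorded yet.
    fn mean_latency_us(&self) -> Option<u64> {
        let n = self.chunks_processed.load(Ordering::Relaxed);
        if n == 0 {
            return None;
        }
        Some(self.total_latency_us.load(Ordering::Relaxed) / n)
    }
}
```

One trade-off of this design is that the two counters are updated independently, so a reader may briefly observe a count that is one ahead of the latency total. For dashboards and logs that is usually acceptable; values that must be read as a consistent pair would need a different approach.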
Threading Model
Why we use std::thread instead of tokio::spawn for inference.
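A common motivation for this choice is that inference is a long, blocking, CPU-bound call, and running it inside an async task would occupy a runtime worker thread for the whole call. The sketch below shows one way a dedicated OS thread can own the inference loop and communicate over channels. `run_inference` and `spawn_inference_worker` are hypothetical names, and the placeholder body stands in for a real model call.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a blocking model call.
fn run_inference(chunk: &[f32]) -> String {
    format!("{} samples", chunk.len())
}

/// Spawn a dedicated OS thread that owns the inference loop.
/// Audio chunks go in through the sender; results come out of the receiver.
fn spawn_inference_worker() -> (
    mpsc::Sender<Vec<f32>>,
    mpsc::Receiver<String>,
    thread::JoinHandle<()>,
) {
    let (audio_tx, audio_rx) = mpsc::channel::<Vec<f32>>();
    let (text_tx, text_rx) = mpsc::channel::<String>();

    let handle = thread::Builder::new()
        .name("inference".into())
        .spawn(move || {
            // The loop ends once every audio sender has been dropped.
            for chunk in audio_rx {
                if text_tx.send(run_inference(&chunk)).is_err() {
                    break; // the consumer went away, so stop working
                }
            }
        })
        .expect("failed to spawn inference thread");

    (audio_tx, text_rx, handle)
}
```

Dropping the audio sender shuts the worker down cleanly, after which `handle.join()` returns. Async code can hand work to this thread through the channel without ever blocking the runtime on the model itself.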
Audio Hygiene
Resampling and AGC for consistent input quality.
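To make the AGC half concrete, here is a deliberately simple block-level gain sketch. It is an assumption about the general technique, not the project's actual algorithm: the function name `apply_agc`, the RMS target, and the gain cap are all illustrative, and a production AGC would normally smooth the gain over time instead of recomputing it per block.

```rust
/// Hypothetical block-level AGC: scales `samples` in place so their RMS
/// moves toward `target_rms`, never amplifying by more than `max_gain`.
/// Returns the gain that was applied.
fn apply_agc(samples: &mut [f32], target_rms: f32, max_gain: f32) -> f32 {
    if samples.is_empty() {
        return 1.0;
    }

    let mean_sq = samples.iter().map(|s| s * s).sum::<f32>() / samples.len() as f32;
    let rms = mean_sq.sqrt();

    // Leave effectively silent blocks untouched instead of amplifying noise.
    if rms <= f32::EPSILON {
        return 1.0;
    }

    let gain = (target_rms / rms).min(max_gain);
    for s in samples.iter_mut() {
        // Clamp so that boosted peaks stay inside the valid [-1, 1] range.
        *s = (*s * gain).clamp(-1.0, 1.0);
    }
    gain
}
```

The gain cap matters for quiet input: without it, a near-silent block would be boosted until background noise reaches the target level.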
Meeting Detection
Detecting when Zoom/Teams is running.