Implementation Details

This section documents implementation details that affect perceived responsiveness.

Simulated Streaming

Making batch models feel real-time.
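One way to get this effect, sketched under assumptions, is to keep appending audio to a buffer and periodically re-run the batch transcriber over everything received so far, surfacing each result as a partial transcript. The `SimulatedStream` type, the `partial_every` threshold, and the closure signature below are hypothetical names chosen for illustration; they are not taken from the project.

```rust
/// Hypothetical wrapper that makes a batch transcriber emit partial results.
/// `F` stands in for any batch model call that maps samples to text.
pub struct SimulatedStream<F: FnMut(&[f32]) -> String> {
    transcribe: F,
    buffer: Vec<f32>,
    samples_since_partial: usize,
    partial_every: usize,
}

impl<F: FnMut(&[f32]) -> String> SimulatedStream<F> {
    /// `partial_every` is the number of new samples that triggers a re-run.
    pub fn new(transcribe: F, partial_every: usize) -> Self {
        Self {
            transcribe,
            buffer: Vec::new(),
            samples_since_partial: 0,
            partial_every,
        }
    }

    /// Appends a chunk and returns a partial transcript once enough
    /// new audio has accumulated since the last one.
    pub fn push(&mut self, chunk: &[f32]) -> Option<String> {
        self.buffer.extend_from_slice(chunk);
        self.samples_since_partial += chunk.len();
        if self.samples_since_partial >= self.partial_every {
            self.samples_since_partial = 0;
            // Re-transcribe the whole buffer, not just the new chunk.
            Some((self.transcribe)(&self.buffer))
        } else {
            None
        }
    }

    /// Runs one last pass over the full buffer for the final transcript.
    pub fn finish(mut self) -> String {
        (self.transcribe)(&self.buffer)
    }
}
```

Re-transcribing the whole buffer costs more per partial as the utterance grows, so a smaller `partial_every` trades extra compute for more frequent updates.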

Silence Injection

Prepending silence to prevent hallucinations.
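As a minimal sketch, prepending silence amounts to placing a run of zero-valued samples in front of the captured audio before it reaches the model. The function name and the millisecond parameter are assumptions made for this example; the padding length used in practice is not stated here.

```rust
/// Returns a copy of `samples` with `silence_ms` of zeros placed in front.
/// Hypothetical helper: the name and parameters are illustrative only.
pub fn prepend_silence(samples: &[f32], sample_rate_hz: u32, silence_ms: u32) -> Vec<f32> {
    // Number of zero samples needed for the requested duration.
    let pad = (sample_rate_hz as u64 * silence_ms as u64 / 1000) as usize;
    let mut out = Vec::with_capacity(pad + samples.len());
    out.resize(pad, 0.0);
    out.extend_from_slice(samples);
    out
}
```

For example, 100 ms at 16 kHz adds 1600 zero samples ahead of the original audio.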

Lock-Free Metrics

Using atomics for metrics instead of mutexes.
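A minimal sketch of the idea, using only `std::sync::atomic`: each metric is an `AtomicU64` updated with `fetch_add`, so the hot path never takes a lock. The `Metrics` struct and its field names are hypothetical and chosen for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical metrics holder; share it between threads behind an `Arc`.
#[derive(Default)]
pub struct Metrics {
    chunks_processed: AtomicU64,
    total_latency_us: AtomicU64,
}

impl Metrics {
    /// Records one processed chunk without locking.
    pub fn record_chunk(&self, latency_us: u64) {
        // Relaxed is enough for independent counters that do not
        // guard access to other memory.
        self.chunks_processed.fetch_add(1, Ordering::Relaxed);
        self.total_latency_us.fetch_add(latency_us, Ordering::Relaxed);
    }

    /// Mean latency so far, or `None` before the first chunk.
    /// The two loads are not one atomic snapshot, so under concurrent
    /// writers the result is approximate.
    pub fn mean_latency_us(&self) -> Option<u64> {
        let n = self.chunks_processed.load(Ordering::Relaxed);
        if n == 0 {
            None
        } else {
            Some(self.total_latency_us.load(Ordering::Relaxed) / n)
        }
    }
}
```

The trade-off is that readers may see a slightly inconsistent pair of values, which is usually acceptable for metrics and is the price of keeping writers wait-free.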

Threading Model

Why we use std::thread instead of tokio::spawn for inference.
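A common reason for this choice is that inference is CPU-bound and blocks for a long time, which would stall an async executor's worker threads if run as an ordinary task. The sketch below shows the general pattern with a dedicated `std::thread` fed through channels; the function name, the channel types, and the `infer` closure are assumptions for illustration, not the project's actual API.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns a dedicated OS thread that runs `infer` on each submitted buffer.
/// Hypothetical helper: `infer` stands in for a blocking model call.
pub fn spawn_inference_worker<F>(
    mut infer: F,
) -> (
    mpsc::Sender<Vec<f32>>,
    mpsc::Receiver<String>,
    thread::JoinHandle<()>,
)
where
    F: FnMut(&[f32]) -> String + Send + 'static,
{
    let (job_tx, job_rx) = mpsc::channel::<Vec<f32>>();
    let (result_tx, result_rx) = mpsc::channel::<String>();

    let handle = thread::Builder::new()
        .name("inference".into())
        .spawn(move || {
            // The loop ends when every job sender has been dropped.
            for audio in job_rx {
                let text = infer(&audio);
                // Stop if nobody is listening for results any more.
                if result_tx.send(text).is_err() {
                    break;
                }
            }
        })
        .expect("failed to spawn inference thread");

    (job_tx, result_rx, handle)
}
```

Dropping the job sender shuts the worker down cleanly, after which the returned handle can be joined.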

Audio Hygiene

Resampling and AGC for consistent input quality.
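The two steps can be sketched as follows, with loud caveats: this uses linear interpolation, which aliases when downsampling without a low-pass filter, and a single-block RMS normalization rather than a true AGC that tracks level over time. Both function names, the target RMS, and the gain cap are assumptions for illustration.

```rust
/// Resamples by linear interpolation. Illustrative only: a production
/// resampler would low-pass filter before reducing the sample rate.
pub fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            // Fractional position of output sample `i` in the input.
            let pos = i as f64 * step;
            let idx = pos as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

/// Scales a block toward `target_rms`, capping the gain at `max_gain`
/// so near-silent input is not amplified into noise.
pub fn apply_gain_to_target_rms(samples: &mut [f32], target_rms: f32, max_gain: f32) {
    if samples.is_empty() {
        return;
    }
    let mean_sq = samples.iter().map(|s| s * s).sum::<f32>() / samples.len() as f32;
    let rms = mean_sq.sqrt();
    if rms <= f32::EPSILON {
        return; // Effectively silent: leave untouched.
    }
    let gain = (target_rms / rms).min(max_gain);
    for s in samples.iter_mut() {
        // Clamp to the valid range to avoid clipping overflow.
        *s = (*s * gain).clamp(-1.0, 1.0);
    }
}
```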

Meeting Detection

Detecting when Zoom/Teams is running.
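One possible approach, sketched here, is to compare the names of running processes against a short list of known meeting clients. The process names matched below are assumptions that vary by platform and app version, and the process list itself is taken as input because enumerating processes is OS-specific. `MeetingApp` and `detect_meeting_app` are hypothetical names.

```rust
/// Meeting clients this sketch knows about.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum MeetingApp {
    Zoom,
    Teams,
}

/// Returns the first meeting app found among `process_names`.
/// The matched names are assumed examples, not a verified list.
pub fn detect_meeting_app<'a, I>(process_names: I) -> Option<MeetingApp>
where
    I: IntoIterator<Item = &'a str>,
{
    for name in process_names {
        // Compare case-insensitively and ignore a Windows ".exe" suffix.
        let lower = name.to_ascii_lowercase();
        let stem = lower.strip_suffix(".exe").unwrap_or(lower.as_str());
        match stem {
            "zoom" | "zoom.us" => return Some(MeetingApp::Zoom),
            "teams" | "ms-teams" | "microsoft teams" => return Some(MeetingApp::Teams),
            _ => {}
        }
    }
    None
}
```

Matching on exact names avoids false positives from unrelated processes that merely contain "zoom" or "teams", at the cost of needing updates when a client renames its binary.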