Implementation Details

This section documents implementation details that affect perceived responsiveness.

Making batch models feel real-time.

Prepending silence to prevent hallucinations.
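As a minimal sketch of the padding step, assuming mono f32 samples and a caller-chosen pad length (the function name, sample format, and durations here are illustrative and not taken from the project):

```rust
/// Returns a new buffer with `silence_ms` milliseconds of zero-valued
/// samples placed in front of `samples`.
/// Assumption: mono f32 PCM; the real pipeline may use another format.
pub fn prepend_silence(samples: &[f32], sample_rate_hz: u32, silence_ms: u32) -> Vec<f32> {
    // Number of zero samples needed to cover the requested duration.
    let pad = (sample_rate_hz as usize * silence_ms as usize) / 1000;
    let mut out = Vec::with_capacity(pad + samples.len());
    out.resize(pad, 0.0);
    out.extend_from_slice(samples);
    out
}
```

For example, prepend_silence(&chunk, 16_000, 100) would add 1600 zero samples ahead of the chunk. The right pad length depends on the model and would need to be tuned.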

Using atomics for metrics instead of mutexes.
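A sketch of what lock-free counters could look like, using only std::sync::atomic. The Metrics struct and its field names are hypothetical, not the project's actual types:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical metrics holder; field names are illustrative only.
#[derive(Default)]
pub struct Metrics {
    chunks_processed: AtomicU64,
    total_latency_us: AtomicU64,
}

impl Metrics {
    /// Records one processed chunk without taking a lock.
    /// Relaxed ordering is enough for independent counters that do not
    /// guard access to other memory.
    pub fn record_chunk(&self, latency_us: u64) {
        self.chunks_processed.fetch_add(1, Ordering::Relaxed);
        self.total_latency_us.fetch_add(latency_us, Ordering::Relaxed);
    }

    pub fn chunks_processed(&self) -> u64 {
        self.chunks_processed.load(Ordering::Relaxed)
    }

    /// Mean latency in microseconds, or None before any chunk is recorded.
    pub fn mean_latency_us(&self) -> Option<u64> {
        let n = self.chunks_processed.load(Ordering::Relaxed);
        if n == 0 {
            return None;
        }
        Some(self.total_latency_us.load(Ordering::Relaxed) / n)
    }
}
```

The trade-off in this sketch is that the two counters are updated separately, so a reader may briefly observe a count that is one ahead of the latency total. For display-only metrics that is usually acceptable, and the audio path never blocks on a lock.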

Why we use std::thread instead of tokio::spawn for inference.
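A general reason for this choice is that CPU-bound, blocking work run on an async executor's worker threads can delay other tasks scheduled there. The sketch below shows one way to keep inference on a dedicated OS thread fed by channels. The function name, the channel layout, and the infer closure are assumptions for illustration, not the project's actual code:

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns a dedicated OS thread that runs `infer` on each audio chunk it
/// receives and sends the resulting text back.
/// `infer` stands in for the real model call (hypothetical).
pub fn spawn_inference_worker<F>(
    mut infer: F,
) -> (mpsc::Sender<Vec<f32>>, mpsc::Receiver<String>, thread::JoinHandle<()>)
where
    F: FnMut(&[f32]) -> String + Send + 'static,
{
    let (audio_tx, audio_rx) = mpsc::channel::<Vec<f32>>();
    let (text_tx, text_rx) = mpsc::channel::<String>();

    let handle = thread::spawn(move || {
        // The loop ends when every audio sender has been dropped.
        for chunk in audio_rx {
            // Stop early if nobody is listening for results any more.
            if text_tx.send(infer(&chunk)).is_err() {
                break;
            }
        }
    });

    (audio_tx, text_rx, handle)
}
```

Dropping the returned sender shuts the worker down cleanly, after which the join handle can be joined.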

Resampling and AGC for consistent input quality.
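The gain half of this step can be illustrated with a simplified per-buffer RMS normalizer. This is a stand-in for a full AGC, which would normally smooth gain changes over time; the function name, target level, and gain cap are assumptions, and resampling is not shown:

```rust
/// Scales `samples` in place so their RMS level approaches `target_rms`,
/// never applying more than `max_gain`. Returns the gain that was applied.
/// Simplified per-buffer normalization, not a production AGC.
pub fn normalize_rms(samples: &mut [f32], target_rms: f32, max_gain: f32) -> f32 {
    if samples.is_empty() {
        return 1.0;
    }
    let mean_sq = samples.iter().map(|s| s * s).sum::<f32>() / samples.len() as f32;
    let rms = mean_sq.sqrt();
    // Leave silent buffers untouched rather than dividing by ~zero.
    if rms <= f32::EPSILON {
        return 1.0;
    }
    // Cap the gain so quiet background noise is not boosted without limit.
    let gain = (target_rms / rms).min(max_gain);
    for s in samples.iter_mut() {
        // Clamp to the valid f32 PCM range to avoid overshoot.
        *s = (*s * gain).clamp(-1.0, 1.0);
    }
    gain
}
```

The gain cap matters in practice: without it, a near-silent buffer would be amplified until its noise floor reached the target level.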

Detecting when Zoom/Teams is running.

Keyboard shortcuts