If the input is messy, the output will be too.
Voice Type normalizes loudness to a consistent target and applies a light high pass filter to reduce low frequency rumble. Combined with noise aware voice activity detection, this gives the model input closer to the audio it was trained on which leads to fewer garbles and better punctuation.
We avoid heavy “prompt fixes” that can make transcripts look confident but less faithful. Instead, we improve the signal before recognition.
Related articles
A straight look at punctuation and spacing. What we shipped to make it better, how to test it, and a few habits that help on macOS.
Why streaming on‑device and finalizing only the last ~30 seconds keeps long dictations responsive.
