Performance | Sep 30, 2025 | 1 min read | freshly reviewed

Short utterances and the hidden cost of handshakes

For 5–15 second notes, network setup time can outweigh everything else. On‑device avoids the detours.

You say “Thanks.” The network says “Hold on.”

TL;DR

For short clips, DNS/TLS and upload setup can be most of the wait.
On-device avoids the handshake tax entirely.
The “feel” of dictation is dominated by stop-to-text latency.

Cloud flows typically include several setup steps (a DNS lookup, then TCP and TLS handshakes) and at least one remote hop. For very short phrases, this setup time can dominate the wait. On‑device dictation skips all of it: your audio stays local, text appears without a network round trip, and there’s nothing to upload.
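
To make the "setup can dominate" claim concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: the names (Network, setup_ms, upload_ms, setup_share), the four round trips for a cold connection (DNS, TCP, TLS 1.3, request/response), the 32 kbps audio bitrate, and the idea that the whole clip uploads after you stop speaking. It also ignores server-side transcription time.

```python
from dataclasses import dataclass


@dataclass
class Network:
    rtt_ms: float       # assumed round-trip time to the server
    uplink_kbps: float  # assumed upload bandwidth, kilobits per second


def setup_ms(net: Network, round_trips: int = 4) -> float:
    # Assumed cold connection: DNS lookup, TCP handshake, TLS 1.3 handshake,
    # and the request/response itself, modelled as one round trip each.
    return round_trips * net.rtt_ms


def upload_ms(clip_seconds: float, net: Network, audio_kbps: float = 32.0) -> float:
    # Time to push the compressed audio over the uplink (bitrate is assumed).
    return clip_seconds * audio_kbps / net.uplink_kbps * 1000.0


def setup_share(clip_seconds: float, net: Network) -> float:
    # Fraction of the network wait spent on setup rather than on moving audio.
    setup = setup_ms(net)
    return setup / (setup + upload_ms(clip_seconds, net))


if __name__ == "__main__":
    net = Network(rtt_ms=100, uplink_kbps=1000)  # hypothetical mobile link
    for seconds in (5, 15, 300):
        print(f"{seconds:>4} s clip: setup is {setup_share(seconds, net):.0%} of network wait")
```

Under these assumptions, setup is most of the network wait for a 5-second note and a small fraction for a 5-minute session. On-device, both terms drop out because nothing crosses the network.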

See the effect in the interactive demo (choose Short and try different networks): /blog/latency-demo
