Long sessions: uploads vs 30‑second windowing

Why streaming on‑device and finalizing only the last ~30 seconds keeps long dictations responsive.

Voice Type blog, 30 Sept 2025

Half an hour of clean audio is not a “quick upload.”

Uploading long, high‑quality audio takes time, especially on variable wifi. Many cloud tools avoid heavy compression to protect accuracy, which increases upload size.

Voice Type stays on‑device, streams continuously, and when you stop, finishes only the last ~30s window (≈2–3s on an M1). That’s why long sessions feel snappy in practice.

Explore the difference (choose Medium or Long in the demo): /blog/latency-demo
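
To put rough numbers on the comparison, here is a back-of-the-envelope sketch in Python. It is not Voice Type's implementation. The audio format (16 kHz, 16-bit mono PCM, i.e. 256 kbps) and the 5 Mbps uplink are illustrative assumptions, as are the function names. The 2.5 s figure is simply the midpoint of the ≈2–3s quoted above, and scaling it linearly for tails shorter than 30 s is also an assumption. Server-side processing time for the cloud path is not modeled.

```python
# Rough wait-after-stop estimate: upload the whole recording vs. finalize only the tail.
# All constants are illustrative assumptions, not measured Voice Type figures.

ASSUMED_BITRATE_KBPS = 256.0      # 16 kHz * 16-bit * mono, uncompressed PCM (assumption)
ASSUMED_UPLINK_MBPS = 5.0         # hypothetical home wifi uplink (assumption)
WINDOW_S = 30.0                   # final window length described in the post
ASSUMED_SECONDS_PER_WINDOW = 2.5  # midpoint of the quoted ~2-3 s on an M1


def upload_seconds(duration_s: float,
                   bitrate_kbps: float = ASSUMED_BITRATE_KBPS,
                   uplink_mbps: float = ASSUMED_UPLINK_MBPS) -> float:
    """Time to upload the full recording at a steady uplink rate."""
    size_bits = duration_s * bitrate_kbps * 1000
    return size_bits / (uplink_mbps * 1_000_000)


def tail_finalize_seconds(duration_s: float,
                          window_s: float = WINDOW_S,
                          seconds_per_window: float = ASSUMED_SECONDS_PER_WINDOW) -> float:
    """Time to finish only the last window; independent of total session length."""
    tail_s = min(duration_s, window_s)
    # Linear scaling for short tails is a simplifying assumption.
    return seconds_per_window * tail_s / window_s


if __name__ == "__main__":
    for minutes in (1, 10, 30):
        d = minutes * 60
        print(f"{minutes:>2} min: upload ~{upload_seconds(d):.1f}s, "
              f"tail finalize ~{tail_finalize_seconds(d):.1f}s")
```

Under these assumptions a 30-minute session is about 57.6 MB and takes roughly 92 seconds to upload, while the tail-finalize cost stays flat however long the session runs. A slower or less stable connection widens the gap, and heavier compression narrows it.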