wav2vec2-large-xlsr-53-portuguese: real-time streaming inference with frame-level buffering
automatic-speech-recognition model. 3,902,956 downloads.
Unique: The checkpoint itself is designed for batch/offline inference, so streaming support requires a custom implementation on top of the base model: developers must handle chunk buffering, context management, and partial-output merging themselves using the underlying transformer architecture.
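A minimal sketch of the buffering layer described above, assuming an offline wav2vec2-style model. The `transcribe_fn` callable is an assumption: in practice it would wrap a `Wav2Vec2Processor` + `Wav2Vec2ForCTC` forward pass with CTC decoding. Incoming frames accumulate until a chunk fills; each chunk is transcribed together with a short left-context tail from the previous chunk, which is the "context management" step.

```python
import numpy as np

class ChunkBuffer:
    """Frame-level buffering for pseudo-streaming ASR on an offline model.

    `transcribe_fn` is a hypothetical callable mapping a float32 waveform
    to a partial transcript (e.g. a wav2vec2 forward pass + CTC decode).
    """

    def __init__(self, transcribe_fn, sample_rate=16000,
                 chunk_s=2.0, context_s=0.5):
        self.transcribe = transcribe_fn
        self.chunk = int(chunk_s * sample_rate)      # samples per inference chunk
        self.context = int(context_s * sample_rate)  # left-context carried over
        self.buffer = np.zeros(0, dtype=np.float32)  # pending, untranscribed audio
        self.tail = np.zeros(0, dtype=np.float32)    # end of previous chunk

    def push(self, frames):
        """Feed new audio; return partial transcripts for each filled chunk."""
        self.buffer = np.concatenate([self.buffer, frames.astype(np.float32)])
        outputs = []
        while len(self.buffer) >= self.chunk:
            chunk = self.buffer[:self.chunk]
            self.buffer = self.buffer[self.chunk:]
            # Prepend left context so the model sees past acoustic frames.
            window = np.concatenate([self.tail, chunk])
            outputs.append(self.transcribe(window))
            self.tail = chunk[-self.context:]
        return outputs

    def flush(self):
        """Transcribe whatever remains at end of stream."""
        if len(self.buffer) == 0:
            return []
        window = np.concatenate([self.tail, self.buffer])
        self.buffer = np.zeros(0, dtype=np.float32)
        return [self.transcribe(window)]
```

The overlapping left context reduces boundary errors, but the caller still has to merge overlapping partial outputs (for CTC, typically by trimming tokens emitted in the context region), which this sketch leaves to `transcribe_fn`.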
vs others: More flexible than commercial streaming APIs (Google Cloud Speech-to-Text, Azure Speech Services), which hide implementation details, and lower-latency than sending full audio to a cloud API; however, it demands more engineering effort than a purpose-built streaming ASR model (e.g., a Conformer-based model with native streaming support).