Bug or a feature in streaming_utils.py BatchedFrameASRTDT? Is it possible to run parakeet-tdt-0.6b-v2 in streaming mode? #13763
Replies: 3 comments
-
It is difficult, but possible. Someone has converted the model to lightweight integer models: https://github.com/k2-fsa/sherpa-onnx/tree/master/scripts/nemo/parakeet-tdt-0.6b-v2 . I made an example, https://github.com/deepanshu-yadav/voice-form-filler , showing how to wrap this in a server that receives audio packets and returns the final inference. It works okay on my laptop without any GPU.
-
There is a bug in the script.
-
Definitely possible, without any GPU, even on a phone. A pre-built Android APK is available for this model, along with more APKs, and everything is open-sourced.
-
In this Hugging Face discussion, https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2/discussions/3 , it is said that it is possible to use Parakeet TDT in streaming mode, and that an inference bug was detected and corrected. I ran this script but get empty strings as predictions.
# Evaluate Parakeet in streaming mode
python examples/asr/asr_chunked_inference/rnnt/speech_to_text_buffered_infer_rnnt.py \
    model_path="parakeet_downlaoded/parakeet-tdt-0.6b-v2.nemo" \
    audio_dir="audio_dir" \
    output_filename="output.json" \
    chunk_len_in_secs=0.2 \
    total_buffer_in_secs=0.8 \
    model_stride=4 \
    batch_size=1
Pulled from main (commit 259d684e73c45091f0b6144342133e6ceb7e824c) and installed with pip install '.[all]'.
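To sanity-check the parameters in the command above, here is a minimal sketch of the frame arithmetic that buffered RNNT inference depends on. It does not import NeMo. The function name `frame_math`, the 10 ms feature window stride, and the two formulas are assumptions based on how I understand the buffered-inference script, not its actual code. FastConformer-based models such as Parakeet are usually described as using 8x subsampling, so the sketch compares `model_stride=4` with `model_stride=8`.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def frame_math(chunk_ms: int, buffer_ms: int, model_stride: int,
               window_stride_ms: int = 10) -> dict:
    """Hypothetical helper: derive per-chunk frame counts from the CLI values.

    window_stride_ms=10 is an assumed preprocessor setting; check the
    model's own config for the real value.
    """
    # Duration of one encoder output frame after subsampling.
    frame_ms = window_stride_ms * model_stride
    return {
        "frame_ms": frame_ms,
        # Encoder frames that belong to one new chunk.
        "tokens_per_chunk": ceil_div(chunk_ms, frame_ms),
        # Frames from buffer start to the end of the chunk, assuming the
        # chunk is centered: ceil((chunk + (buffer - chunk) / 2) / frame).
        "mid_delay": ceil_div(chunk_ms + buffer_ms, 2 * frame_ms),
        # False means the chunk does not cover a whole number of frames.
        "chunk_is_whole_frames": chunk_ms % frame_ms == 0,
    }


if __name__ == "__main__":
    # Values from the command: chunk_len_in_secs=0.2, total_buffer_in_secs=0.8
    for stride in (4, 8):
        print(f"model_stride={stride}:", frame_math(200, 800, stride))
```

Under these assumptions, a 0.2 s chunk is exactly 5 frames at stride 4 but only 2.5 frames at stride 8. So if the model really uses 8x subsampling, it may be worth trying `model_stride=8` with a chunk length that is a multiple of 0.08 s.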