Support Portuguese and German ASR models from NeMo #2394
Caution: Review failed. The pull request is closed.
Walkthrough
The updates introduce support for new Portuguese and German FastConformer hybrid models in both CTC and transducer variants, including int8 quantized versions. Workflow scripts, metadata handling, and model configuration logic are updated to accommodate these models. New scripts automate export, organization, and testing for these additions, and file tracking is expanded to cover audio files.
Sequence Diagram(s)
sequenceDiagram
participant Workflow
participant Script
participant Exporter
participant Organizer
participant Tester
Workflow->>Script: Trigger run-ctc-non-streaming-2.sh / run-transducer-non-streaming-2.sh
Script->>Exporter: Export ONNX model (Portuguese/German, CTC/Transducer, int8)
Exporter-->>Script: ONNX model files, tokens
Script->>Organizer: Move files to model-specific directories
Script->>Organizer: Download and place test WAV files
Script->>Tester: Run Python test script on int8 model with test audio
Tester-->>Script: Output inference results
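The "move files to model-specific directories" step in the diagram can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `organize_export`, the file patterns, and the directory layout are hypothetical and are not taken from the PR's shell scripts.

```python
import shutil
from pathlib import Path


def organize_export(src_dir: Path, dst_dir: Path,
                    patterns=("*.onnx", "tokens.txt")) -> list[str]:
    """Move exported artifacts from src_dir into a model-specific dst_dir.

    Hypothetical helper: the patterns and layout are assumptions,
    not the behavior of the scripts added in this PR.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for pattern in patterns:
        # sorted() keeps the order deterministic across platforms
        for f in sorted(src_dir.glob(pattern)):
            shutil.move(str(f), str(dst_dir / f.name))
            moved.append(f.name)
    return moved
```

Files that match none of the patterns stay in the source directory, so stray logs from the export are not copied into the released model folder.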
Pull Request Overview
This PR adds support for Portuguese and German ASR models from NeMo. Summary of changes:
- Add support for Portuguese and German ASR models in both transducer and CTC modes.
- Introduce new helper scripts and update export logic for ONNX metadata URLs.
- Update documentation, APK generation script, and CI workflows to include the new models.
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
sherpa-onnx/kotlin-api/OfflineRecognizer.kt | Added support for pt/de model indices 35–38 with corresponding transducer and CTC configurations. |
scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-transducer-non-streaming-2.sh | New export and test script for non-streaming transducer models (pt & de). |
scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-ctc-non-streaming-2.sh | New export and test script for non-streaming CTC models (pt & de). |
scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-transducer-non-streaming.py | Conditional metadata URL logic added for huggingface vs NGC. |
scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-ctc-non-streaming.py | Same conditional URL logic for the CTC export. |
scripts/nemo/fast-conformer-hybrid-transducer-ctc/README.md | Documented the new Portuguese and German model URLs. |
scripts/apk/generate-vad-asr-apk-script.py | Extended APK-generation script to include the new pt/de models. |
.github/workflows/export-nemo-fast-conformer-hybrid-transducer-transducer-non-streaming.yaml | Updated workflow to run new export script and add new model names. |
.github/workflows/export-nemo-fast-conformer-hybrid-transducer-ctc-non-streaming.yaml | Similarly updated workflow for CTC export. |
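The "conditional metadata URL logic" mentioned for the two export scripts might look roughly like the sketch below. It is not the code from the PR: the function name `metadata_url`, the rule "a slash in the name means a Hugging Face repo id", and the NGC URL pattern are all assumptions made for illustration.

```python
def metadata_url(model_name: str) -> str:
    """Return a URL to store in the ONNX metadata for a model name.

    Assumption: Hugging Face repo ids have the form "org/name", while
    NGC model names contain no slash. The NGC URL pattern below is also
    an assumption and should be checked against the NGC catalog.
    """
    if "/" in model_name:
        # Looks like a Hugging Face repo id
        return f"https://huggingface.co/{model_name}"
    # Otherwise treat it as an NGC catalog model name
    return (
        "https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/"
        f"{model_name}"
    )
```

Keeping the branch in one small function would let the transducer and CTC export scripts share the same rule instead of duplicating it.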
Comments suppressed due to low confidence (1)
.github/workflows/export-nemo-fast-conformer-hybrid-transducer-transducer-non-streaming.yaml:83
- [nitpick] Including `*.wav` in LFS tracking will push test audio files to the remote. If these are only for CI testing, consider excluding them to avoid bloating the repository.
git lfs track "*.onnx" "*.wav"
…transducer-non-streaming.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ctc-non-streaming.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…streaming-2.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
You can try them at
https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition
or download pre-built APKs and run them on your Android devices.