[Team B] Sherpa AI backend #82
Conversation
Been struggling with offline diarization, so I just tried pushing the actual models for the speech-to-text. That push got rejected due to file size, so I'm going to go ahead and mark this as ready for review. If anyone wants to test the speech-to-text, you will have to manually download the required models and extract them into the assets folder.
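A note for anyone testing: native ASR code reads models from plain file paths, so assets bundled via the pubspec generally have to be copied out of the Flutter asset bundle to disk before they can be opened. A minimal sketch of that step, assuming the standard path_provider package (the helper name and target directory are illustrative, not code from this branch):

```dart
import 'dart:io';

import 'package:flutter/services.dart' show rootBundle;
import 'package:path_provider/path_provider.dart';

/// Copies a bundled asset (e.g. an ONNX model) to a real file and
/// returns its path, since native inference libraries can't read
/// straight from the Flutter asset bundle.
/// Illustrative helper, not from this PR.
Future<String> copyAssetToFile(String assetPath) async {
  final data = await rootBundle.load(assetPath);
  final dir = await getApplicationSupportDirectory();
  final file = File('${dir.path}/${assetPath.split('/').last}');
  await file.writeAsBytes(
    data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes),
    flush: true,
  );
  return file.path;
}
```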
Progress: will hopefully have the kinks ironed out in the next few days.
Will implement more of the required AI functionality in later commits.
Fire when ready. Just let us know if there's additional setup we need to do locally when you do.
Initial pass at an AI speech-to-text backend for Yappy.
This is still missing speaker diarization (which I am working on). Will swap from draft to an actual pull request once I'm done.
The current code is missing the actual AI models; I'm still trying to optimize for size/efficacy.
If you want to test it yourself, though, you'll need to grab both a streaming and a non-streaming ASR model from here.
Either grab the ones mentioned in the pubspec and drop them in assets as noted, or grab whichever you like and update the pubspec and the offline/online_model.dart files appropriately.
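To make the "update the pubspec and the model files" step concrete: once a model is declared under the pubspec's assets section and copied out to disk, it gets wired into a recognizer config. A rough sketch using the sherpa_onnx Dart bindings; the transducer layout, file names, and field names here are assumptions about what offline_model.dart does, not a copy of it:

```dart
import 'dart:typed_data';

import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

/// Builds a non-streaming recognizer from a transducer-style model.
/// `modelDir` and the file names inside it are placeholders: point them
/// at wherever you extracted the downloaded model.
sherpa_onnx.OfflineRecognizer buildOfflineRecognizer(String modelDir) {
  sherpa_onnx.initBindings(); // load the native library once
  final config = sherpa_onnx.OfflineRecognizerConfig(
    model: sherpa_onnx.OfflineModelConfig(
      transducer: sherpa_onnx.OfflineTransducerModelConfig(
        encoder: '$modelDir/encoder.onnx',
        decoder: '$modelDir/decoder.onnx',
        joiner: '$modelDir/joiner.onnx',
      ),
      tokens: '$modelDir/tokens.txt',
    ),
  );
  return sherpa_onnx.OfflineRecognizer(config);
}

/// Decodes one finished utterance of mono PCM samples.
String transcribe(
  sherpa_onnx.OfflineRecognizer recognizer,
  Float32List samples,
  int sampleRate,
) {
  final stream = recognizer.createStream();
  stream.acceptWaveform(samples: samples, sampleRate: sampleRate);
  recognizer.decode(stream);
  final text = recognizer.getResult(stream).text;
  stream.free();
  return text;
}
```

The streaming side is analogous: an OnlineRecognizer is fed audio incrementally and decoded whenever it reports pending frames, which is what makes the separate streaming model worth bundling.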
Once I get diarization working, you will also need a speaker recognition model and a speaker segmentation model.
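For reference, here is roughly where those two models would slot in once diarization lands: the segmentation model finds speaker turns, and the recognition (embedding) model tells the voices apart. This is a sketch against the sherpa-onnx speaker-diarization API as I understand it; every class name and model path below is an assumption, not code from this branch:

```dart
import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;

/// Sketch only: class/field names are assumptions based on the
/// sherpa-onnx Dart bindings, and both model paths are hypothetical.
sherpa_onnx.OfflineSpeakerDiarization buildDiarizer(String modelDir) {
  final config = sherpa_onnx.OfflineSpeakerDiarizationConfig(
    // Speaker segmentation model: finds where speech turns start/end.
    segmentation: sherpa_onnx.OfflineSpeakerSegmentationModelConfig(
      pyannote: sherpa_onnx.OfflineSpeakerSegmentationPyannoteModelConfig(
        model: '$modelDir/segmentation.onnx',
      ),
    ),
    // Speaker recognition (embedding) model: tells the voices apart.
    embedding: sherpa_onnx.SpeakerEmbeddingExtractorConfig(
      model: '$modelDir/speaker_embedding.onnx',
    ),
  );
  return sherpa_onnx.OfflineSpeakerDiarization(config);
}
```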