
[Team B] Sherpa AI backend #82

Merged
Stauntop-code merged 15 commits into developer from teamb-yappy-sherpa-ai-backend
Mar 5, 2025

FlyingWaffleDev
Contributor

Initial pass at an AI speech-to-text backend for Yappy.

This is still missing speaker diarization (which I am working on). Will swap from draft to an actual pull request once I'm done.

The current code is missing the actual AI models; I'm still trying to optimize for size/efficacy.
If you want to test it yourself, though, you'll need to grab both a streaming and non-streaming ASR model from here.

Either grab the ones mentioned in the pubspec and drop them in assets as noted, or grab whichever you like and update the pubspec and the offline/online_model.dart files appropriately.

Once I get diarization working, you will also need a speaker recognition model and a speaker segmentation model.

@sasfha sasfha self-assigned this Feb 26, 2025
@FlyingWaffleDev
Contributor Author

I've been struggling with offline diarization, so I just tried pushing the actual models for the speech-to-text. That push got rejected due to file size, so I'm going to go ahead and mark this as ready for review. If anyone wants to test the speech-to-text, you will have to manually download the following archives and extract them into the assets folder:

https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17-mobile.tar.bz2

https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
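For anyone who would rather script the manual step, here is a minimal sketch in Python. It is not part of this PR. It assumes the assets folder is a directory named `assets` relative to wherever the script is run, and the helper names `extract_model` and `fetch_models` are made up for illustration. The two URLs are the ones listed above.

```python
import tarfile
import urllib.request
from pathlib import Path

# Archive URLs copied from the comment above.
MODEL_URLS = [
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/"
    "sherpa-onnx-streaming-zipformer-en-20M-2023-02-17-mobile.tar.bz2",
    "https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/"
    "sherpa-onnx-whisper-tiny.en.tar.bz2",
]


def extract_model(archive: Path, assets_dir: Path) -> list[str]:
    """Extract a .tar.bz2 archive into assets_dir and return its member names."""
    assets_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(assets_dir)
    return names


def fetch_models(assets_dir: Path = Path("assets")) -> None:
    """Download each archive (if not already present), extract it, then delete it.

    The default "assets" location is an assumption; adjust it to match the
    path referenced in the pubspec.
    """
    assets_dir.mkdir(parents=True, exist_ok=True)
    for url in MODEL_URLS:
        archive = assets_dir / url.rsplit("/", 1)[-1]
        if not archive.exists():
            urllib.request.urlretrieve(url, archive)
        extract_model(archive, assets_dir)
        archive.unlink()  # keep only the extracted model directory
```

Calling `fetch_models()` from the project root would leave one extracted directory per model under `assets`. Whether those directory names match what the pubspec expects still needs to be checked by hand.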

@FlyingWaffleDev FlyingWaffleDev force-pushed the teamb-yappy-sherpa-ai-backend branch from da78dbd to 4dd3d63 Compare February 28, 2025 00:58
@FlyingWaffleDev FlyingWaffleDev marked this pull request as ready for review February 28, 2025 01:03
@FlyingWaffleDev FlyingWaffleDev marked this pull request as draft March 1, 2025 23:43
@FlyingWaffleDev FlyingWaffleDev force-pushed the teamb-yappy-sherpa-ai-backend branch from 4dd3d63 to 52b69aa Compare March 2, 2025 20:14
@FlyingWaffleDev FlyingWaffleDev self-assigned this Mar 3, 2025
@FlyingWaffleDev FlyingWaffleDev linked an issue Mar 3, 2025 that may be closed by this pull request
@FlyingWaffleDev FlyingWaffleDev force-pushed the teamb-yappy-sherpa-ai-backend branch from 205b600 to 24a6449 Compare March 5, 2025 01:06
@FlyingWaffleDev
Contributor Author

Progress:

  • AI functionality 'basically works' but is janky and flaky
  • AI models are now downloaded by the app and no longer need to be added to assets manually
  • The download feature is slightly buggy: it doesn't close properly when done, and it freezes while extracting files
  • Speech recognition needs cleanup; it does some things twice (like saving the WAV)

Will hopefully have the kinks ironed out in the next few days.

@FlyingWaffleDev FlyingWaffleDev force-pushed the teamb-yappy-sherpa-ai-backend branch from 24a6449 to feee6de Compare March 5, 2025 17:17
Contributor

@Z4sythe Z4sythe left a comment

Fire when ready. Just let us know if there's additional setup we need to do locally when you do.

@FlyingWaffleDev FlyingWaffleDev marked this pull request as ready for review March 5, 2025 23:04
@Stauntop-code Stauntop-code merged commit 0c5037b into developer Mar 5, 2025
1 check failed
@FlyingWaffleDev FlyingWaffleDev deleted the teamb-yappy-sherpa-ai-backend branch March 16, 2025 17:44
Development

Successfully merging this pull request may close these issues.

[Team B] Implement speaker identification & voice segmentation
4 participants