Open
Description
Problem Overview
As the title says, after VAD we get segments containing mixed-speaker speech, which are bad samples. I want to detect and remove them as quickly as possible from a very large speech dataset. How can I do that, and is there a tool for this?