LocalSpeech is a project that sets up and runs AI-powered speech models with a single command on macOS. All voice models are set up to use the OpenAI client SDK format. This guide provides step-by-step instructions to install all dependencies, set up the environment, and run both the backend and playground services.
Ensure you have the following installed before proceeding:
If you are using macOS and don't have Homebrew installed, run:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Miniconda (recommended for lightweight installation):
curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o miniconda.sh && bash miniconda.sh
Restart your terminal and initialize Conda:
conda init zsh
For Linux users, replace MacOSX-x86_64 with Linux-x86_64 in the download link.
You can install Node.js and npm via Homebrew (macOS) or a package manager:
brew install node
For Linux:
sudo apt install nodejs npm -y
brew install ffmpeg # macOS
sudo apt install ffmpeg -y # Linux
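Before moving on, it can help to confirm that the tools above are actually on your PATH. The snippet below is a minimal sketch of such a check and is not part of the LocalSpeech scripts; the function name check_prereqs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given tools are missing from PATH.
# Prints one "missing: <tool>" line per absent tool.
# Returns the number of missing tools (0 means everything was found).
check_prereqs() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Tools installed in the steps above (brew is macOS-only, so it is not checked here).
check_prereqs conda node npm ffmpeg || echo "Install the missing tools, then re-run this check."
```

If nothing is printed, all four tools were found.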
To set up and run a speech model backend, execute the following command:
chmod +x master-script.sh
./master-script.sh
This script:
- Prompts you to select a model.
- Clones the corresponding repository.
- Creates and activates a Conda environment.
- Installs dependencies.
- Starts the model API server in an OpenAI-compatible format.
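Once a text-to-speech backend is running, a request in the OpenAI style might look like the sketch below. It assumes the server exposes the OpenAI /v1/audio/speech route, which takes a JSON body with model, voice, and input fields. The base URL, port, model name, and voice name are placeholders and are not taken from LocalSpeech; check the output of master-script.sh for the real values. The function names speech_payload and synthesize are hypothetical.

```shell
#!/usr/bin/env bash
# ASSUMPTION: the host, port, and /v1 prefix are placeholders; use the address your backend prints.
BASE_URL="${BASE_URL:-http://localhost:8000/v1}"

# Build the JSON body for a text-to-speech request (OpenAI /audio/speech shape).
# $1 = model, $2 = voice, $3 = input text.
# No JSON escaping is done, so the text must not contain double quotes or backslashes.
speech_payload() {
  printf '{"model":"%s","voice":"%s","input":"%s"}' "$1" "$2" "$3"
}

# Send the request and save the returned audio.
# $1 = model, $2 = voice, $3 = input text, $4 = output file.
synthesize() {
  curl -fsS "$BASE_URL/audio/speech" \
    -H "Content-Type: application/json" \
    -d "$(speech_payload "$1" "$2" "$3")" \
    -o "$4"
}

# Example (needs a running backend; model and voice names are placeholders):
# synthesize your-model your-voice "Hello from LocalSpeech" hello.mp3
```

Speech recognition backends such as Whisper would use a different route and a file upload instead of a JSON body, so this sketch applies to the TTS models only.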
To set up and run the playground service, execute:
chmod +x run-frontend.sh
./run-frontend.sh
This script:
- Navigates to the frontend directory.
- Installs dependencies using npm install.
- Starts the development server using npm run dev.
- Whisper Speech Recognition
- Kokoro TTS
- Spark TTS
- Zonos TTS
- Sesame CSM
- If the backend folder exists but the Conda environment does not, the script will delete the folder and re-clone it.
- Make sure to activate the Conda environment before running the backend manually:
conda activate <env_name>
- If you encounter permission issues, make the scripts executable with chmod +x script_name.sh and then run them.
- To use the APIs in your own apps, refer to ./frontend/app/page.tsx for an example of how they are called.
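The first note above (re-cloning when the folder exists but the Conda environment does not) can be sketched as a small check. This is an illustration of that rule under stated assumptions, not the code in master-script.sh; the function name needs_reclone and the variables in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether a backend folder should be deleted and re-cloned.
# $1 = backend directory
# $2 = Conda environment name
# $3 = output of `conda env list` (passed in so the logic can be checked without Conda)
# Returns 0 when the folder exists but the environment does not, 1 otherwise.
needs_reclone() {
  [ -d "$1" ] || return 1   # no folder, so there is nothing to delete
  # `conda env list` prints one environment per line with the name in the first column.
  if printf '%s\n' "$3" | awk '{print $1}' | grep -qxF -- "$2"; then
    return 1                # environment exists, keep the folder
  fi
  return 0
}

# Example wiring (BACKEND_DIR, ENV_NAME, and REPO_URL are placeholder names):
# if needs_reclone "$BACKEND_DIR" "$ENV_NAME" "$(conda env list)"; then
#   rm -rf "$BACKEND_DIR" && git clone "$REPO_URL" "$BACKEND_DIR"
# fi
```

Because the folder is deleted in that case, keep any local changes to a backend checkout somewhere else.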
- Conda command not found: Restart your terminal and run conda init zsh again.
- ffmpeg missing: Ensure it's installed with brew install ffmpeg (macOS) or sudo apt install ffmpeg -y (Linux).
- Port conflicts: If the backend or frontend fails to start, ensure no other service is running on the same port.
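To check for a port conflict, you can test whether something is already accepting connections on a port. The sketch below relies on bash's built-in /dev/tcp redirection, so it needs bash rather than plain sh. The port numbers 3000 and 8000 are assumptions for illustration; use the ports your frontend and backend actually report. The function name port_in_use is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if something accepts TCP connections on 127.0.0.1:<port>.
# The connection is opened in a subshell, so the descriptor is closed again right away.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# ASSUMPTION: 3000 and 8000 are example ports, not values taken from LocalSpeech.
for port in 3000 8000; do
  if port_in_use "$port"; then
    echo "port $port is in use"
  else
    echo "port $port is free"
  fi
done
```

To see which process owns a busy port, lsof -i :PORT lists it on macOS and on most Linux systems.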
This README provides all necessary steps to get LocalSpeech up and running. If you encounter any issues, feel free to report them!