🌍 Python SDK for Palabra AI's real-time speech-to-speech translation API
🚀 Break down language barriers and enable seamless communication across 25+ languages
🎯 The Palabra AI Python SDK provides a high-level API for integrating real-time speech-to-speech translation into your Python applications.
✨ What can Palabra.ai do?
- ⚡ Real-time speech-to-speech translation with near-zero latency
- 🎙️ Auto voice cloning - speak any language in YOUR voice
- 🔄 Two-way simultaneous translation for live discussions
- 🚀 Developer API/SDK for building your own apps
- 🎯 Works everywhere - Zoom, streams, events, any platform
- 🔒 Zero data storage - your conversations stay private
🔧 This SDK focuses on making real-time translation simple and accessible:
- 🛡️ Uses WebRTC and WebSockets under the hood
- ⚡ Abstracts away the connection and streaming complexity
- 🎮 Simple configuration with source/target languages
- 🎤 Supports multiple input/output adapters (microphones, speakers, files, buffers)
📊 How it works:
- 🎤 Configure input/output adapters
- 🔄 SDK handles the entire pipeline
- 🎯 Automatic transcription, translation, and synthesis
- 🔊 Real-time audio stream ready for playback
💡 All with just a few lines of code!
```bash
pip install palabra-ai
```
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        EN, ES, DeviceManager)

palabra = PalabraAI()
dm = DeviceManager()
mic, speaker = dm.select_devices_interactive()
cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
palabra.run(cfg)
```
⚙️ Set your API credentials as environment variables:
```bash
export PALABRA_API_KEY=your_api_key
export PALABRA_API_SECRET=your_api_secret
```
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES)

palabra = PalabraAI()
reader = FileReader("./speech/es.mp3")
writer = FileWriter("./es2en_out.wav")
cfg = Config(SourceLang(ES, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
```
```python
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES, FR, DE)

palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[
        TargetLang(ES, FileWriter("spanish.wav")),
        TargetLang(FR, FileWriter("french.wav")),
        TargetLang(DE, FileWriter("german.wav")),
    ]
)
palabra.run(config)
```
📋 Receive transcriptions of both the source and the translated speech.
⚙️ Configure output to provide:
- 🔊 Audio only
- 📝 Transcriptions only
- 🎯 Both audio and transcriptions
```python
from palabra_ai import (
    PalabraAI,
    Config,
    SourceLang,
    TargetLang,
    FileReader,
    EN,
    ES,
)
from palabra_ai.base.message import TranscriptionMessage

async def print_translation_async(msg: TranscriptionMessage):
    print(repr(msg))

def print_translation(msg: TranscriptionMessage):
    print(str(msg))

palabra = PalabraAI()
cfg = Config(
    source=SourceLang(
        EN,
        FileReader("speech/en.mp3"),
        print_translation,  # callback for source-language transcriptions
    ),
    targets=[
        TargetLang(
            ES,
            # FileWriter("./test_output.wav"),  # optional: omit the audio writer for transcription-only output
            on_transcription=print_translation_async,  # callback for translated transcriptions
        )
    ],
    silent=True,  # disable verbose logging to the console
)
palabra.run(cfg)
```
1️⃣ Audio only (default):

```python
TargetLang(ES, FileWriter("output.wav"))
```

2️⃣ Transcription only:

```python
TargetLang(ES, on_transcription=your_callback_function)
```

3️⃣ Audio and transcription:

```python
TargetLang(ES, FileWriter("output.wav"), on_transcription=your_callback_function)
```
💡 The transcription callbacks receive TranscriptionMessage
objects containing the transcribed text and metadata.
🔄 Callbacks can be either synchronous or asynchronous functions.
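Since either kind of function is accepted, the SDK presumably distinguishes the two at call time. A minimal sketch of that pattern, using plain-Python stand-ins (`dispatch` and the dict "message" are illustrative, not SDK APIs):

```python
import asyncio

def make_message(text: str) -> dict:
    # Stand-in for TranscriptionMessage; the real class carries text plus metadata.
    return {"text": text}

async def dispatch(callback, msg):
    # Await coroutine callbacks, call plain functions directly.
    if asyncio.iscoroutinefunction(callback):
        await callback(msg)
    else:
        callback(msg)

received = []

def sync_cb(msg):
    received.append(("sync", msg["text"]))

async def async_cb(msg):
    received.append(("async", msg["text"]))

async def main():
    await dispatch(sync_cb, make_message("hello"))
    await dispatch(async_cb, make_message("hola"))

asyncio.run(main())
print(received)  # [('sync', 'hello'), ('async', 'hola')]
```

Either style works as an `on_transcription` callback; pick whichever fits your application's event loop.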
```python
import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN, RunAsPipe)

# Decode the MP3 to the raw PCM format the SDK consumes.
ffmpeg_cmd = [
    'ffmpeg',
    '-i', 'speech/ar.mp3',
    '-f', 's16le',           # 16-bit PCM
    '-acodec', 'pcm_s16le',
    '-ar', '48000',          # 48 kHz
    '-ac', '1',              # mono
    '-'                      # output to stdout
]

pipe_buffer = RunAsPipe(ffmpeg_cmd)
en_buffer = io.BytesIO()

palabra = PalabraAI()
reader = BufferReader(pipe_buffer)
writer = BufferWriter(en_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {en_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(en_buffer.getbuffer())
```
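For sizing buffers, the raw PCM format the FFmpeg command produces (16-bit, 48 kHz, mono) works out to 96,000 bytes per second. A quick sanity check (`pcm_size` is an illustrative helper, not part of the SDK):

```python
# Raw PCM parameters matching the FFmpeg command above.
SAMPLE_RATE = 48_000
BYTES_PER_SAMPLE = 2  # s16le = 16-bit samples
CHANNELS = 1

def pcm_size(seconds: float) -> int:
    """Expected byte size of a raw PCM buffer of the given duration."""
    return int(seconds * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)

# One second of silence in this format is 48,000 two-byte samples.
silence = b"\x00\x00" * SAMPLE_RATE

print(pcm_size(1.0))   # 96000
print(len(silence))    # 96000
print(pcm_size(2.5))   # 240000
```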
```python
import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN)
from palabra_ai.internal.audio import convert_any_to_pcm16

ar_buffer, en_buffer = io.BytesIO(), io.BytesIO()

# Convert the source file to 16-bit PCM before feeding it to the SDK.
with open("speech/ar.mp3", "rb") as f:
    ar_buffer.write(convert_any_to_pcm16(f.read()))

palabra = PalabraAI()
reader = BufferReader(ar_buffer)
writer = BufferWriter(en_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {en_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(en_buffer.getbuffer())
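If the output buffer holds raw PCM rather than a complete WAV container (an assumption; check what `BufferWriter` actually emits in your version), downstream tools may need a proper header. The stdlib `wave` module can add one (`pcm_to_wav` is an illustrative helper):

```python
import io
import wave

def pcm_to_wav(pcm: bytes, rate: int = 48_000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM bytes in a WAV container."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return out.getvalue()

raw = b"\x00\x00" * 48_000  # one second of silence
wav_bytes = pcm_to_wav(raw)
print(wav_bytes[:4])   # b'RIFF'
print(len(wav_bytes))  # 96044 = 96000 PCM bytes + 44-byte header
```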
```python
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES

dm = DeviceManager()
reader, writer = dm.get_default_readers_writers()

if reader and writer:
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, reader),
        targets=[TargetLang(ES, writer)]
    )
    palabra.run(config)
```
```python
import asyncio
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES

async def translate():
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, FileReader("input.mp3")),
        targets=[TargetLang(ES, FileWriter("output.wav"))]
    )
    await palabra.run(config)

asyncio.run(translate())
```
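Because `run` is awaitable in an async context, several independent translation runs could in principle execute concurrently via `asyncio.gather`. Here is the concurrency pattern with stand-in coroutines (`run_job` is a placeholder for `await palabra.run(config)`, not an SDK call):

```python
import asyncio

async def run_job(name: str) -> str:
    # Stand-in for `await palabra.run(config)`; just yields control here.
    await asyncio.sleep(0)
    return f"{name}: done"

async def main() -> list[str]:
    # Independent translation runs awaited concurrently.
    return await asyncio.gather(
        run_job("en-to-es"),
        run_job("en-to-fr"),
    )

results = asyncio.run(main())
print(results)  # ['en-to-es: done', 'en-to-fr: done']
```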
🎯 The Palabra AI SDK provides flexible I/O adapters that can be freely combined:
- 📁 FileReader/FileWriter: Read from and write to audio files
- 🎤 DeviceReader/DeviceWriter: Use microphones and speakers
- 💾 BufferReader/BufferWriter: Work with in-memory buffers
- 🔧 RunAsPipe: Runs a command and exposes its stdout as a pipe (e.g., FFmpeg)
🔄 Combine any input adapter with any output adapter:
```python
# Microphone in, file out
config = Config(
    source=SourceLang(EN, mic),
    targets=[TargetLang(ES, FileWriter("recording_es.wav"))]
)

# File in, speaker out
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[TargetLang(ES, speaker)]
)

# One source, multiple targets and outputs
config = Config(
    source=SourceLang(EN, mic),
    targets=[
        TargetLang(ES, speaker),                    # play Spanish through the speaker
        TargetLang(ES, FileWriter("spanish.wav")),  # save Spanish to a file
        TargetLang(FR, FileWriter("french.wav")),   # save French to a file
    ]
)

# In-memory buffers in and out
input_buffer = io.BytesIO(audio_data)
output_buffer = io.BytesIO()
config = Config(
    source=SourceLang(EN, BufferReader(input_buffer)),
    targets=[TargetLang(ES, BufferWriter(output_buffer))]
)

# Piped command output in (e.g., FFmpeg decoding to stdout), speaker out
pipe = RunAsPipe(ffmpeg_cmd)
config = Config(
    source=SourceLang(EN, BufferReader(pipe)),
    targets=[TargetLang(ES, speaker)]
)
```
🎯 Translate audio streams in real-time with minimal latency
💬 Perfect for live conversations, conferences, and meetings
🎭 Preserve the original speaker's voice characteristics in translations
⚙️ Enable voice cloning in the configuration
🎤 Easy device selection with interactive prompts or programmatic access:
```python
dm = DeviceManager()

# Interactive selection
mic, speaker = dm.select_devices_interactive()

# Get devices by name
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")

# List all devices
input_devices = dm.get_input_devices()
output_devices = dm.get_output_devices()
```
Source languages (speech recognition): 🇸🇦 Arabic (AR), 🇨🇳 Chinese (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English (EN), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇷🇺 Russian (RU), 🇪🇸 Spanish (ES), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK)

Target languages (translation): 🇸🇦 Arabic (AR), 🇧🇬 Bulgarian (BG), 🇨🇳 Chinese Mandarin (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English UK (EN_GB), 🇺🇸 English US (EN_US), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇩 Indonesian (ID), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇧🇷 Portuguese Brazilian (PT_BR), 🇷🇴 Romanian (RO), 🇷🇺 Russian (RU), 🇸🇰 Slovak (SK), 🇪🇸 Spanish (ES), 🇲🇽 Spanish Mexican (ES_MX), 🇸🇪 Swedish (SV), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK), 🇻🇳 Vietnamese (VN)
```python
from palabra_ai import (
    # English variants - 1.5+ billion speakers (including L2)
    EN, EN_AU, EN_CA, EN_GB, EN_US,
    # Chinese - 1.3+ billion speakers
    ZH,
    # Hindi - 600+ million speakers
    HI,
    # Spanish variants - 500+ million speakers
    ES, ES_MX,
    # Arabic variants - 400+ million speakers
    AR, AR_AE, AR_SA,
    # French variants - 280+ million speakers
    FR, FR_CA,
    # Portuguese variants - 260+ million speakers
    PT, PT_BR,
    # Russian - 260+ million speakers
    RU,
    # Japanese & Korean - 200+ million speakers combined
    JA, KO,
    # Southeast Asian languages - 400+ million speakers
    ID, VN, TA, MS, FIL,
    # Germanic languages - 150+ million speakers
    DE, NL, SV, NO, DA,
    # Other European languages - 300+ million speakers
    TR, IT, PL, UK, RO, EL, HU, CS, BG, SK, FI, HR,
    # Other languages - 40+ million speakers
    AZ, HE,
)
```
- ✅ Core SDK functionality
- ✅ GitHub Actions CI/CD
- ✅ Docker packaging
- ✅ Python 3.11, 3.12, 3.13 support
- ⏳ PyPI publication (coming soon)
- ⏳ Documentation site (coming soon)
- ⏳ Code coverage reporting (setup required)
- ⏳ TODO: global timeout support for long-running tasks
- ⏳ TODO: support for multiple source languages in a single run
- ⏳ TODO: fine-grained cancellation in cancel_all_tasks()
- ⏳ TODO: error handling improvements
- 🧪 Tests: Running on Python 3.11, 3.12, 3.13
- 📦 Release: Automated releases with Docker images
- 📊 Coverage: Tests implemented, reporting setup needed
- 🐍 Python 3.11+
- 🔑 Palabra AI API credentials (get them at palabra.ai)
- 📚 Documentation: https://docs.palabra.ai
- 🐛 Issues: GitHub Issues
- 📧 Email: info@palabra.ai
This project is licensed under the MIT License - see the LICENSE file for details.
© Palabra.ai, 2025 | 🌍 Breaking down language barriers with AI 🚀