Palabra AI Python SDK

🌍 Python SDK for Palabra AI's real-time speech-to-speech translation API
🚀 Break down language barriers and enable seamless communication across 25+ languages

Overview 📋

🎯 The Palabra AI Python SDK provides a high-level API for integrating real-time speech-to-speech translation into your Python applications.

✨ What can Palabra.ai do?

⚡ Real-time speech-to-speech translation with near-zero latency
🎙️ Auto voice cloning - speak any language in YOUR voice
🔄 Two-way simultaneous translation for live discussions
🚀 Developer API/SDK for building your own apps
🎯 Works everywhere - Zoom, streams, events, any platform
🔒 Zero data storage - your conversations stay private

🔧 This SDK focuses on making real-time translation simple and accessible:

🛡️ Uses WebRTC and WebSockets under the hood
⚡ Abstracts away all complexity
🎮 Simple configuration with source/target languages
🎤 Supports multiple input/output adapters (microphones, speakers, files, buffers)

📊 How it works:

🎤 Configure input/output adapters
🔄 SDK handles the entire pipeline
🎯 Automatic transcription, translation, and synthesis
🔊 Real-time audio stream ready for playback

💡 All with just a few lines of code!

Installation 📦

From PyPI 📦

pip install palabra-ai

Quick Start 🚀

Real-time microphone translation 🎤

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        EN, ES, DeviceManager)

palabra = PalabraAI()
dm = DeviceManager()
mic, speaker = dm.select_devices_interactive()
cfg = Config(SourceLang(EN, mic), [TargetLang(ES, speaker)])
palabra.run(cfg)

⚙️ Set your API credentials as environment variables:

export PALABRA_API_KEY=your_api_key
export PALABRA_API_SECRET=your_api_secret

Examples 💡

File-to-file translation 📁

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES)

palabra = PalabraAI()
reader = FileReader("./speech/es.mp3")
writer = FileWriter("./es2en_out.wav")
cfg = Config(SourceLang(ES, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

Multiple target languages 🌐

from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        FileReader, FileWriter, EN, ES, FR, DE)

palabra = PalabraAI()
config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[
        TargetLang(ES, FileWriter("spanish.wav")),
        TargetLang(FR, FileWriter("french.wav")),
        TargetLang(DE, FileWriter("german.wav"))
    ]
)
palabra.run(config)

Customizable output 📝

📋 Add a transcription of the source and translated speech.
⚙️ Configure output to provide:

🔊 Audio only
📝 Transcriptions only
🎯 Both audio and transcriptions

from palabra_ai import (
    PalabraAI,
    Config,
    SourceLang,
    TargetLang,
    FileReader,
    EN,
    ES,
)
from palabra_ai.base.message import TranscriptionMessage


async def print_translation_async(msg: TranscriptionMessage):
    print(repr(msg))

def print_translation(msg: TranscriptionMessage):
    print(str(msg))

palabra = PalabraAI()
cfg = Config(
    source=SourceLang(
        EN,
        FileReader("speech/en.mp3"),
        print_translation  # Callback for source transcriptions
    ),
    targets=[
        TargetLang(
            ES,
            # You can use only transcription without audio writer if you want
            # FileWriter("./test_output.wav"),  # Optional: audio output
            on_transcription=print_translation_async  # Callback for translated transcriptions
        )
    ],
    silent=True,  # Set to True to disable verbose logging to console
)
palabra.run(cfg)

Transcription output options: 📊

1️⃣ Audio only (default):

TargetLang(ES, FileWriter("output.wav"))

2️⃣ Transcription only:

TargetLang(ES, on_transcription=your_callback_function)

3️⃣ Audio and transcription:

TargetLang(ES, FileWriter("output.wav"), on_transcription=your_callback_function)

💡 The transcription callbacks receive TranscriptionMessage objects containing the transcribed text and metadata.
🔄 Callbacks can be either synchronous or asynchronous functions.

Integrate with FFmpeg (streaming) 🎬

import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN, RunAsPipe)

ffmpeg_cmd = [
    'ffmpeg',
    '-i', 'speech/ar.mp3',
    '-f', 's16le',      # 16-bit PCM
    '-acodec', 'pcm_s16le',
    '-ar', '48000',     # 48kHz
    '-ac', '1',         # mono
    '-'                 # output to stdout
]

pipe_buffer = RunAsPipe(ffmpeg_cmd)
es_buffer = io.BytesIO()

palabra = PalabraAI()
reader = BufferReader(pipe_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)

print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())

Using buffers 💾

import io
from palabra_ai import (PalabraAI, Config, SourceLang, TargetLang,
                        BufferReader, BufferWriter, AR, EN)
from palabra_ai.internal.audio import convert_any_to_pcm16

en_buffer, es_buffer = io.BytesIO(), io.BytesIO()
with open("speech/ar.mp3", "rb") as f:
    en_buffer.write(convert_any_to_pcm16(f.read()))
palabra = PalabraAI()
reader = BufferReader(en_buffer)
writer = BufferWriter(es_buffer)
cfg = Config(SourceLang(AR, reader), [TargetLang(EN, writer)])
palabra.run(cfg)
print(f"Translated audio written to buffer with size: {es_buffer.getbuffer().nbytes} bytes")
with open("./ar2en_out.wav", "wb") as f:
    f.write(es_buffer.getbuffer())

Using default audio devices 🔊

from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, DeviceManager, EN, ES

dm = DeviceManager()
reader, writer = dm.get_default_readers_writers()

if reader and writer:
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, reader),
        targets=[TargetLang(ES, writer)]
    )
    palabra.run(config)

Async API ⚡

import asyncio
from palabra_ai import PalabraAI, Config, SourceLang, TargetLang, FileReader, FileWriter, EN, ES

async def translate():
    palabra = PalabraAI()
    config = Config(
        source=SourceLang(EN, FileReader("input.mp3")),
        targets=[TargetLang(ES, FileWriter("output.wav"))]
    )
    await palabra.run(config)

asyncio.run(translate())

I/O Adapters & Mixing 🔌

Available adapters 🛠️

🎯 The Palabra AI SDK provides flexible I/O adapters that can combined to:

📁 FileReader/FileWriter: Read from and write to audio files
🎤 DeviceReader/DeviceWriter: Use microphones and speakers
💾 BufferReader/BufferWriter: Work with in-memory buffers
🔧 RunAsPipe: Run command and represent as pipe (e.g., FFmpeg stdout)

Mixing examples 🎨

🔄 Combine any input adapter with any output adapter:

🎤➡️📁 Microphone to file - record translations

config = Config(
    source=SourceLang(EN, mic),
    targets=[TargetLang(ES, FileWriter("recording_es.wav"))]
)

📁➡️🔊 File to speaker - play translations

config = Config(
    source=SourceLang(EN, FileReader("presentation.mp3")),
    targets=[TargetLang(ES, speaker)]
)

🎤➡️🔊📁 Microphone to multiple outputs

config = Config(
    source=SourceLang(EN, mic),
    targets=[
        TargetLang(ES, speaker),  # Play Spanish through speaker
        TargetLang(ES, FileWriter("spanish.wav")),  # Save Spanish to file
        TargetLang(FR, FileWriter("french.wav"))    # Save French to file
    ]
)

💾➡️💾 Buffer to buffer - for integration

input_buffer = io.BytesIO(audio_data)
output_buffer = io.BytesIO()

config = Config(
    source=SourceLang(EN, BufferReader(input_buffer)),
    targets=[TargetLang(ES, BufferWriter(output_buffer))]
)

🔧➡️🔊 FFmpeg pipe to speaker

pipe = RunAsPipe(ffmpeg_process.stdout)
config = Config(
    source=SourceLang(EN, BufferReader(pipe)),
    targets=[TargetLang(ES, speaker)]
)

Features ✨

Real-time translation ⚡

🎯 Translate audio streams in real-time with minimal latency
💬 Perfect for live conversations, conferences, and meetings

Voice cloning 🗣️

🎭 Preserve the original speaker's voice characteristics in translations
⚙️ Enable voice cloning in the configuration

Device management 🎮

🎤 Easy device selection with interactive prompts or programmatic access:

dm = DeviceManager()

# Interactive selection
mic, speaker = dm.select_devices_interactive()

# Get devices by name
mic = dm.get_mic_by_name("Blue Yeti")
speaker = dm.get_speaker_by_name("MacBook Pro Speakers")

# List all devices
input_devices = dm.get_input_devices()
output_devices = dm.get_output_devices()

Supported languages 🌍

Speech recognition languages 🎤

🇸🇦 Arabic (AR), 🇨🇳 Chinese (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English (EN), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇷🇺 Russian (RU), 🇪🇸 Spanish (ES), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK)

Translation languages 🔄

🇸🇦 Arabic (AR), 🇧🇬 Bulgarian (BG), 🇨🇳 Chinese Mandarin (ZH), 🇨🇿 Czech (CS), 🇩🇰 Danish (DA), 🇳🇱 Dutch (NL), 🇬🇧 English UK (EN_GB), 🇺🇸 English US (EN_US), 🇫🇮 Finnish (FI), 🇫🇷 French (FR), 🇩🇪 German (DE), 🇬🇷 Greek (EL), 🇮🇱 Hebrew (HE), 🇭🇺 Hungarian (HU), 🇮🇩 Indonesian (ID), 🇮🇹 Italian (IT), 🇯🇵 Japanese (JA), 🇰🇷 Korean (KO), 🇵🇱 Polish (PL), 🇵🇹 Portuguese (PT), 🇧🇷 Portuguese Brazilian (PT_BR), 🇷🇴 Romanian (RO), 🇷🇺 Russian (RU), 🇸🇰 Slovak (SK), 🇪🇸 Spanish (ES), 🇲🇽 Spanish Mexican (ES_MX), 🇸🇪 Swedish (SV), 🇹🇷 Turkish (TR), 🇺🇦 Ukrainian (UK), 🇻🇳 Vietnamese (VN)

Available language constants 📚

from palabra_ai import (
    # English variants - 1.5+ billion speakers (including L2)
    EN, EN_AU, EN_CA, EN_GB, EN_US,

    # Chinese - 1.3+ billion speakers
    ZH,

    # Hindi - 600+ million speakers
    HI,

    # Spanish variants - 500+ million speakers
    ES, ES_MX,

    # Arabic variants - 400+ million speakers
    AR, AR_AE, AR_SA,

    # French variants - 280+ million speakers
    FR, FR_CA,

    # Portuguese variants - 260+ million speakers
    PT, PT_BR,

    # Russian - 260+ million speakers
    RU,

    # Japanese & Korean - 200+ million speakers combined
    JA, KO,

    # Southeast Asian languages - 400+ million speakers
    ID, VN, TA, MS, FIL,

    # Germanic languages - 150+ million speakers
    DE, NL, SV, NO, DA,

    # Other European languages - 300+ million speakers
    TR, IT, PL, UK, RO, EL, HU, CS, BG, SK, FI, HR,

    # Other languages - 40+ million speakers
    AZ, HE
)

Development status 🛠️

Current status ✅

✅ Core SDK functionality
✅ GitHub Actions CI/CD
✅ Docker packaging
✅ Python 3.11, 3.12, 3.13 support
✅ PyPI publication (coming soon)
✅ Documentation site (coming soon)
⏳ Code coverage reporting (setup required)

Current dev roadmap 🗺️

⏳ TODO: global timeout support for long-running tasks
⏳ TODO: support for multiple source languages in a single run
⏳ TODO: fine cancelling on cancel_all_tasks()
⏳ TODO: error handling improvements

Build status 🏗️

🧪 Tests: Running on Python 3.11, 3.12, 3.13
📦 Release: Automated releases with Docker images
📊 Coverage: Tests implemented, reporting setup needed

Requirements 📋

🐍 Python 3.11+
🔑 Palabra AI API credentials (get them at palabra.ai)

Support 🤝

📚 Documentation: https://docs.palabra.ai
🐛 Issues: GitHub Issues
📧 Email: info@palabra.ai

License 📄

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 98 Commits
.github		.github
docs		docs
examples		examples
src		src
tests		tests
.codecov.yml		.codecov.yml
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
Makefile		Makefile
README.md		README.md
RELEASE.md		RELEASE.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

License

PalabraAI/palabra-ai-python

Folders and files

Latest commit

History

Repository files navigation