A professional AI-powered chatbot that provides intelligent responses using YouTube video content as a knowledge base, with advanced Text-to-Speech capabilities. Built with modern RAG (Retrieval-Augmented Generation) architecture and deployed on Streamlit Cloud.
Status: Currently deployed and running on Streamlit Cloud
Note: This is a private deployment. To run your instance, follow the setup instructions below.
- Intelligent Content Retrieval: Semantic search through YouTube video transcripts using BGE-M3 embeddings
- Multi-source Intelligence: Combines YouTube knowledge base with real-time web search fallback
- Confidence-based Responses: LLM evaluates response quality and automatically falls back to web search when needed
- Source Attribution: Every response includes video source, confidence score, and clickable YouTube links
- Professional Voice Synthesis: ElevenLabs API integration with multilingual support
- One-click Audio: Generate speech for any response with a single button
- Optimized Performance: Efficient audio streaming and caching
- Turkish & English: Automatic language detection and appropriate response generation
- Language-aware Processing: Matches query language with content language for optimal results
- Example Questions: Quick access to common leadership and business queries
- Conversation History: Export chat sessions for future reference
- Professional UI: Clean, modern interface with dark theme
- Real-time Processing: Fast response generation with progress indicators
```
YouTube Playlist → Audio Download → Transcription → Vector Store →   RAG   →   TTS   →  Web Interface
        ↓                 ↓               ↓               ↓           ↓         ↓             ↓
    pytubefix      OpenAI Whisper      BGE-M3          Qdrant     Gemini AI  ElevenLabs   Streamlit
                                                                      ↓
                                                            Web Search Fallback
                                                                      ↓
                                                              DuckDuckGo API
```
1. Query Processing: Language detection and intent analysis
2. Vector Search: Semantic similarity search in video transcripts
3. Content Retrieval: Extract best matching video content
4. Response Generation: Create contextual answer using Gemini AI
5. Quality Evaluation: LLM confidence scoring (0.0-1.0)
6. Fallback Logic: Web search if confidence < 0.5 (see the sketch below)
7. Source Attribution: Add video links and confidence scores
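A minimal sketch of this flow in Python (the `Chunk` type and the callables passed in are illustrative placeholders, not the project's actual interfaces; the real implementation lives in `src/services/rag_service.py`):

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical, simplified result type -- the real models are in src/core/models.py.
@dataclass
class Chunk:
    video_title: str
    video_url: str
    text: str

CONFIDENCE_THRESHOLD = 0.5  # matches the "confidence < 0.5" fallback rule above

def answer_query(
    query: str,
    search: Callable[[str, int], List[Chunk]],    # vector search (steps 1-3)
    generate: Callable[[str, str], str],          # LLM answer generation (step 4)
    score: Callable[[str, str], float],           # LLM self-evaluation (step 5)
    web_search: Callable[[str], str],             # DuckDuckGo fallback (step 6)
) -> dict:
    chunks = search(query, 3)
    context = "\n".join(c.text for c in chunks)
    answer = generate(query, context)
    confidence = score(query, answer)
    if confidence < CONFIDENCE_THRESHOLD:         # step 6: fall back to web search
        answer = generate(query, web_search(query))
    return {                                      # step 7: source attribution
        "answer": answer,
        "confidence": confidence,
        "sources": [c.video_url for c in chunks],
    }
```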
```
youtube-rag-assistant/
├── app.py                             # Main Streamlit application
├── requirements.txt                   # Python dependencies
├── .env.template                      # Environment variables template
├── Dockerfile                         # Container configuration
├── config/
│   ├── settings.yaml                  # Application configuration
│   └── prompts.yaml                   # LLM prompt templates
├── src/
│   ├── core/
│   │   ├── config.py                  # Configuration management
│   │   └── models.py                  # Data models and types
│   └── services/
│       ├── youtube_service.py         # YouTube video downloading
│       ├── transcription_service.py   # Audio transcription
│       ├── vector_service.py          # Vector search and embeddings
│       ├── rag_service.py             # RAG implementation
│       ├── tts_service.py             # Text-to-Speech service
│       └── web_search_service.py      # Web search fallback
├── data/                              # Data storage (gitignored)
│   ├── audio/                         # Downloaded audio files
│   ├── transcripts/                   # Individual transcript files
│   ├── vector_db/                     # Qdrant vector database
│   └── transcripts.json               # Video metadata
└── images/                            # Screenshots and assets
```
- Python 3.11+: Main programming language
- Streamlit: Interactive web application framework
- Google Gemini AI: Advanced language model for response generation
- LangChain: RAG framework and document processing
- Qdrant: High-performance vector database
- HuggingFace Transformers: BGE-M3 multilingual embeddings
- OpenAI Whisper: Audio transcription with multilingual support
- ElevenLabs API: Professional text-to-speech synthesis
- BGE-M3: State-of-the-art multilingual embedding model
- Sentence Transformers: Text similarity and semantic search
- pytubefix: YouTube video downloading and metadata extraction
- PyYAML: Configuration management
- Pydantic: Data validation and type safety
- Streamlit Cloud: Production deployment
- Docker: Containerization support
- GitHub Actions: CI/CD pipeline ready
```bash
# Clone the repository
git clone https://github.com/ezgisubasi/youtube-rag-assistant.git
cd youtube-rag-assistant

# Install dependencies
pip install -r requirements.txt

# Copy environment template
cp .env.template .env

# Add your API keys
export GEMINI_API_KEY="your_gemini_api_key_here"
export HF_TOKEN="your_huggingface_api_key_here"
export ELEVENLABS_API_KEY="your_elevenlabs_api_key_here"  # Optional, for TTS

# Launch the app
streamlit run app.py
```

The application will be available at http://localhost:8501.
```bash
# Required
GEMINI_API_KEY=your_gemini_api_key_here
HF_TOKEN=your_huggingface_api_key_here

# Optional (for TTS features)
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here

# Optional (override default playlist)
YOUTUBE_PLAYLIST_URL=https://youtube.com/playlist?list=your_playlist_id
```
```yaml
# AI Model Configuration
model_name: gemini-2.0-flash-exp
embedding_model: altaidevorg/bge-m3-distill-8l

# Vector Database
vector_db_path: data/vector_db
collection_name: youtube_transcripts
retrieval_k: 3
similarity_threshold: 0.7

# Processing Settings
whisper_model: medium
language: tr

# File Paths
data_dir: data
transcripts_json: data/transcripts.json
```
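For illustration, a settings file like this can be loaded and validated with PyYAML and Pydantic. The field names below mirror the YAML above; the class and loader names are assumptions rather than the actual `src/core/config.py` implementation:

```python
from pathlib import Path

import yaml
from pydantic import BaseModel

# Hypothetical settings model mirroring config/settings.yaml; the project's
# real configuration management lives in src/core/config.py.
class Settings(BaseModel):
    model_name: str
    embedding_model: str
    vector_db_path: str
    collection_name: str
    retrieval_k: int = 3
    similarity_threshold: float = 0.7
    whisper_model: str = "medium"
    language: str = "tr"
    data_dir: str = "data"
    transcripts_json: str = "data/transcripts.json"

def load_settings(path: str = "config/settings.yaml") -> Settings:
    """Parse the YAML file and validate field types with Pydantic."""
    raw = yaml.safe_load(Path(path).read_text(encoding="utf-8"))
    return Settings(**raw)
```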
If you want to process your own YouTube content, run the pipeline scripts in order:

```bash
# 1. Download playlist audio
python src/services/youtube_service.py

# 2. Transcribe audio
python src/services/transcription_service.py

# 3. Build the vector store
python src/services/vector_service.py
```
```python
from src.services.vector_service import VectorService

service = VectorService()
service.initialize_vector_store()

results = service.search("leadership strategies")
for result in results:
    print(f"{result.video_title}: {result.similarity_score:.3f}")
```
```python
from src.services.rag_service import RAGService

# Initialize RAG service
rag_service = RAGService()

# Test queries in both languages (Turkish and English)
test_queries = [
    "Nasıl etkili lider olunur?",                 # "How do you become an effective leader?"
    "How to become a successful entrepreneur?",
    "Takım motivasyonu stratejileri nelerdir?",   # "What are team motivation strategies?"
    "What are the best business strategies?",
]

# Test each query
for query in test_queries:
    print(f"\nQuery: {query}")
    response = rag_service.generate_response(query)
    print(f"Answer: {response.answer}")
    print(f"Confidence: {response.confidence_score:.3f}")
    print(f"Sources: {len(response.sources)}")

    # Show source details
    for source in response.sources:
        print(f"  - {source.video_title}: {source.similarity_score:.3f}")
        print(f"    URL: {source.video_url}")
```
## Advanced Features
### Confidence-Based Response System
The system evaluates each response using LLM confidence scoring:
- **High Confidence (≥0.5)**: Uses YouTube knowledge base response
- **Low Confidence (<0.5)**: Automatically falls back to web search
- **Transparent Scoring**: Shows confidence level for each response
### Multi-Source Intelligence
```
# RAG Flow Example
1. Search YouTube transcripts → Generate response
2. LLM evaluates response quality → Confidence score
3. If low confidence → Web search fallback
4. Return best response with source attribution
```
- Voice Quality: ElevenLabs multilingual voice synthesis (see the call sketch after this list)
- Performance: Optimized audio streaming
- Accessibility: One-click speech generation
- Fallback: Graceful degradation when TTS unavailable
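For reference, a minimal sketch of ElevenLabs speech synthesis over its public REST endpoint. The voice ID is a placeholder and the model choice is an assumption; the project's actual integration lives in `src/services/tts_service.py`:

```python
import os

import requests

# Placeholder voice ID -- substitute a real one from your ElevenLabs account.
VOICE_ID = "your_voice_id_here"

def synthesize(text: str) -> bytes:
    """Return MP3 audio bytes for the given text via the ElevenLabs REST API."""
    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=30,
    )
    response.raise_for_status()
    return response.content
```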
- Push code to GitHub repository
- Connect to Streamlit Cloud
- Set secrets in the Streamlit dashboard (see the sketch below for reading them at runtime):

```toml
GEMINI_API_KEY = "your_api_key"
HF_TOKEN = "your_huggingface_api_key"
ELEVENLABS_API_KEY = "your_elevenlabs_key"
```

- Deploy automatically
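At runtime these values can be read with Streamlit's secrets API, falling back to environment variables for local development. This is a sketch under those assumptions, not necessarily how `app.py` does it:

```python
import os

import streamlit as st

def get_secret(name: str, default: str | None = None) -> str | None:
    """Prefer Streamlit Cloud secrets; fall back to environment variables locally."""
    try:
        return st.secrets[name]
    except (KeyError, FileNotFoundError):
        return os.environ.get(name, default)

gemini_key = get_secret("GEMINI_API_KEY")
```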
```bash
# Build image
docker build -t youtube-rag-assistant .

# Run container
docker run -p 8501:8501 \
  -e GEMINI_API_KEY="your_key" \
  -e HF_TOKEN="your_huggingface_api_key" \
  -e ELEVENLABS_API_KEY="your_elevenlabs_key" \
  youtube-rag-assistant
```
```bash
# Development mode with auto-reload
streamlit run app.py --server.runOnSave true
```
- YouTube Knowledge: High-quality responses from curated content
- Web Fallback: Real-time information when knowledge base is insufficient
- Confidence Scoring: Transparent quality metrics for each response
- Vector Search: ~200ms for semantic similarity search
- Response Generation: ~1-3s with Gemini AI
- TTS Generation: ~2-5s for speech synthesis
- Total Response Time: ~3-8s end-to-end
- Source Attribution: Every response linked to original video
- Language Matching: Queries matched with appropriate language content
- Context Relevance: Advanced embedding models for semantic understanding
- Environment variable-based configuration
- No hardcoded secrets in repository
- Streamlit secrets integration for cloud deployment
- Local vector database storage
- No user query logging
- Temporary audio processing only
- Vector Search Failure: Graceful degradation to web search
- API Timeouts: Retry logic with exponential backoff (see the sketch after this list)
- TTS Unavailable: Silent fallback without breaking functionality
- Configuration Errors: Clear error messages and recovery suggestions
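For illustration, a generic exponential-backoff wrapper along these lines; the function name and defaults are assumptions, not the project's actual error-handling code:

```python
import random
import time

def call_with_backoff(func, *args, max_retries: int = 4, base_delay: float = 1.0, **kwargs):
    """Call func, retrying on exceptions with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the original error
            # Delays of 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```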
- Vector database connectivity
- API key validation
- Model availability status
- Response quality metrics
- Response time monitoring
- Confidence score distribution
- Source attribution statistics
- TTS usage patterns
```bash
# Clone repository
git clone https://github.com/ezgisubasi/youtube-rag-assistant.git
cd youtube-rag-assistant

# Create virtual environment
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows

# Install development dependencies
pip install -r requirements.txt
```
- Type Hints: All functions include type annotations
- Documentation: Comprehensive docstrings for all modules
- Error Handling: Graceful error handling with user-friendly messages
- Testing: Unit tests for all core functionality
MIT License - see LICENSE file for details.
- OpenAI Whisper: High-quality multilingual transcription
- Google Gemini: Advanced language model capabilities
- ElevenLabs: Professional text-to-speech synthesis
- Streamlit: Rapid web application development
- LangChain: Comprehensive RAG framework
- Qdrant: High-performance vector database
For issues, questions, or contributions:
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Discussions
- Contact: ezgisubasi1998@gmail.com
Built with modern AI technologies and best practices.