Architecture
The TTS Docker deployment consists of four main services that work together: API Server, Lightning TTS, License Proxy, and Redis.
API Server
The API Server is the main entry point for all client requests.
Purpose
- Routes incoming API requests to Lightning TTS workers
- Manages WebSocket connections for streaming
- Handles request queuing and load balancing
- Provides a unified API for clients
Container Details
- Image: quay.io/smallestinc/self-hosted-api-server:latest
- Port: 7100 - Main API endpoint
- CPU: 0.5-2 cores
- Memory: 512 MB - 2 GB
- No GPU required
Key Endpoints
Environment Variables
Logs
Key log messages:
Dependencies
- Requires Lightning TTS to be running
- Requires License Proxy for validation
- Optionally uses Redis for request coordination
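The dependency list above can be turned into a simple pre-flight check. The following is a minimal sketch, not part of the product: the ports are the ones listed on this page, while the hostnames (`lightning-tts`, `license-proxy`, `redis`) are assumed Compose service names that you would replace with your own. It only checks that each TCP port accepts a connection, not that the service behind it is healthy.

```python
import socket

# Ports come from this page; the hostnames are assumed Compose service names.
# Each entry is (host, port, required).
DEPENDENCIES = {
    "lightning-tts": ("lightning-tts", 8876, True),
    "license-proxy": ("license-proxy", 3369, True),
    "redis": ("redis", 6379, False),  # optional for the API Server
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_required(deps=DEPENDENCIES, probe=port_open) -> list[str]:
    """Return the names of required dependencies that are not reachable."""
    return [
        name
        for name, (host, port, required) in deps.items()
        if required and not probe(host, port)
    ]
```

Calling `missing_required()` from a host on the same Docker network returns an empty list when both required services are reachable. The `probe` parameter exists so the check can be exercised without a network.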
Lightning TTS
The core text-to-speech engine powered by GPU acceleration.
Purpose
- Converts text to high-quality speech audio
- Processes both batch and streaming requests
- Manages GPU resources and model inference
- Handles voice synthesis and audio generation
Container Details
- Image: quay.io/smallestinc/lightning-tts:latest
- Port: 8876 - TTS service endpoint
- CPU: 4-8 cores
- Memory: 12-16 GB
- GPU: 1x NVIDIA GPU (16+ GB VRAM)
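One way to check the VRAM figure on a host is to query `nvidia-smi`. The sketch below assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` and `--format=csv,noheader,nounits` options. The default threshold of 15000 MiB is an assumption: cards sold as 16 GB can report somewhat less than 16384 MiB of total memory, so adjust `min_mib` to match your hardware policy.

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' rows (memory in MiB) as printed by nvidia-smi."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma in case the GPU name itself contains one.
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi and return (name, total memory in MiB) for each GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)


def has_suitable_gpu(gpus, min_mib: int = 15000) -> bool:
    """True if at least one GPU reports min_mib or more of total memory.

    The default threshold is an assumption, not a documented requirement.
    """
    return any(mem >= min_mib for _, mem in gpus)
```

On a GPU host, `has_suitable_gpu(query_gpus())` gives a quick yes/no before pulling the Lightning TTS image. The parsing is kept separate from the subprocess call so it can be tested without a GPU.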
GPU Requirements
Lightning TTS requires an NVIDIA GPU with CUDA support.
Environment Variables
Model Loading
On first startup, Lightning TTS:
- Loads TTS models from container (embedded)
- Validates model integrity
- Loads model into GPU memory
- Performs warmup inference
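Because these steps take time, a client or orchestration script usually has to wait before sending the first request. The following is a minimal polling sketch under stated assumptions: `probe` is any zero-argument callable you supply (for example, a TCP connect to port 8876), and the 120-second default timeout is an illustrative choice, not a documented value.

```python
import time


def wait_until(probe, timeout: float = 120.0, interval: float = 2.0,
               clock=time.monotonic, sleep=time.sleep) -> bool:
    """Call probe() until it returns True or timeout seconds have elapsed.

    Returns True as soon as probe() succeeds, False once the deadline passes.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A monotonic clock is used so that wall-clock adjustments on the host do not shorten or extend the wait.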
Logs
Key log messages:
Performance
Typical performance metrics:
Dependencies
- Requires License Proxy for validation
- Requires Redis for request coordination
- Requires NVIDIA GPU
License Proxy
Validates license keys and reports usage to Smallest servers.
Purpose
- Validates license keys on startup
- Reports usage metadata to Smallest
- Provides grace period for offline operation
- Acts as licensing gateway for all services
Container Details
- Image: quay.io/smallestinc/license-proxy:latest
- Port: 3369 - License validation endpoint (internal)
- CPU: 0.25-1 core
- Memory: 256-512 MB
- No GPU required
Environment Variables
Network Requirements
Validation Process
- On startup, validates license key with Smallest servers
- Receives license terms and quotas
- Caches validation (valid for grace period)
- Periodically reports usage metadata
Usage Reporting
License Proxy reports only metadata. No audio or transcript data is transmitted to Smallest servers.
Offline Mode
If the connection to the license server fails, License Proxy:
- Uses cached validation (24-hour grace period)
- Continues serving requests
- Logs warning messages
- Retries connection periodically
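The grace-period behavior above can be modeled in a few lines. This is a hypothetical sketch for illustration, not License Proxy's actual code: the `LicenseCache` class and its method names are invented here, only the 24-hour figure comes from this page, and the sketch assumes that requests stop being served once the grace period lapses, which the page does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE_PERIOD = timedelta(hours=24)  # figure taken from this page


@dataclass
class LicenseCache:
    """Hypothetical model of a cached validation result."""
    last_validated: Optional[datetime] = None

    def record_success(self, now: datetime) -> None:
        """Remember the time of the latest successful validation."""
        self.last_validated = now

    def may_serve(self, now: datetime) -> bool:
        """True while the last successful validation is within the grace period.

        Assumption: serving stops after the grace period; with no successful
        validation on record, nothing is served.
        """
        if self.last_validated is None:
            return False
        return now - self.last_validated <= GRACE_PERIOD
```

In this model a failed connection simply leaves `last_validated` unchanged, so `may_serve` keeps returning `True` until 24 hours after the last success, while a background retry can call `record_success` again once the license server is reachable.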
Logs
Key log messages:
Redis
Provides caching and state management for the system.
Purpose
- Request queuing and coordination
- Session state for streaming connections
- Caching of frequent requests
- Performance optimization
Container Details
- Image: redis:latest or redis:7-alpine
- Port: 6379 - Redis protocol
- CPU: 0.5-1 core
- Memory: 512 MB - 1 GB
- No GPU required
Configuration Options
- Embedded Redis: the default configuration, with minimal setup
- With Persistence
- With Authentication
- External Redis
Data Stored
Redis stores:
- Request queue state
- WebSocket session data
- Temporary audio chunks (streaming)
- Worker status and health
Data in Redis is temporary and can be safely cleared. No persistent state is stored.
Health Check
Built-in health check:
Service Dependencies
Startup order and dependencies:
Recommended Startup Sequence
- Redis - Starts immediately (5 seconds)
- License Proxy - Validates license (10-15 seconds)
- Lightning TTS - Loads models (30-60 seconds)
- API Server - Connects to services (5-10 seconds)
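The sequence above follows from the Dependencies list of each service. As a rough sketch, the same order can be derived with a topological sort, and the listed timings can be summed to estimate a fully sequential startup. The lowercase service keys are labels chosen for this example, and Redis is treated as a dependency of the API Server even though this page marks it optional there.

```python
from graphlib import TopologicalSorter

# service -> services it waits for, taken from the Dependencies lists on this page
DEPENDS_ON = {
    "redis": set(),
    "license-proxy": set(),
    "lightning-tts": {"license-proxy", "redis"},
    "api-server": {"lightning-tts", "license-proxy", "redis"},
}

# (min, max) startup time in seconds, from the sequence above
STARTUP_SECONDS = {
    "redis": (5, 5),
    "license-proxy": (10, 15),
    "lightning-tts": (30, 60),
    "api-server": (5, 10),
}


def startup_order(deps=DEPENDS_ON) -> list[str]:
    """Return one valid start order in which every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


def sequential_startup_range(times=STARTUP_SECONDS) -> tuple[int, int]:
    """Sum the (min, max) times, assuming services start one after another."""
    lo = sum(t[0] for t in times.values())
    hi = sum(t[1] for t in times.values())
    return lo, hi
```

Redis and License Proxy have no dependencies of their own, so either may come first in the computed order; Lightning TTS and then the API Server always follow. If the services start strictly one after another, the listed timings add up to roughly 50-90 seconds.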

