
Common Issues

GPU Not Accessible

Symptoms:
  • Error: could not select device driver "nvidia"
  • Error: no NVIDIA GPU devices found
  • Lightning TTS fails to start
Diagnosis: Test GPU access from a container:
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
If the test fails, restart Docker and bring the stack back up:
sudo systemctl restart docker
docker compose up -d
If the GPU is still not accessible, reinstall the NVIDIA Container Toolkit:
sudo apt-get remove nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

sudo systemctl restart docker
Check the host driver version:
nvidia-smi
If the driver version is below 470, update it:
sudo ubuntu-drivers autoinstall
sudo reboot
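After the reboot, you can confirm the driver version directly (using nvidia-smi's standard query flags):
nvidia-smi --query-gpu=driver_version --format=csv,noheader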
Verify /etc/docker/daemon.json contains:
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
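If the runtime entry is missing, the NVIDIA Container Toolkit can usually write it for you (a sketch assuming its nvidia-ctk CLI is installed):
sudo nvidia-ctk runtime configure --runtime=docker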
Restart Docker after changes:
sudo systemctl restart docker

License Validation Failed

Symptoms:
  • Error: License validation failed
  • Error: Invalid license key
  • Services fail to start
Diagnosis: Check license-proxy logs:
docker compose logs license-proxy
Check .env file:
cat .env | grep LICENSE_KEY
Ensure there are no:
  • Extra spaces
  • Quotes around the key
  • Line breaks
Correct format:
LICENSE_KEY=abc123def456
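A quick way to catch formatting problems is to grep for the expected pattern (this sketch assumes the key is purely alphanumeric, as in the example above):
grep -E '^LICENSE_KEY=[A-Za-z0-9]+$' .env || echo "LICENSE_KEY missing or malformed"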
Test connection to license server:
curl -v https://console-api.smallest.ai
If this fails, check:
  • Firewall rules
  • Proxy settings
  • DNS resolution
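A couple of quick checks can narrow this down: getent tests DNS resolution on the host, and curl's status-code output distinguishes a blocked connection from a reachable server (a sketch, not an exhaustive network diagnosis):
getent hosts console-api.smallest.ai
curl -s -o /dev/null -w "%{http_code}\n" https://console-api.smallest.ai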
If the key appears correct and network is accessible, your license may be:
  • Expired
  • Revoked
  • Invalid
Contact [email protected] with:
  • Your license key
  • License-proxy logs
  • Error messages

Model Loading Failed

Symptoms:
  • Lightning TTS stuck at startup
  • Error: Failed to load model
  • Container keeps restarting
Diagnosis: Check Lightning TTS logs:
docker compose logs lightning-tts
Verify GPU has enough VRAM:
nvidia-smi
Lightning TTS requires a minimum of 16 GB of VRAM.
Check that there is enough free disk space for the model files:
df -h
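To see how much of that space Docker itself is consuming (images, containers, volumes), the standard Docker disk-usage report helps:
docker system df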
Free up space if needed:
docker system prune -a
Models may simply need more time to load; increase the health-check start period in docker-compose.yml:
lightning-tts:
  healthcheck:
    start_period: 120s
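While waiting, you can watch VRAM fill up as the model loads, which confirms loading is actually progressing (uses standard nvidia-smi query flags):
watch -n 2 nvidia-smi --query-gpu=memory.used,memory.total --format=csv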

Port Already in Use

Symptoms:
  • Error: port is already allocated
  • Error: bind: address already in use
Diagnosis: Find what’s using the port:
sudo lsof -i :7100
sudo netstat -tulpn | grep 7100
If another service is using the port:
sudo systemctl stop [service-name]
Or kill the process:
sudo kill -9 [PID]
Modify docker-compose.yml to use different port:
api-server:
  ports:
    - "8080:7100"
Access the API at http://localhost:8080 instead.
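After remapping, a quick health check against the new port confirms the change took effect:
curl http://localhost:8080/health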
Old containers may still be bound:
docker compose down
docker container prune -f
docker compose up -d

Out of Memory

Symptoms:
  • Container killed unexpectedly
  • Error: OOMKilled
  • System becomes unresponsive
Diagnosis: Check container status:
docker compose ps
docker inspect [container-name] | grep OOMKilled
Lightning TTS requires a minimum of 16 GB of RAM. Check current memory:
free -h
Prevent one service from consuming all memory:
services:
  lightning-tts:
    deploy:
      resources:
        limits:
          memory: 14G
        reservations:
          memory: 12G
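You can confirm the limit was applied by inspecting the running container; the value is reported in bytes (container name as used elsewhere in this guide):
docker inspect lightning-tts --format '{{.HostConfig.Memory}}'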
Add swap space (temporary solution):
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
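Verify the swap is active before retrying:
swapon --show
free -h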

Slow Performance

Symptoms:
  • High latency (>500ms)
  • Low throughput
  • GPU underutilized
Diagnosis: Monitor GPU usage:
watch -n 1 nvidia-smi
Check container resources:
docker stats
Ensure GPU is not throttling:
nvidia-smi -q -d PERFORMANCE
Enable persistence mode:
sudo nvidia-smi -pm 1
Increase the CPU allocation for the TTS container in docker-compose.yml:
lightning-tts:
  deploy:
    resources:
      limits:
        cpus: '8'
Use Redis with persistence disabled for speed:
redis:
  command: redis-server --save ""
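You can confirm snapshotting is actually off by reading the save setting back from the running container (an empty value means persistence is disabled):
docker compose exec redis redis-cli config get save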

Performance Optimization

Best Practices

1. Enable GPU Persistence Mode

Reduces GPU initialization time:
sudo nvidia-smi -pm 1
2. Optimize Container Resources

Allocate appropriate CPU/memory:
deploy:
  resources:
    limits:
      cpus: '8'
      memory: 14G
3. Monitor and Tune

Use monitoring tools:
docker stats
nvidia-smi dmon

Benchmark Your Deployment

Test TTS performance:
time curl -X POST http://localhost:7100/v1/speak \
  -H "Authorization: Token ${LICENSE_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is a test of the text-to-speech service.",
    "voice": "default"
  }'
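To get a rough feel for warm-request latency, repeat the request a few times and let curl report the total time per call (a minimal sketch reusing the same endpoint and payload; not a load test):
for i in $(seq 1 5); do
  curl -s -o /dev/null -w "request $i: %{time_total}s\n" \
    -X POST http://localhost:7100/v1/speak \
    -H "Authorization: Token ${LICENSE_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"text": "Warm latency test.", "voice": "default"}'
done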
Expected performance:
  • Cold start: First request after container start (5-10 seconds)
  • Warm requests: Subsequent requests (100-300ms)
  • Real-time factor: 0.1-0.3x

Debugging Tools

View All Logs

docker compose logs -f

Follow Specific Service

docker compose logs -f lightning-tts

Last N Lines

docker compose logs --tail=100 api-server

Save Logs to File

docker compose logs > deployment-logs.txt

Execute Commands in Container

docker compose exec lightning-tts bash

Check Container Configuration

docker inspect lightning-tts

Network Debugging

Test connectivity between containers:
docker compose exec api-server ping lightning-tts
docker compose exec api-server curl http://lightning-tts:8876/health
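If the ping or curl fails, check that both containers are attached to the same Compose network (container name as used elsewhere in this guide; Compose may prefix it with the project name):
docker network ls
docker inspect lightning-tts --format '{{json .NetworkSettings.Networks}}'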

Health Checks

API Server

curl http://localhost:7100/health
Expected: {"status": "healthy"}

Lightning TTS

curl http://localhost:8876/health
Expected: {"status": "ready", "gpu": "NVIDIA A10"}

License Proxy

docker compose exec license-proxy wget -q -O- http://localhost:3369/health
Expected: {"status": "valid"}

Redis

docker compose exec redis redis-cli ping
Expected: PONG
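For a quick pass over the HTTP health endpoints exposed on the host, a small loop prints the status code for each (endpoints as listed above):
for url in http://localhost:7100/health http://localhost:8876/health; do
  echo -n "$url -> "
  curl -s -o /dev/null -w "%{http_code}\n" "$url"
done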

Getting Help

Before Contacting Support

Collect the following information:
1. System Information

docker version
docker compose version
nvidia-smi
uname -a
2. Container Status

docker compose ps > status.txt
docker stats --no-stream > resources.txt
3. Logs

docker compose logs > all-logs.txt
4. Configuration

Sanitize and include:
  • docker-compose.yml
  • .env (remove license key)
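Once collected, the files can be bundled into a single archive for the support ticket (filenames follow the steps above; include your sanitized docker-compose.yml):
tar czf support-bundle.tgz status.txt resources.txt all-logs.txt docker-compose.yml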

Contact Support

Email: [email protected]
Include:
  • Description of the issue
  • Steps to reproduce
  • System information
  • Logs and configuration
  • License key (via secure channel)
