
Common Issues

GPU Not Accessible

Symptoms:
  • Error: could not select device driver "nvidia"
  • Error: no NVIDIA GPU devices found
  • Lightning TTS fails to start
Diagnosis: Test GPU access from a container:
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
If the test fails, restart Docker and bring the stack back up:
sudo systemctl restart docker
docker compose up -d
If the GPU is still not accessible, reinstall the NVIDIA Container Toolkit:
sudo apt-get remove nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

sudo systemctl restart docker
Check the host driver version:
nvidia-smi
If the driver version is below 470, update it:
sudo ubuntu-drivers autoinstall
sudo reboot
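After the reboot, you can confirm the driver version directly (using nvidia-smi's standard query flags):
nvidia-smi --query-gpu=driver_version --format=csv,noheader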
Verify /etc/docker/daemon.json contains:
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
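If the runtime entry is missing, the NVIDIA Container Toolkit can usually write it for you (a sketch assuming its nvidia-ctk CLI is installed):
sudo nvidia-ctk runtime configure --runtime=docker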
Restart Docker after changes:
sudo systemctl restart docker

License Validation Failed

Symptoms:
  • Error: License validation failed
  • Error: Invalid license key
  • Services fail to start
Diagnosis: Check license-proxy logs:
docker compose logs license-proxy
Check .env file:
cat .env | grep LICENSE_KEY
Ensure there are no:
  • Extra spaces
  • Quotes around the key
  • Line breaks
Correct format:
LICENSE_KEY=abc123def456
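A quick way to catch formatting problems is to grep for the expected pattern (this sketch assumes the key is purely alphanumeric, as in the example above):
grep -E '^LICENSE_KEY=[A-Za-z0-9]+$' .env || echo "LICENSE_KEY missing or malformed"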
Test connection to license server:
curl -v https://console-api.smallest.ai
If this fails, check:
  • Firewall rules
  • Proxy settings
  • DNS resolution
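A couple of quick checks can narrow this down: getent tests DNS resolution on the host, and curl's status-code output distinguishes a blocked connection from a reachable server (a sketch, not an exhaustive network diagnosis):
getent hosts console-api.smallest.ai
curl -s -o /dev/null -w "%{http_code}\n" https://console-api.smallest.ai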
If the key appears correct and network is accessible, your license may be:
  • Expired
  • Revoked
  • Invalid
Contact [email protected] with:
  • Your license key
  • License-proxy logs
  • Error messages

Model Loading Failed

Symptoms:
  • Lightning TTS stuck at startup
  • Error: Failed to load model
  • Container keeps restarting
Diagnosis: Check Lightning TTS logs:
docker compose logs lightning-tts
Verify GPU has enough VRAM:
nvidia-smi
Lightning TTS requires a minimum of 16 GB of VRAM.
Check that there is enough free disk space for the model files:
df -h
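To see how much of that space Docker itself is consuming (images, containers, volumes), the standard Docker disk-usage report helps:
docker system df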
Free up space if needed:
docker system prune -a
Models may simply need more time to load; increase the health-check start period in docker-compose.yml:
lightning-tts:
  healthcheck:
    start_period: 120s
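While waiting, you can watch VRAM fill up as the model loads, which confirms loading is actually progressing (uses standard nvidia-smi query flags):
watch -n 2 nvidia-smi --query-gpu=memory.used,memory.total --format=csv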

Port Already in Use

Symptoms:
  • Error: port is already allocated
  • Error: bind: address already in use
Diagnosis: Find what’s using the port:
sudo lsof -i :7100
sudo netstat -tulpn | grep 7100
If another service is using the port:
sudo systemctl stop [service-name]
Or kill the process:
sudo kill -9 [PID]
Modify docker-compose.yml to use different port:
api-server:
  ports:
    - "8080:7100"
Access the API at http://localhost:8080 instead.
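After remapping, a quick health check against the new port confirms the change took effect:
curl http://localhost:8080/health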
Old containers may still be bound:
docker compose down
docker container prune -f
docker compose up -d

Out of Memory

Symptoms:
  • Container killed unexpectedly
  • Error: OOMKilled
  • System becomes unresponsive
Diagnosis: Check container status:
docker compose ps
docker inspect [container-name] | grep OOMKilled
Lightning TTS requires a minimum of 16 GB of RAM. Check current memory:
free -h
Prevent one service from consuming all memory:
services:
  lightning-tts:
    deploy:
      resources:
        limits:
          memory: 14G
        reservations:
          memory: 12G
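You can confirm the limit was applied by inspecting the running container; the value is reported in bytes (container name as used elsewhere in this guide):
docker inspect lightning-tts --format '{{.HostConfig.Memory}}'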
Add swap space (temporary solution):
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
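Verify the swap is active before retrying:
swapon --show
free -h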

Slow Performance

Symptoms:
  • High latency (>500ms)
  • Low throughput
  • GPU underutilized
Diagnosis: Monitor GPU usage:
watch -n 1 nvidia-smi
Check container resources:
docker stats
Ensure GPU is not throttling:
nvidia-smi -q -d PERFORMANCE
Enable persistence mode:
sudo nvidia-smi -pm 1
Increase the CPU allocation for the TTS container in docker-compose.yml:
lightning-tts:
  deploy:
    resources:
      limits:
        cpus: '8'
Use Redis with persistence disabled for speed:
redis:
  command: redis-server --save ""
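You can confirm snapshotting is actually off by reading the save setting back from the running container (an empty value means persistence is disabled):
docker compose exec redis redis-cli config get save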

Performance Optimization

Best Practices

1. Enable GPU Persistence Mode

Reduces GPU initialization time:
sudo nvidia-smi -pm 1
2. Optimize Container Resources

Allocate appropriate CPU/memory:
deploy:
  resources:
    limits:
      cpus: '8'
      memory: 14G
3. Monitor and Tune

Use monitoring tools:
docker stats
nvidia-smi dmon

Benchmark Your Deployment

Test TTS performance:
time curl -X POST http://localhost:7100/v1/speak \
  -H "Authorization: Token ${LICENSE_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is a test of the text-to-speech service.",
    "voice": "default"
  }'
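To get a rough feel for warm-request latency, repeat the request a few times and let curl report the total time per call (a minimal sketch reusing the same endpoint and payload; not a load test):
for i in $(seq 1 5); do
  curl -s -o /dev/null -w "request $i: %{time_total}s\n" \
    -X POST http://localhost:7100/v1/speak \
    -H "Authorization: Token ${LICENSE_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"text": "Warm latency test.", "voice": "default"}'
done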
Expected performance:
  • Cold start: First request after container start (5-10 seconds)
  • Warm requests: Subsequent requests (100-300ms)
  • Real-time factor: 0.1-0.3x

Debugging Tools

View All Logs

docker compose logs -f

Follow Specific Service

docker compose logs -f lightning-tts

Last N Lines

docker compose logs --tail=100 api-server

Save Logs to File

docker compose logs > deployment-logs.txt

Execute Commands in Container

docker compose exec lightning-tts bash

Check Container Configuration

docker inspect lightning-tts

Network Debugging

Test connectivity between containers:
docker compose exec api-server ping lightning-tts
docker compose exec api-server curl http://lightning-tts:8876/health
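If the ping or curl fails, check that both containers are attached to the same Compose network (container name as used elsewhere in this guide; Compose may prefix it with the project name):
docker network ls
docker inspect lightning-tts --format '{{json .NetworkSettings.Networks}}'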

Health Checks

API Server

curl http://localhost:7100/health
Expected: {"status": "healthy"}

Lightning TTS

curl http://localhost:8876/health
Expected: {"status": "ready", "gpu": "NVIDIA A10"}

License Proxy

docker compose exec license-proxy wget -q -O- http://localhost:3369/health
Expected: {"status": "valid"}

Redis

docker compose exec redis redis-cli ping
Expected: PONG
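For a quick pass over the HTTP health endpoints exposed on the host, a small loop prints the status code for each (endpoints as listed above):
for url in http://localhost:7100/health http://localhost:8876/health; do
  echo -n "$url -> "
  curl -s -o /dev/null -w "%{http_code}\n" "$url"
done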

Getting Help

Before Contacting Support

Collect the following information:
1. System Information

docker version
docker compose version
nvidia-smi
uname -a
2. Container Status

docker compose ps > status.txt
docker stats --no-stream > resources.txt
3. Logs

docker compose logs > all-logs.txt
4. Configuration

Sanitize and include:
  • docker-compose.yml
  • .env (remove license key)
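Once collected, the files can be bundled into a single archive for the support ticket (filenames follow the steps above; include your sanitized docker-compose.yml):
tar czf support-bundle.tgz status.txt resources.txt all-logs.txt docker-compose.yml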

Contact Support

Email: [email protected]
Include:
  • Description of the issue
  • Steps to reproduce
  • System information
  • Logs and configuration
  • License key (via secure channel)
