Kubernetes deployment is currently only available for ASR (Speech-to-Text). TTS (Text-to-Speech) Kubernetes support is coming soon. For TTS deployments, please use Docker.

Overview

Before deploying Smallest Self-Host ASR on Kubernetes, ensure your cluster meets the requirements and you have the necessary tools and credentials.

Kubernetes Cluster Requirements

Minimum Cluster Specifications

Kubernetes Version

v1.19 or higher (v1.24+ recommended)

Node Count

Minimum 2 nodes
  • 1 CPU node (control plane/general)
  • 1 GPU node (Lightning ASR)

Total Resources

Minimum cluster capacity
  • 8 CPU cores
  • 32 GB RAM
  • 1 NVIDIA GPU

Storage

Persistent volume support
  • Storage class available
  • 100 GB minimum capacity
For the GPU, we recommend NVIDIA L4 or L40S for the best performance.

Required Tools

Install the following tools on your local machine (the commands below use Homebrew on macOS; see each tool’s documentation for other platforms):

Helm

Helm 3.0 or higher is required.
brew install helm
Verify installation:
helm version

kubectl

Kubernetes CLI tool for cluster management.
brew install kubectl
Verify installation:
kubectl version --client

Cluster Access

Configure kubectl

Ensure kubectl is configured to access your cluster:
kubectl cluster-info
kubectl get nodes
Expected output should show your cluster nodes.

Test Cluster Access

Verify you have sufficient permissions:
kubectl auth can-i create deployments
kubectl auth can-i create services
kubectl auth can-i create secrets
All should return yes.
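If any check returns no, a cluster administrator can grant the needed permissions with a namespaced Role along these lines. This is a minimal sketch covering only the resources checked above; the role name and namespace are placeholders:

```yaml
# Minimal Role granting the permissions checked above.
# Name and namespace are placeholders -- adjust for your cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: smallest-deployer
  namespace: default
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["create", "get", "list", "update", "delete"]
  - apiGroups: [""]
    resources: ["services", "secrets"]
    verbs: ["create", "get", "list", "update", "delete"]
```

Bind it to your user or service account with a matching RoleBinding.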

GPU Support

NVIDIA GPU Operator

Install the NVIDIA GPU Operator to manage GPU resources on the cluster, including GPU drivers, the device plugin, and GPU node labeling.
The Smallest Self-Host Helm chart includes the GPU Operator as an optional dependency. You can enable it during installation or install it separately.
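If you enable the bundled dependency, the values.yaml entry typically looks like the sketch below. The dependency key name (`gpu-operator`) is an assumption — check the chart’s own values.yaml for the exact alias:

```yaml
# Hypothetical values.yaml fragment -- the dependency key name
# ("gpu-operator") is an assumption; verify it against the chart.
gpu-operator:
  enabled: true
```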

Verify GPU Nodes

Check that GPU nodes are properly labeled:
kubectl get nodes -l nvidia.com/gpu.present=true
Verify GPU resources are available:
kubectl get nodes -o json | jq '.items[].status.capacity'
Look for nvidia.com/gpu in the capacity.

Credentials

Obtain the following from Smallest.ai before installation:
License Key

Your unique license key for validation.
Contact: [email protected]
You’ll add this to values.yaml:
global:
  licenseKey: "your-license-key-here"

Registry Credentials

Credentials to pull Docker images from quay.io:
  • Username
  • Password
  • Email
Contact: [email protected]
You’ll add these to values.yaml:
global:
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"

Model Download URL

Download URL for ASR models.
Contact: [email protected]
You’ll add this to values.yaml:
models:
  asrModelUrl: "your-model-url"
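Putting the three pieces together, the credentials section of values.yaml looks like this (all values are placeholders to be replaced with what Smallest.ai provides):

```yaml
# Consolidated values.yaml fragment with all required credentials.
global:
  licenseKey: "your-license-key-here"
  imageCredentials:
    username: "your-username"
    password: "your-password"
    email: "your-email"
models:
  asrModelUrl: "your-model-url"
```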

Storage Requirements

Storage Class

Verify a storage class is available:
kubectl get storageclass
You should see at least one storage class marked as (default) or available.

For AWS Deployments

If deploying on AWS EKS, you’ll need:
  • EBS CSI Driver for block storage
  • EFS CSI Driver for shared file storage (recommended for model storage)
See the AWS Deployment guide for detailed setup instructions.
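As an illustration, a gp3-backed storage class for the EBS CSI driver can be declared as follows. The class name is a placeholder; see the AWS Deployment guide for the full driver setup:

```yaml
# Example StorageClass backed by the EBS CSI driver (gp3 volumes).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
```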

Network Requirements

Required Ports

Ensure the following ports are accessible within the cluster:
Port | Service       | Purpose
7100 | API Server    | Client API requests
2269 | Lightning ASR | Internal ASR processing
3369 | License Proxy | Internal license validation
6379 | Redis         | Internal caching

External Access

The License Proxy requires outbound HTTPS access to:
  • console-api.smallest.ai (port 443)
Ensure your cluster’s network policies and security groups allow outbound HTTPS traffic from pods.
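If your cluster enforces NetworkPolicies, an egress rule along these lines permits the outbound HTTPS traffic (plus DNS for name resolution). This is a sketch: the `app: license-proxy` pod selector label is an assumption and must match your deployment’s actual labels:

```yaml
# Example NetworkPolicy allowing outbound HTTPS (and DNS) from the
# license-proxy pods. The "app: license-proxy" label is an assumption.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-license-egress
spec:
  podSelector:
    matchLabels:
      app: license-proxy
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP
          port: 443
    - ports:
        - protocol: UDP
          port: 53
```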

Optional Components

Prometheus & Grafana

For monitoring and autoscaling based on custom metrics:
  • Prometheus Operator (included in chart)
  • Grafana (included in chart)
  • Prometheus Adapter (included in chart)
These are required for:
  • Custom metrics-based autoscaling
  • Advanced monitoring dashboards
  • Performance visualization
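With the Prometheus Adapter in place, custom-metric autoscaling is wired up through a HorizontalPodAutoscaler along these lines. This is a sketch: the deployment name (`lightning-asr`), metric name (`asr_active_streams`), and target value are placeholder assumptions:

```yaml
# Hypothetical HPA scaling a deployment on a custom Prometheus metric.
# "lightning-asr" and "asr_active_streams" are placeholder names.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lightning-asr
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lightning-asr
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Pods
      pods:
        metric:
          name: asr_active_streams
        target:
          type: AverageValue
          averageValue: "20"
```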

Cluster Autoscaler

For automatic node scaling on AWS EKS:
  • IAM role with autoscaling permissions
  • IRSA (IAM Roles for Service Accounts) configured
See the Cluster Autoscaler guide for setup.

Namespace

Decide on a namespace for the deployment. To deploy to the default namespace, point your current context at it:
kubectl config set-context --current --namespace=default
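Alternatively, a dedicated namespace keeps the deployment isolated from other workloads. The name `smallest` below is just an example:

```yaml
# Example dedicated namespace for the deployment.
apiVersion: v1
kind: Namespace
metadata:
  name: smallest
```

Apply it with kubectl apply -f, then switch your context to that namespace as shown above.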

Verification Checklist

Before proceeding, ensure:
1. Cluster Access
kubectl get nodes
Shows all cluster nodes in Ready state

2. GPU Nodes Available
kubectl get nodes -o json | jq '.items[].status.capacity."nvidia.com/gpu"'
Shows GPU count for GPU nodes

3. Helm Installed
helm version
Shows Helm 3.x

4. Storage Available
kubectl get storageclass
Shows at least one storage class

5. Credentials Ready
  • License key obtained
  • Container registry credentials
  • Model download URL

6. Sufficient Resources
kubectl top nodes (requires metrics-server)
Shows available resources for deployment

AWS-Specific Prerequisites

If deploying on AWS EKS, see:

AWS EKS Setup

Complete guide for setting up EKS cluster with GPU support

What’s Next?

Once all prerequisites are met, proceed to the quick start:

Quick Start

Deploy Smallest Self-Host with Helm