What should you use?
HTTP (one-shot request/response)

| Pros | Cons |
|---|---|
| Simple to integrate with standard HTTP tools | Full audio is returned only after synthesis completes |
| Easy to debug and monitor | Not suitable for real-time or long-form audio |
| Stateless; good for serverless environments | New connection and request needed for each synthesis |
| Works well with caching and CDNs | Higher latency than streaming methods |
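A minimal sketch of the one-shot approach, assuming a hypothetical `POST /v1/text-to-speech` endpoint, request payload, and bearer-token auth (none of these names come from a specific API): the client sends the full text and receives the complete audio in a single response body.

```python
# One-shot HTTP request/response: send all text, receive the full audio at once.
# The endpoint URL, JSON payload, and auth header below are placeholders.
import requests

API_URL = "https://api.example.com/v1/text-to-speech"  # hypothetical endpoint


def synthesize(text: str, out_path: str = "speech.mp3") -> None:
    # Audio is returned only after the entire clip has been synthesized.
    response = requests.post(
        API_URL,
        json={"text": text},
        headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder auth
        timeout=60,
    )
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)  # full audio arrives in one response body


if __name__ == "__main__":
    synthesize("Hello from a plain HTTP request.")
```

Because each call is a self-contained, stateless request, this pattern drops cleanly into serverless functions and sits behind caches or CDNs without extra work.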
HTTP streaming

| Pros | Cons |
|---|---|
| Lower latency than regular HTTP | Audio streams one way only (server → client) |
| Compatible with standard HTTP infrastructure | Full input must still be sent before synthesis starts |
| Audio starts playing as it’s generated | No partial or live input updates |
| Easy to adopt with minimal changes | Slightly more complex than basic HTTP |
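A minimal sketch of the streaming variant, assuming a hypothetical `/v1/text-to-speech/stream` endpoint that returns audio via chunked transfer encoding (the URL, payload, and auth header are again placeholders): the full text is still sent up front, but audio chunks are read as they arrive.

```python
# HTTP streaming: the request is sent once, but the response body is consumed
# incrementally, so playback or writing can begin before synthesis finishes.
# The endpoint URL, JSON payload, and auth header below are placeholders.
import requests

STREAM_URL = "https://api.example.com/v1/text-to-speech/stream"  # hypothetical


def synthesize_streaming(text: str, out_path: str = "speech.mp3") -> None:
    with requests.post(
        STREAM_URL,
        json={"text": text},
        headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder auth
        stream=True,  # keep the connection open and read the body incrementally
        timeout=60,
    ) as response:
        response.raise_for_status()
        with open(out_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=4096):
                if chunk:
                    f.write(chunk)  # each chunk could instead feed an audio player


if __name__ == "__main__":
    synthesize_streaming("Hello from a streaming HTTP request.")
```

Note that the only change from the one-shot version is `stream=True` and the chunked read loop, which is why this approach is easy to adopt on top of existing HTTP tooling.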
WebSockets

| Pros | Cons |
|---|---|
| Ultra-low latency | More complex to implement and manage |
| Supports real-time, chunked input and responses | Requires persistent connection management |
| Bi-directional communication | Not ideal for simple or infrequent tasks |
| Great for chatbots, live agents, or dictation apps | May require additional libraries or WebSocket support |
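A minimal sketch of the WebSocket approach, assuming a hypothetical `wss://` endpoint that accepts JSON text frames and returns binary audio frames, with an invented `{"end": true}` end-of-input marker; the URL and message format are illustrative only, not any particular API's protocol.

```python
# WebSocket: one persistent, bi-directional connection. Text can be sent in
# chunks while audio frames stream back as they are synthesized.
# The endpoint URL and message schema below are placeholders.
import asyncio
import json

import websockets  # pip install websockets

WS_URL = "wss://api.example.com/v1/text-to-speech/ws"  # hypothetical endpoint


async def synthesize_realtime(text_chunks, out_path: str = "speech.mp3") -> None:
    async with websockets.connect(WS_URL) as ws:
        # Send text incrementally over the open connection.
        for chunk in text_chunks:
            await ws.send(json.dumps({"text": chunk}))
        await ws.send(json.dumps({"end": True}))  # hypothetical end-of-input marker

        with open(out_path, "wb") as f:
            # Read until the server closes the connection; binary frames carry audio.
            async for message in ws:
                if isinstance(message, bytes):
                    f.write(message)


if __name__ == "__main__":
    asyncio.run(synthesize_realtime(["Hello ", "from a ", "WebSocket session."]))
```

The extra complexity sits in connection lifecycle handling (reconnects, timeouts, interleaving sends and receives), which is the price for being able to push new input while audio is still coming back.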