Skip to content
Cloudflare Docs

WebSockets API

The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories:

  • Realtime APIs - Designed for AI providers that offer low-latency, multimodal interactions over WebSockets.
  • Non-Realtime APIs - Supports standard WebSocket communication for AI providers, including those that do not natively support WebSockets.

When to use WebSockets?

WebSockets are long-lived TCP connections that enable bi-directional, real-time and non realtime communication between client and server. Unlike HTTP connections, which require repeated handshakes for each request, WebSockets maintain the connection, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications needing low-latency, real-time data, such as voice assistants.

Key benefits

  • Reduced overhead: Avoid overhead of repeated handshakes and TLS negotiations by maintaining a single, persistent connection.
  • Provider compatibility: Works with all AI providers in AI Gateway. Even if your chosen provider does not support WebSockets, Cloudflare handles it for you, managing the requests to your preferred AI provider.

Key differences

FeatureRealtime APIsNon-Realtime APIs
PurposeEnables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints.Supports WebSocket-based AI interactions with providers that do not natively support WebSockets.
Use CaseStreaming responses for voice, video, and live interactions.Text-based queries and responses, such as LLM requests.
AI Provider SupportLimited to providers offering real-time WebSocket APIs.All AI providers in AI Gateway.
Streaming SupportProviders natively support real-time data streaming.AI Gateway handles streaming via WebSockets.

For details on implementation, refer to the next sections: