Vercel AI features

Realtime voice, speech, and transcription now supported on AI Gateway

Tuesday, June 30, 2026 • Source: Vercel Changelog

What's new

Realtime voice agents: low-latency speak-and-respond loops with mid-dialogue tool calls
Text-to-speech: convert text into spoken audio with selectable voices and MP3 output
Speech-to-text: transcribe files, base64 strings, or URLs
OpenAI gpt-realtime-2 supported: first realtime model wired through the Gateway
Same governance: observability, spend caps, and BYOK (bring your own key) work the same as for text/image/video models
Available via AI SDK 7: drop-in for existing Vercel AI SDK apps

Why it matters

Voice agents are now a first-class primitive on Vercel, so teams no longer need to build bespoke realtime pipelines. The unified governance matters most for teams that need usage limits, key control, and tracing across every modality.

How to try

Upgrade to AI SDK 7 and call a supported model (e.g. openai/gpt-realtime-2) through the AI Gateway, or try it directly in the Gateway playground.
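
Before wiring up a call, it can help to validate the two inputs mentioned above: the model ID and the audio you want transcribed. The sketch below is illustrative only. Both classifyAudioInput and splitModelId are hypothetical helpers, not part of the AI SDK or the AI Gateway API. The sketch assumes model IDs follow the provider/model shape seen in openai/gpt-realtime-2 and that speech-to-text input arrives as raw file bytes, a base64 string, or a URL.

```typescript
// Hypothetical helpers for illustration; NOT part of the AI SDK or AI Gateway API.

// The three speech-to-text input forms named in the changelog summary.
type AudioInput =
  | { kind: "url"; url: URL }
  | { kind: "base64"; data: string }
  | { kind: "bytes"; data: Uint8Array };

// Classify an input value so the caller knows which form it is about to send.
function classifyAudioInput(input: string | Uint8Array | URL): AudioInput {
  if (input instanceof URL) return { kind: "url", url: input };
  if (input instanceof Uint8Array) return { kind: "bytes", data: input };

  // Remaining case: a string that is either an http(s) URL or base64 text.
  if (/^https?:\/\//i.test(input)) {
    return { kind: "url", url: new URL(input) };
  }

  // Loose base64 check: standard alphabet, optional padding, length multiple of 4.
  const compact = input.replace(/\s+/g, "");
  const looksBase64 =
    compact.length > 0 &&
    compact.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(compact);
  if (looksBase64) return { kind: "base64", data: compact };

  throw new TypeError("Expected raw bytes, a base64 string, or an http(s) URL");
}

// Split an ID such as "openai/gpt-realtime-2" into provider and model.
// Assumption: the first "/" separates the provider from the model name.
function splitModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new TypeError(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

A caller might run splitModelId("openai/gpt-realtime-2") to confirm the ID is well formed, then pass the result of classifyAudioInput to whichever transcription call the SDK actually exposes. Consult the AI SDK 7 documentation for the real function names and parameters.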
