Vercel AI features ⭐ Notable

Vercel AI Gateway adds GLM 5.2 Fast via Wafer at 170+ tokens/s

Štvrtok 25. júna 2026 • Source: Vercel

What's new

GLM 5.2 Fast by Z.ai is now in Vercel AI Gateway through the Wafer inference stack.
Average throughput is 170+ tokens/sec, ranging 120-250 TPS.
2x decode throughput versus other serverless providers.
Retains GLM 5.2 strengths: strong coding, usable 1M-token context, long-horizon tasks.
No markup, BYOK support, ZDR mode and per-key budgets.

Why it matters

Since its mid-June launch, GLM 5.2 has been climbing into the world-class open-agentic tier. Wafer adds the inference speed that makes the model competitive for real-time code agents — not just a cheap alternative, but a fast inference option.

How to try it

Available in Vercel AI Gateway under model glm-5.2-fast. Routing through Wafer is the default.

Open original source Vercel