Research 🔥 Top

DeepSeek releases DSpark speculative decoding, 60–85% faster V4 inference

Sunday, June 28, 2026 Source: MarkTechPost

What happened

On June 27, DeepSeek open-sourced DSpark, described as "Confidence-Scheduled Speculative Decoding with Semi-Autoregressive Generation." The framework speeds up per-user generation on both the Flash and Pro variants of DeepSeek-V4 without retraining the underlying model.

Context and impact

This is an inference-speed advance that makes V4 substantially faster on existing hardware, which matters given US export controls that restrict China's access to advanced AI chips. It is also DeepSeek's first major technical release since its latest funding round, and it confirms that strong open-source contributions to core infrastructure continue to come out of China.

Details

  • Speedup: 60–85% on V4 Flash, 57–78% on V4 Pro vs. MTP-1 baseline
  • Pairs a parallel draft backbone with a sequential head, a confidence head, and a load-aware scheduler
  • Open-source checkpoints and training code
  • Paper at arXiv:2606.19348
  • Top of Hacker News (771 points, 330 comments)
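
For readers new to the technique, the components listed above can be pictured with a generic sketch of confidence-gated speculative decoding. This is not DSpark's code: the function names, the toy models, the confidence threshold, and the load-based draft budget are all assumptions made for illustration, and the actual design is described in the paper. The sketch shows a cheap draft model proposing several tokens, a confidence gate cutting the draft short, and the target model verifying the draft so that the output matches what the target would have produced on its own.

```python
from typing import Callable, List, Tuple

Token = int
# Hypothetical interfaces (not DSpark's API): the draft returns a proposed
# token plus a confidence in [0, 1]; the target returns its greedy next token.
DraftFn = Callable[[List[Token]], Tuple[Token, float]]
TargetFn = Callable[[List[Token]], Token]


def draft_budget(max_draft: int, load: float) -> int:
    """Toy load-aware schedule: draft fewer tokens as load (0..1) rises."""
    load = min(max(load, 0.0), 1.0)
    return max(1, round(max_draft * (1.0 - load)))


def speculative_step(prefix: List[Token], draft: DraftFn, target: TargetFn,
                     budget: int, min_conf: float) -> List[Token]:
    """Run one draft-then-verify round and return the tokens to append."""
    drafted: List[Token] = []
    ctx = list(prefix)
    for _ in range(budget):
        tok, conf = draft(ctx)
        if conf < min_conf:  # confidence gate: stop drafting early
            break
        drafted.append(tok)
        ctx.append(tok)

    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in drafted:
        # A real system checks all drafted positions in one batched target
        # pass; this loop calls the target once per position for clarity.
        expected = target(ctx)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target(ctx))  # every draft token matched: one bonus token
    return accepted


def generate(prompt: List[Token], draft: DraftFn, target: TargetFn,
             n_tokens: int, max_draft: int = 4, load: float = 0.0,
             min_conf: float = 0.5) -> Tuple[List[Token], int]:
    """Generate n_tokens; also return the number of verification rounds."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < n_tokens:
        budget = draft_budget(max_draft, load)
        out += speculative_step(out, draft, target, budget, min_conf)
        rounds += 1
    return out[: len(prompt) + n_tokens], rounds


def toy_target(ctx: List[Token]) -> Token:
    """Stand-in target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Tuple[Token, float]:
    """Stand-in draft model: mostly agrees with the target."""
    last = ctx[-1]
    if last == 4:
        return 0, 0.9  # confidently wrong: caught during verification
    if last == 7:
        return 8, 0.2  # correct but unsure: the gate stops the draft here
    return (last + 1) % 10, 0.9


if __name__ == "__main__":
    tokens, rounds = generate([0], toy_draft, toy_target, n_tokens=12)
    print(tokens, rounds)
```

In this toy setup the 12 generated tokens take 3 verification rounds instead of 12 single-token steps, which is where the speedup in speculative decoding comes from. Because verification here is greedy, a wrong draft token costs only wasted draft work and never changes the output.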