VibeThinker: 3B-Parameter Model Claimed to Beat Opus 4.5 on Reasoning
What happened
An arXiv paper introduces VibeThinker, a 3B-parameter model that the authors claim outperforms Claude Opus 4.5 on reasoning benchmarks. The training recipe combines supervised fine-tuning (SFT) with GRPO (Group Relative Policy Optimization).
Context and impact
If the claim survives independent testing, it is a meaningful data point that reasoning capability does not scale strictly with parameter count, in line with earlier progress from small models such as the Phi and Qwen families. For practitioners, it would imply frontier-grade reasoning on a single GPU. The Hacker News thread drew 305 points and 162 comments in under 12 hours, which signals strong community interest.
Details
- Size: 3B parameters
- Method: SFT + GRPO
- Claim: outperforms Claude Opus 4.5 on reasoning benchmarks
- HN: 305 points, 162 comments in under 12 hours
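
For readers unfamiliar with GRPO, the core idea is that each sampled answer is scored relative to the other answers drawn for the same prompt, so no separate value network is needed. The sketch below shows only that group-relative advantage step as it is generally described for GRPO. It is not taken from the VibeThinker paper: the function name, the example rewards, the use of the population standard deviation, and the `eps` constant are all illustrative assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per sampled answer to a single prompt.
    `eps` (an assumed constant) avoids division by zero when every answer
    in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers (1.0 = correct, 0.0 = wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Answers that score above the group average get a positive advantage and are reinforced; answers below it get a negative one. When all answers in a group score the same, every advantage is zero and that prompt contributes no learning signal.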