I've been busy working on some new ranking/position methodologies and I'm excited to start sharing some results.
Plot legends:
- X = truncation rate (low = good)
- ? = confusion rate (low = good)
- blue bars = average completion tokens (low = good)
- black diamonds = CI-banded performance (high = good)
- cluster squares = models inside this group are equivalent
openai/gpt-oss-120b remains the king in all dimensions of interest: truncation rates, completion lengths, and performance. If I had but one complaint, it's that reason_effort does not seem to actually work - more on this soon.
Second is a 3-way tie in performance between the Qwen3-235B-2507 we all know and love and an unexpected entrant: ByteDance-Seed/Seed-OSS-36B-Instruct.
This is a very capable model and its reasoning effort controls actually work, but you should absolutely not leave it on the default "unlimited" - enable a sensible limit (4k works well for 8k context length).
Third place is another 3-way tie, this one between Seed-OSS-36B (it straddles the CI boundary between 2nd and 3rd place), Qwen/Qwen3-Next-80B-A3B-Instruct (demonstrating that full attention may be overrated after all and gated is the way to go), and the newly released zai-org/GLM-4.7, which offers excellent across-the-board performance with some of the shortest reasoning traces I've seen so far.
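To make the idea of CI-based ties concrete, here is a minimal sketch of one way models could be grouped into tiers by overlapping confidence intervals. This is not the methodology behind the plots (which is not described in this post); the `Score` class, the `find_tiers` function, the overlap rule, and all model names and numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Score:
    name: str
    mean: float  # point estimate of performance
    lo: float    # lower bound of the confidence interval
    hi: float    # upper bound of the confidence interval


def overlaps(a: Score, b: Score) -> bool:
    """True if the two confidence intervals share at least one point."""
    return a.lo <= b.hi and b.lo <= a.hi


def find_tiers(scores: list[Score]) -> list[list[str]]:
    """Group models into tiers (assumed rule, for illustration only).

    Walk models from best to worst. A model starts a new tier when its CI
    does not overlap the current tier leader's CI. Each tier then contains
    every model whose CI overlaps that leader's CI, so a model can appear
    in two adjacent tiers if it straddles the boundary.
    """
    ranked = sorted(scores, key=lambda s: s.mean, reverse=True)
    leaders: list[Score] = []
    for s in ranked:
        if not leaders or not overlaps(leaders[-1], s):
            leaders.append(s)
    return [[s.name for s in ranked if overlaps(leader, s)] for leader in leaders]


# Made-up numbers, not taken from the plots.
example = [
    Score("model-a", 0.92, 0.90, 0.94),
    Score("model-b", 0.86, 0.84, 0.88),
    Score("model-c", 0.83, 0.80, 0.85),
    Score("model-d", 0.79, 0.76, 0.81),
]

for rank, tier in enumerate(find_tiers(example), start=1):
    print(rank, tier)
```

In this toy data, `model-c` overlaps the leaders of both the second and third tiers, which is the same kind of straddling described above for Seed-OSS-36B.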
Git is powerful, but it’s also one of the biggest sources of developer mistakes.
What is Git GUI, and how does it help here?
Git GUI makes version control visual, predictable, and easier to reason about, especially when things go wrong.
That’s exactly why we built Git GUI in Voiden.
Instead of relying on memorized commands, Voiden lets you see what Git is doing before it does it.
What Voiden’s Git GUI helps developers do:
• View exact file and line-level changes before committing
• Stage only intended changes (no accidental commits)
• Clearly distinguish staged vs unstaged files
• Inspect visual diffs with full context
• Understand branches, commit history, and repo state instantly
When Git behavior is hidden, errors increase. Voiden’s Git GUI doesn’t abstract Git away; it explains Git.
Whether you’re new to Git or an experienced developer who prefers clarity, this is Git you can reason about.
Version control should feel safe, not stressful.
What Git pain points slow you down today?
Try out Git GUI in beta: https://voiden.md (now on Linux and Mac)
Atom-27B has arrived! This model is the largest open-weight model so far from VANTA Research, and is our 4th model in Project Atom - an effort to scale our collaborative Atom persona from 4B to 400B+.
Atom-27B is based on Google's Gemma 3 27B architecture, and embodies the familiar friendly, warm, and curious persona that appeared in previous releases.
Atom is designed to think WITH you, not FOR you - marking VANTA Research's commitment to building frontier collaborative models.