Best open source model ever, period.

#1
by BigBlueWhale - opened

Nobody likes 30B-A3B models. The quality is unusable. And as a private researcher with only an NVIDIA RTX 5090, I find the 32B dense models infinitely more useful and reliable, especially for edge cases and niche prompts. Honestly, in my experience the MoEs are only good at benchmarks.

Honestly, I understand why. I estimated what training a 30B-A3B model would cost at cloud compute prices versus a 32B dense model, and the dense model is far more expensive to train.

Here's a graphic that shows the true difference (Note: my numbers are probably off by a factor)
[Image: qwen_cost_cards — estimated training cost comparison]
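For a rough sense of where that gap comes from, here is a minimal back-of-the-envelope sketch. It assumes the common ~6·N·D training-FLOPs approximation (with N taken as the *active* parameter count), plus hypothetical values for token count, GPU throughput, utilization, and cloud pricing; none of these are official Qwen figures.

```python
# Back-of-the-envelope training cost comparison: MoE (30B total, ~3B active)
# vs. a 32B dense model. All numbers below are illustrative assumptions,
# not official Qwen figures.

def train_cost_usd(active_params_b, tokens_t, gpu_tflops=400, mfu=0.40,
                   usd_per_gpu_hour=2.0):
    """Estimate training cost using the ~6*N*D FLOPs approximation,
    where N is the number of active parameters and D the token count."""
    total_flops = 6 * (active_params_b * 1e9) * (tokens_t * 1e12)
    effective_flops_per_sec = gpu_tflops * 1e12 * mfu  # assumed per-GPU throughput
    gpu_hours = total_flops / effective_flops_per_sec / 3600
    return gpu_hours * usd_per_gpu_hour

tokens = 36  # assumed trillions of pretraining tokens, for illustration only
moe_cost = train_cost_usd(active_params_b=3, tokens_t=tokens)     # 30B-A3B: ~3B active
dense_cost = train_cost_usd(active_params_b=32, tokens_t=tokens)  # 32B dense: all params active

print(f"MoE 30B-A3B : ~${moe_cost:,.0f}")
print(f"Dense 32B   : ~${dense_cost:,.0f}")
print(f"Dense / MoE cost ratio: ~{dense_cost / moe_cost:.1f}x")
```

Under these assumptions the dense model comes out roughly 10x more expensive, simply because every token pays for all 32B parameters instead of only the ~3B active ones.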

I'm so excited and grateful to Alibaba for making such an investment for use by the community and researchers such as myself ♥️

A few months ago I wrote an opinion article showing why Qwen3-32B is the best open source model ever, and I definitely have the highest hopes for this vision-capable release.

I agree.
A few months back, after trying and failing for nearly a whole day to get MoE models to work with me instead of against me, I gave up, stuck with 32B dense, and have ignored every MoE release since.
So I'm very thankful to the Qwen team for another 32B dense release. I was getting a little worried we would never see another one.

80B-A3B is really good though; it easily beats the original Qwen3-32B.
