Maozhou Ge

Gmc2

·

GHGmc2

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

moonshotai/Kimi-K2-Thinking

liked a Space 7 days ago

HuggingFaceTB/smol-training-playbook

upvoted a collection 10 days ago

Organizations

None yet

liked a model 1 day ago

moonshotai/Kimi-K2-Thinking

Text Generation • Updated about 6 hours ago • 12.5k • 633

liked a Space 7 days ago

The Smol Training Playbook: The Secrets to Building World-Class LLMs

upvoted a collection 10 days ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 653

upvoted an article 18 days ago

Article

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

By [name missing] and 9 others • Aug 18 • 31

liked a dataset 19 days ago

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated May 8 • 3.91M • 5k • 599

upvoted a collection 19 days ago

InternVL3.5-Core

This collection includes only the InternVL3.5 checkpoints that have completed the full training pipeline (i.e., Pretraining, SFT, MPO, Cascade RL). • 30 items • Updated Sep 28 • 12

upvoted 2 collections 22 days ago

Nemotron-Pre-Training-Dataset

7 items • Updated 3 days ago • 40

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with TensorRT Model Optimizer. • 43 items • Updated 3 days ago • 50

liked a dataset 23 days ago

lmms-lab/multimodal-open-r1-8k-verified

Viewer • Updated Jan 27 • 7.69k • 4.85k • 67

upvoted an article 24 days ago

Article

Fixing Gradient Accumulation

Oct 16, 2024 • 63

liked a model 28 days ago

google/siglip2-so400m-patch14-384

Zero-Shot Image Classification • 1B • Updated Feb 21 • 463k • 62

liked a dataset 29 days ago

Salesforce/Webscale-RL

Viewer • Updated 25 days ago • 1.11M • 9.86k • 79

upvoted a paper about 1 month ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1 • 17

liked a model about 1 month ago

deepseek-ai/DeepSeek-V3.2-Exp-Base

Text Generation • 685B • Updated about 1 month ago • 584 • 42

upvoted a collection about 1 month ago

DeepSeek-V3.2

2 items • Updated Sep 29 • 442

upvoted a paper about 2 months ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 115

upvoted a collection about 2 months ago

Qwen3-VL

37 items • Updated 7 days ago • 375

liked 3 datasets about 2 months ago

Juelg/RPD-maniskill

Updated Jul 18 • 350 • 1

lmms-lab/LLaVA-OneVision-Data

Viewer • Updated May 24 • 3.94M • 20.5k • 221

open-r1/Mixture-of-Thoughts

Viewer • Updated May 26 • 699k • 5.98k • 286