
Wei Du

weidu

AI & ML interests

None yet

Recent Activity

upvoted an article 11 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

updated a Space 9 months ago

weidu/AlfredAgent

published a Space 9 months ago

weidu/AlfredAgent


Organizations

upvoted an article 11 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

30 days ago • 259

updated a Space 9 months ago

AlfredAgent

💻

published a Space 9 months ago

AlfredAgent

💻

upvoted 2 papers 10 months ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 170

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 91

liked a Space 10 months ago

The Ultra-Scale Playbook

🌌

3.61k

The ultimate guide to training LLM on large GPU Clusters

upvoted an article 10 months ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Feb 19

upvoted 2 papers over 1 year ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Paper • 2407.15841 • Published Jul 22, 2024 • 40

commented on a paper almost 2 years ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 134

upvoted a paper almost 2 years ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 134
