

zhidong-gao


AI & ML interests

None yet

Organizations

zhidong-gao's collections (11)

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Paper • 2411.06469 • Published Nov 10, 2024 • 17

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Paper • 2402.18842 • Published Feb 29, 2024 • 15

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 189
Mixture-of-Subspaces in Low-Rank Adaptation

Paper • 2406.11909 • Published Jun 16, 2024 • 3
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

Paper • 2406.17660 • Published Jun 25, 2024 • 5
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

Paper • 2407.11239 • Published Jul 15, 2024 • 8

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 91

DeepSpeak Dataset v1.0

Paper • 2408.05366 • Published Aug 9, 2024 • 14

Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

Paper • 2408.10635 • Published Aug 20, 2024 • 14
Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

Paper • 2408.09787 • Published Aug 19, 2024 • 10

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29, 2024 • 35
AnimateDiff-Lightning: Cross-Model Diffusion Distillation

Paper • 2403.12706 • Published Mar 19, 2024 • 18
Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 17

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Paper • 2402.19481 • Published Feb 29, 2024 • 22
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1, 2024 • 46
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Paper • 2410.08261 • Published Oct 10, 2024 • 52

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Paper • 2403.03100 • Published Mar 5, 2024 • 38
MooER: LLM-based Speech Recognition and Translation Models from Moore Threads

Paper • 2408.05101 • Published Aug 9, 2024 • 7

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 79
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Paper • 2406.12034 • Published Jun 17, 2024 • 16
Mixture-of-Subspaces in Low-Rank Adaptation

Paper • 2406.11909 • Published Jun 16, 2024 • 3

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

Paper • 2408.08072 • Published Aug 15, 2024 • 34
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 73
