zhidong-gao
AI & ML interests
None yet
Organizations
3D
Efficient
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 189
- Mixture-of-Subspaces in Low-Rank Adaptation
  Paper • 2406.11909 • Published • 3
- Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
  Paper • 2406.17660 • Published • 5
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
  Paper • 2407.11239 • Published • 8
Attack
dataset
Agent
Video
SD
- DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
  Paper • 2402.19481 • Published • 22
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 46
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
  Paper • 2410.08261 • Published • 52
Audio
LLMs
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 78
- Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
  Paper • 2406.12034 • Published • 16
- Mixture-of-Subspaces in Low-Rank Adaptation
  Paper • 2406.11909 • Published • 3
align
Medical