ziyuan wang
zzzac
AI & ML interests
None yet
Organizations
None yet
LLM training
- Scaling Laws for Downstream Task Performance of Large Language Models
  Paper • 2402.04177 • Published • 20
- Offline Actor-Critic Reinforcement Learning Scales to Large Models
  Paper • 2402.05546 • Published • 5
- SaulLM-7B: A pioneering Large Language Model for Law
  Paper • 2403.03883 • Published • 88
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 625
VIDEO
3D
- CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
  Paper • 2403.05034 • Published • 22
- V3D: Video Diffusion Models are Effective 3D Generators
  Paper • 2403.06738 • Published • 30
- FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
  Paper • 2403.10242 • Published • 12
NLP
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 23
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 28
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- Stealing Part of a Production Language Model
  Paper • 2403.06634 • Published • 91
TORead
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
  Paper • 2402.15627 • Published • 38
- Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
  Paper • 2402.16822 • Published • 18
- FuseChat: Knowledge Fusion of Chat Models
  Paper • 2402.16107 • Published • 40
- Multi-LoRA Composition for Image Generation
  Paper • 2402.16843 • Published • 32
speech
LLM Inference