ZHANG Mingxing

zhang-mingxing
https://madsys.cs.tsinghua.edu.cn/~zhangmx/
  • james0zan

AI & ML interests

None yet

Recent Activity

authored a paper about 2 months ago
Efficient and Economic Large Language Model Inference with Attention Offloading
authored a paper about 2 months ago
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
authored a paper about 2 months ago
MoBA: Mixture of Block Attention for Long-Context LLMs

Organizations

KVCache.ai

authored 4 papers about 2 months ago

Efficient and Economic Large Language Model Inference with Attention Offloading

Paper • 2405.01814 • Published May 3, 2024

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Paper • 2407.00079 • Published Jun 24, 2024 • 5

MoBA: Mixture of Block Attention for Long-Context LLMs

Paper • 2502.13189 • Published Feb 18 • 17

RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference

Paper • 2505.02922 • Published May 5 • 28