---
language:
- multilingual
license: mit
license_link: https://huggingface.co/moonshotai/Kimi-Dev-72B/blob/main/LICENSE.md
library_name: transformers
pipeline_tag: text-generation
tags:
- GPTQ
- Int8
- vLLM
- code
- swebench
- software
- issue-resolving
base_model:
- moonshotai/Kimi-Dev-72B
base_model_relation: quantized
---
# Kimi-Dev-72B-GPTQ-Int8
Base model: [moonshotai/Kimi-Dev-72B](https://huggingface.co/moonshotai/Kimi-Dev-72B)

<i>Calibrated on the https://huggingface.co/datasets/timdettmers/openassistant-guanaco/blob/main/openassistant_best_replies_eval.jsonl dataset.</i>
<br>
<i>The quantization configuration is as follows:</i>

```python
from gptqmodel import QuantizeConfig  # assumes the GPTQModel library

quant_config = QuantizeConfig(bits=8, group_size=128, desc_act=False)
```
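As an illustrative sketch (the helper name is hypothetical), the calibration samples could be read from the JSONL file above; this assumes each line is a JSON object with a `text` field, as in the openassistant-guanaco eval split:

```python
import json

def load_calibration_texts(path, limit=None):
    """Read a JSONL calibration file and return the raw text samples.

    Assumes each non-empty line is a JSON object with a "text" field
    (as in the openassistant-guanaco eval split referenced above).
    """
    texts = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            texts.append(json.loads(line)["text"])
            if limit is not None and len(texts) >= limit:
                break
    return texts
```

The resulting list of strings is what a GPTQ quantizer would tokenize and use as calibration data.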

### 【vLLM Startup Command】
```shell
vllm serve JunHowie/Kimi-Dev-72B-GPTQ-Int8
```
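Once the server is up, vLLM exposes an OpenAI-compatible API (by default at `http://localhost:8000`). A minimal sketch of a chat request using only the standard library; the helper names and the port are illustrative assumptions:

```python
import json
import urllib.request

def build_chat_request(model, user_prompt, max_tokens=512):
    # OpenAI-compatible Chat Completions payload, as served by vLLM.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
    }

def post_chat(payload, base_url="http://localhost:8000"):
    # POST the payload to the local vLLM server (assumes the default port).
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `post_chat(build_chat_request("JunHowie/Kimi-Dev-72B-GPTQ-Int8", "Hello"))` returns the usual `choices[0].message.content` structure.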


### 【Model Download】

```python
from huggingface_hub import snapshot_download
snapshot_download('JunHowie/Kimi-Dev-72B-GPTQ-Int8', cache_dir="your_local_path")
```
43
+
44
+ ### 【Overview】
45
+ <!-- # Kimi-Dev -->
46
+
47
+ <div align="center">
48
+ <img src="./assets/main_logo.png" alt="Kimi Logo" width="400" />
49
+ <h2><a href="https://moonshotai.github.io/Kimi-Dev/">
50
+ Introducing Kimi-Dev: <br>A Strong and Open-source Coding LLM for Issue Resolution</a></h2>
51
+ </a></h2>
52
+ <b>Kimi-Dev Team</b>
53
+ <br>
54
+
55
+ </div>
56
+ <div align="center">
57
+ <a href="">
58
+ <b>📄 Tech Report (Coming soon...)</b>
59
+ </a> &nbsp;|&nbsp;
60
+ <a href="https://github.com/MoonshotAI/Kimi-Dev">
61
+ <b>📄 Github</b>
62
+ </a> &nbsp;
63
+ </div>
64
+
65
+ <br>
66
+ <br>
67
+
68
+ <!-- https://github.com/MoonshotAI/Kimi-Dev -->
69
+
We introduce Kimi-Dev-72B, our new open-source coding LLM for software engineering tasks. Kimi-Dev-72B achieves a new state-of-the-art on SWE-bench Verified among open-source models.

- Kimi-Dev-72B achieves 60.4% on SWE-bench Verified, surpassing the runner-up and setting a new state-of-the-art among open-source models.

- Kimi-Dev-72B is optimized via large-scale reinforcement learning. It autonomously patches real repositories in Docker and gains rewards only when the entire test suite passes. This ensures correct and robust solutions, aligning with real-world development standards.

- Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub. We welcome developers and researchers to explore its capabilities and contribute to its development.

<div align="center">
  <img src="./assets/open_performance_white.png" alt="Performance chart" width="600" />
  <p><b>Performance of Open-source Models on SWE-bench Verified.</b></p>
</div>

## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "moonshotai/Kimi-Dev-72B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
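The `generated_ids` list comprehension above strips the echoed prompt from each sequence before decoding, since `model.generate` returns prompt plus continuation. The same step on plain token-id lists, as an illustrative sketch:

```python
def strip_prompt_tokens(input_ids, output_ids):
    """Drop the echoed prompt from each generated sequence.

    Each output sequence begins with its prompt tokens, so slicing off
    len(prompt) tokens leaves only the newly generated part.
    """
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]
```

Decoding only these trimmed sequences yields the assistant's reply without the prompt text.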

## Citation
```bibtex
@misc{kimi_dev_72b_2025,
  title  = {Introducing Kimi-Dev: A Strong and Open-source Coding LLM for Issue Resolution},
  author = {{Kimi-Dev Team}},
  year   = {2025},
  month  = {June},
  url    = {https://www.moonshot.cn/Kimi-Dev}
}
```