# Belt_Road_Hungarian

## Model Description

This model is a conversational, instruction-following large language model created by applying supervised fine-tuning (SFT) to the open-source **Qwen2.5-72B-Instruct** model.

-----

## Key Features & Use Cases

* **Exceptional Hungarian Language Proficiency:** The model has been deeply optimized for Hungarian, demonstrating excellent fluency, accuracy, and a strong understanding of cultural context in conversations.
* **Multilingual Translation and Dialogue:** With extensive training data that includes Hungarian, English, and Chinese content, the model excels in translation, multilingual Q&A, and cross-language communication.
* **Advanced Instruction Following:** The model shows a strong ability to comprehend and execute complex instructions, including those with multiple steps and specific constraints.
* **Creative Content Generation:** It is highly suitable for a wide range of creative tasks, such as writing articles, reports, scripts, and marketing copy.

-----

## System Requirements

### Hardware

* **GPU VRAM:** For BF16/FP16 inference (recommended), at least **4 x NVIDIA A100 (80GB)** GPUs are required. The model weights alone are approximately 136GB, so `device_map="auto"` is necessary to distribute them across multiple cards (a quick capacity check is sketched after this list).
* **System RAM:** A minimum of **200GB** is recommended.

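Before downloading the weights, it can help to confirm that enough GPU memory is actually visible to PyTorch. The sketch below only uses standard `torch` CUDA queries and assumes the GPUs are already installed and visible; the 136GB figure is the weight size quoted above, not a measured value.

```python
# Report the GPUs PyTorch can see and their total memory.
import torch

total_vram_gb = 0.0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    vram_gb = props.total_memory / 1024**3
    total_vram_gb += vram_gb
    print(f"GPU {i}: {props.name}, {vram_gb:.0f} GB")

# The BF16/FP16 weights alone are roughly 136 GB; leave headroom for activations and the KV cache.
print(f"Total visible VRAM: {total_vram_gb:.0f} GB")
```
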
### Software

* **Python:** Version 3.10 or higher.
* **Key Libraries** (a minimal version check is sketched after this list):
  * `torch`: 2.1 or higher
  * `transformers`: 4.41.0 or higher
  * `accelerate`: 1.7.0 or higher
  * `einops`: 0.8.1 or higher
  * `sentencepiece`: 0.2.0 or higher

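One way to confirm that an existing environment meets these minimums is the snippet below. The package names and version floors are copied from the list above; the `pip install` command in the comment is a generic suggestion rather than a pinned requirements file.

```python
# Verify the listed dependencies; install or upgrade with, for example:
#   pip install -U torch transformers accelerate einops sentencepiece
from importlib.metadata import PackageNotFoundError, version

minimums = {
    "torch": "2.1",
    "transformers": "4.41.0",
    "accelerate": "1.7.0",
    "einops": "0.8.1",
    "sentencepiece": "0.2.0",
}

for package, minimum in minimums.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    print(f"{package}: {installed} (requires >= {minimum})")
```
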
-----

## How to Use

The recommended way to load and run the model is with the **`transformers`** library.

```python
# Example code snippet for inference using transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Specify the model path or your Hugging Face Hub repository
model_path = "your-huggingface-repo/your-model-name"  # e.g., "your-user/your-qwen-model"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Load the model with device_map to distribute it across available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # or torch.float16
    device_map="auto"
)

# Example conversation prompt
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, can you translate 'hello' to Hungarian and Chinese?"}
]

# Apply the chat template and generate a response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
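
For interactive use, the reply can also be streamed as it is generated instead of printed at the end. The sketch below uses `transformers.TextStreamer` and assumes the `model`, `tokenizer`, and `model_inputs` objects from the block above, keeping the same sampling settings.

```python
# Stream the decoded reply to stdout token by token.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    streamer=streamer,
)
```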