EEVE-Korean-Custom-10.8B

🇰🇷 Korean Custom Fine-tuning - Responds politely in formal Korean even to casual questions

Model Overview

This model is based on EEVE-Korean-Instruct-10.8B-v1.0, fine-tuned with high-quality Korean instruction data using LoRA, and subsequently merged into a standalone model.

Key Features:

  • High-quality Korean language processing, trained on 100K+ instruction samples
  • Extended context support up to 8K tokens
  • Bilingual capabilities supporting both Korean and English

Quick Start

Installation:

pip install transformers torch accelerate

Basic Usage:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model (no PEFT required)
model = AutoModelForCausalLM.from_pretrained(
    "MyeongHo0621/eeve-vss-smh",  
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("MyeongHo0621/eeve-vss-smh")

# Prompt template (EEVE format)
def create_prompt(user_input):
    return f"""A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {user_input}
Assistant: """

# Generate response
user_input = "Implement Fibonacci sequence in Python"
prompt = create_prompt(user_input)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.3,
    top_p=0.85,
    repetition_penalty=1.0,
    do_sample=True
)

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)

Streaming Generation:

from transformers import TextIteratorStreamer
from threading import Thread

streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
generation_kwargs = {
    **inputs,  # reuses the tokenized prompt from the example above
    "max_new_tokens": 512,
    "temperature": 0.3,
    "top_p": 0.85,
    "do_sample": True,
    "streamer": streamer
}

thread = Thread(target=model.generate, kwargs=generation_kwargs)
thread.start()

for text in streamer:
    print(text, end="", flush=True)
thread.join()

Training Details

Dataset Configuration:

  • Size: Approximately 100,000 samples
  • Sources: Combination of high-quality Korean instruction datasets including KoAlpaca, Ko-Ultrachat, KoInstruct, Kullm-v2, Smol Korean Talk, and Korean Wiki QA
  • Preprocessing: Length filtering, deduplication, language verification, and special character removal (see the sketch below)
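
The preprocessing pipeline is not fully specified in this card, but the filtering steps can be sketched with simple heuristics. The length thresholds, the Hangul check, and the exact-hash deduplication below are illustrative assumptions, not the actual pipeline:

import re

def keep_sample(text, seen_hashes, min_len=16, max_len=4096):
    """Illustrative filter combining the steps listed above."""
    # Length filtering (thresholds are assumed)
    if not (min_len <= len(text) <= max_len):
        return False
    # Deduplication via exact-match hashing
    h = hash(text)
    if h in seen_hashes:
        return False
    seen_hashes.add(h)
    # Language verification: require at least one Hangul syllable
    if re.search(r"[\uac00-\ud7a3]", text) is None:
        return False
    return True  # special-character removal would follow as a separate transform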

LoRA Configuration:

r: 128                    # Higher rank for stronger learning
lora_alpha: 256           # alpha = 2 * r
lora_dropout: 0.0         # No dropout (Unsloth optimization)
target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
bias: none
task_type: CAUSAL_LM
use_rslora: false
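
The YAML above maps directly onto a PEFT LoraConfig. The equivalent Python is sketched below for reference; the actual run used Unsloth's PEFT wrapper rather than this exact call:

from peft import LoraConfig

lora_config = LoraConfig(
    r=128,
    lora_alpha=256,   # alpha = 2 * r
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
    use_rslora=False,
)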

Training Hyperparameters:

| Parameter | Value | Description |
|---|---|---|
| Framework | Unsloth | 2-5x faster than standard transformers |
| Epochs | 3 (stopped at 1.94) | Early stopping at optimal point |
| Batch Size | 8 per device | Maximizes H100E memory |
| Gradient Accumulation | 2 | Effective batch size of 16 |
| Learning Rate | 1e-4 | Balanced learning rate |
| Max Sequence Length | 4096 | Extended context support |
| Warmup Ratio | 0.05 | Quick warmup |
| Weight Decay | 0.01 | Regularization |
| Optimizer | AdamW 8-bit (Unsloth) | Memory optimized |
| LR Scheduler | Cosine | Smooth decay |
| Gradient Checkpointing | Unsloth optimized | Memory efficient |
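
For orientation, here is how the table might translate into Hugging Face TrainingArguments. This is a hedged reconstruction: the actual run used Unsloth's trainer wrappers, and the 8-bit optimizer is expressed here via bitsandbytes:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,   # effective batch size 8 * 2 = 16
    learning_rate=1e-4,
    warmup_ratio=0.05,
    weight_decay=0.01,
    optim="adamw_bnb_8bit",          # bitsandbytes 8-bit AdamW
    lr_scheduler_type="cosine",
    bf16=True,
    gradient_checkpointing=True,
)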

Checkpoint Selection Strategy:

The model was trained for 3 epochs, but we selected checkpoint-6250 (Epoch 1.94) based on evaluation loss analysis:

| Checkpoint | Epoch | Training Loss | Eval Loss | Status |
|---|---|---|---|---|
| 6250 | 1.94 | 0.9986 | 1.4604 | ✅ Selected (best) |
| 6500 | 2.02 | 0.561 | 1.5866 | ❌ Overfitting |

Key Insight: Training loss continued to decrease, but evaluation loss started increasing after checkpoint-6250, indicating overfitting. We selected the checkpoint with the lowest evaluation loss for optimal generalization.
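
This selection rule is easy to automate. The sketch below scans a Trainer's trainer_state.json for the logged entry with the lowest evaluation loss; the checkpoint path is a hypothetical example:

import json

with open("outputs/checkpoint-6500/trainer_state.json") as f:
    state = json.load(f)

# log_history holds both training and evaluation records;
# keep only the entries that carry an eval_loss.
eval_logs = [e for e in state["log_history"] if "eval_loss" in e]
best = min(eval_logs, key=lambda e: e["eval_loss"])
print(f"Best step: {best['step']} (eval_loss={best['eval_loss']:.4f})")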

Memory Optimization:

  • Full precision training (no 4-bit quantization needed on H100E)
  • Unsloth gradient checkpointing
  • BF16 training optimized for H100E
  • Peak VRAM usage: ~26GB during training

Training Infrastructure:

  • GPU: NVIDIA H100 80GB HBM3
  • Framework: Unsloth + PyTorch 2.6, Transformers 4.46.3
  • Training time: ~3 hours (6,250 steps with Unsloth acceleration)
  • Final checkpoint: Step 6250 (Epoch 1.94), merged to full model

Performance Examples

Casual to Formal Korean Conversion:

Input (casual Korean): "WMS가 뭐야?" ("What's WMS?")

Output (formal Korean): "WMS는 Warehouse Management System의 약자로, 창고 관리 시스템을 의미합니다. 재고 추적, 입출고 관리, 피킹, 패킹 등의 물류 프로세스를 자동화하고 최적화하는 소프트웨어 시스템입니다. 효율적인 창고 운영을 위해 사용되며, 실시간 재고 가시성과 작업 생산성 향상을 제공합니다." (A polite, formal explanation: WMS stands for Warehouse Management System, a software system that automates and optimizes logistics processes such as inventory tracking, inbound/outbound management, picking, and packing, providing real-time inventory visibility and improved productivity.)

Code Generation:

Input: "파이썬으로 리스트를 역순으로 만들어줘" ("Reverse a list in Python")

Output: Provides three different Python methods for list reversal with detailed explanations of each approach, including reverse() method, slicing, and reversed() function, along with their differences.

Prompt Template

This model uses the standard EEVE template format:

template = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
Human: {user_message}
Assistant: """

Using this exact template is essential for optimal performance.
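
The card documents only this single-turn format. For multi-turn chat, a natural extension is to repeat the Human/Assistant pairs before the final turn; note this is an assumption, not a documented chat format:

def create_chat_prompt(history, user_message):
    """history: list of (user, assistant) string pairs from earlier turns."""
    prompt = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.\n")
    for user, assistant in history:
        prompt += f"Human: {user}\nAssistant: {assistant}\n"
    prompt += f"Human: {user_message}\nAssistant: "
    return prompt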

Recommended Generation Parameters

generation_config = {
    "max_new_tokens": 512,
    "temperature": 0.3,
    "top_p": 0.85,
    "repetition_penalty": 1.0,
    "do_sample": True,
    "pad_token_id": tokenizer.pad_token_id,
    "eos_token_id": tokenizer.eos_token_id,
}
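
These settings can be splatted directly into generate(). The snippet below reuses model, tokenizer, and create_prompt from the Quick Start section:

prompt = create_prompt("Explain the difference between a list and a tuple in Python")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, **generation_config)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))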

Parameter Tuning Guide:

| Use Case | Temperature | Top P | Repetition Penalty | Notes |
|---|---|---|---|---|
| Precise answers | 0.1-0.3 | 0.8-0.9 | 1.0 | Best for factual Q&A |
| Balanced responses | 0.5-0.7 | 0.85-0.95 | 1.0 | Recommended default |
| Creative outputs | 0.8-1.0 | 0.9-1.0 | 1.05-1.1 | For creative writing |

Important Notes on Repetition Penalty:

  • Default (1.0): No penalty, natural repetition allowed
  • Light (1.05-1.1): Reduces minor repetition in creative tasks
  • Moderate (1.1-1.2): Good for reducing repetitive phrases
  • Strong (1.2+): May affect output quality, use with caution

โš ๏ธ Warning: Setting repetition_penalty > 1.2 can degrade Korean text quality. For this model, 1.0-1.1 is optimal for most use cases.

Advanced Configuration Example:

# For code generation
code_gen_config = {
    "max_new_tokens": 1024,
    "temperature": 0.2,
    "top_p": 0.9,
    "repetition_penalty": 1.0,
    "do_sample": True,
}

# For conversational responses
conversation_config = {
    "max_new_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.05,
    "do_sample": True,
}

# For precise factual answers
factual_config = {
    "max_new_tokens": 256,
    "temperature": 0.1,
    "top_p": 0.85,
    "repetition_penalty": 1.0,
    "do_sample": True,
}

Limitations

This model has been released for research and educational purposes; commercial use requires compliance with the CC-BY-NC-SA-4.0 license. While optimized for Korean, the model provides only partial support for other languages.

License

  • Model license: CC-BY-NC-SA-4.0
  • Base model: subject to the EEVE-Korean-Instruct-10.8B-v1.0 license
  • Commercial use: restricted (see license terms)

Citation

@misc{eeve-vss-smh-2025,
  author = {MyeongHo0621},
  title = {EEVE-VSS-SMH: Korean Custom Fine-tuned Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MyeongHo0621/eeve-vss-smh}},
  note = {LoRA fine-tuned and merged model based on EEVE-Korean-Instruct-10.8B-v1.0}
}

Acknowledgments

  • Base Model: Yanolja - EEVE-Korean-Instruct-10.8B-v1.0
  • Training Infrastructure: KT Cloud H100E
  • Framework: Hugging Face Transformers, PEFT

Contact



Last Updated: 2025-10-12
Checkpoint: 6250 steps (Epoch 1.94)
Training Method: Unsloth (2-5x faster)
Selection Criteria: Lowest Evaluation Loss (1.4604)
Status: Merged & Ready for Deployment
