add robbiemu submission for unit1

#36
by robbiemu - opened

train:

hf jobs uv run \
  --flavor "a10g-large" \
  --timeout "72h" \
  --secrets HF_TOKEN \
  "https://raw.githubusercontent.com/huggingface/trl/main/trl/scripts/sft.py" \
  --model_name_or_path "HuggingFaceTB/SmolLM3-3B-Base" \
  --dataset_name "${DATASET_NAME}" \
  --learning_rate 2e-5 \
  --per_device_train_batch_size 4 \
  --gradient_accumulation_steps 4 \
  --max_steps 1500 \
  --eval_steps 100 \
  --save_steps 250 \
  --seed 42 \
  --output_dir "outputs" \
  --hub_model_id "${HUB_MODEL_ID}" \
  --report_to none \
  --use_peft \
  --lora_r 64 \
  --lora_alpha 128 \
  --lora_dropout 0.05 \
  --lora_target_modules "q_proj" "v_proj" \
  --bf16 True \
  --push_to_hub
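
For reference, the effective batch size these flags imply is easy to sanity-check (assuming the a10g-large flavor is a single GPU — this snippet is not part of the submitted scripts):

```python
# Quick arithmetic check of the training flags above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: a10g-large runs on a single GPU
max_steps = 1500

# Effective batch size per optimizer step, and total examples seen over the run.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
examples_seen = effective_batch_size * max_steps

print(effective_batch_size)  # 16
print(examples_seen)         # 24000
```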

evaluate:

hf jobs uv run \
  --flavor "a10g-large" \
  --timeout "60m" \
  --with "git+https://github.com/huggingface/lighteval@main#egg=lighteval[vllm,gsm8k]" \
  --with emoji \
  --secrets HF_TOKEN \
  lighteval vllm \
  "model_name=robbiemu/smollm3-sft-math-tuned,revision=main" \
  "lighteval|gsm8k|0" \
  --push-to-hub \
  --results-org "robbiemu" \
  --results-path-template "{org}/details_{org}__{model}_private"

This project was completed using Hugging Face Jobs, as covered in the course. The workflow was broken down into four distinct steps:

  • Dataset Preparation: Pre-processing the meta-math/MetaMathQA dataset into the required format.
  • Training: Fine-tuning the SmolLM3-3B-Base model on the formatted dataset.
  • Merging: Merging the resulting LoRA adapter with the base model to create the final, standalone model.
  • Evaluation: Running the merged model against the gsm8k benchmark using lighteval.

Each step was implemented as its own script and published with the model.

Logs from each script are included below:

step 1:

hf download meta-math/MetaMathQA --repo-type dataset
Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]
Downloading 'MetaMathQA-395K.json' to '/Users/macdev/.cache/huggingface/hub/datasets--meta-math--MetaMathQA/blobs/fb39a5d8c05c042ece92eae37dfd5ea414a5979df2bf3ad3b86411bef8205725.incomplete'
Downloading '.gitattributes' to '/Users/macdev/.cache/huggingface/hub/datasets--meta-math--MetaMathQA/blobs/b27e38260e3bcf0169f26ca70c4dc1c1bab8ed16.incomplete'
.gitattributes: 2.47kB [00:00, 11.2MB/s]
Download complete. Moving file to /Users/macdev/.cache/huggingface/hub/datasets--meta-math--MetaMathQA/blobs/b27e38260e3bcf0169f26ca70c4dc1c1bab8ed16
MetaMathQA-395K.json: 100%|██████████| 396M/396M [00:35<00:00, 11.2MB/s]
Download complete. Moving file to /Users/macdev/.cache/huggingface/hub/datasets--meta-math--MetaMathQA/blobs/fb39a5d8c05c042ece92eae37dfd5ea414a5979df2bf3ad3b86411bef8205725
Fetching 3 files: 100%|██████████| 3/3 [00:36<00:00, 12.17s/it]
/Users/macdev/.cache/huggingface/hub/datasets--meta-math--MetaMathQA/snapshots/aa4f34d3d2d3231299b5b03d9b3e5a20da45aa18

python3 format_dataset.py
Loading original dataset 'meta-math/MetaMathQA'...
Generating train split: 100%|██████████| 395000/395000 [00:02<00:00, 188589.63 examples/s]
Formatting dataset...
Map: 100%|██████████| 395000/395000 [00:05<00:00, 71473.59 examples/s]
Pushing formatted dataset to 'robbiemu/MetaMathQA-formatted'...
Creating parquet from Arrow format: 100%|██████████| 4/4 [00:00<00:00,  5.62ba/s]
Processing Files (1 / 1)      : 100%|██████████|  173MB /  173MB, 2.77MB/s
New Data Upload               : 100%|██████████|  173MB /  173MB, 2.77MB/s
Uploading the dataset shards: 100%|██████████| 1/1 [00:56<00:00, 56.55s/ shards]

✅ Success! Your formatted dataset is ready on the Hub.
You can now update your train.sh script.
You can now update your train.sh script.
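
The contents of format_dataset.py aren't reproduced above; here is a minimal sketch of what that formatting step likely looks like, assuming MetaMathQA's `query`/`response` columns are mapped into the `messages` chat format that TRL's SFT script consumes (the function name and target schema are my assumptions, not the submitted script):

```python
def format_example(example):
    """Map one MetaMathQA row (query/response) to chat-style messages."""
    return {
        "messages": [
            {"role": "user", "content": example["query"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

if __name__ == "__main__":
    # The actual script then loads, maps, and pushes the dataset
    # (requires the `datasets` package and an HF_TOKEN with write access):
    from datasets import load_dataset

    ds = load_dataset("meta-math/MetaMathQA", split="train")
    ds = ds.map(format_example, remove_columns=ds.column_names)
    ds.push_to_hub("robbiemu/MetaMathQA-formatted")
```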

step 2:

./train.sh
Submitting training job:
  Hub model id : robbiemu/smollm3-sft-math-tuned
  Base model   : HuggingFaceTB/SmolLM3-3B-Base
  Dataset      : robbiemu/MetaMathQA-formatted
  Flavor       : a10g-large
  Max steps    : 1000
/Users/Shared/Public/Huggingface/fine_tuning_course/.venv/lib/python3.12/site-packages/huggingface_hub/utils/_experimental.py:60: UserWarning: 'HfApi.run_uv_job' is experimental and might be subject to breaking changes in the future without prior notice. You can disable this warning by setting `HF_HUB_DISABLE_EXPERIMENTAL_WARNING=1` as environment variable.
  warnings.warn(
Job started with ID: 68f3c61a4f84313f47b7d8d0
View at: https://huggingface.co/jobs/robbiemu/68f3c61a4f84313f47b7d8d0
...
Training job queued. Monitor progress on the Hugging Face Hub Jobs dashboard.

step 3:

./merge.sh
Submitting LoRA merge job:
  Base model      : HuggingFaceTB/SmolLM3-3B-Base
  Adapter repo    : robbiemu/smollm3-sft-math-tuned
  Final Hub repo  : robbiemu/smollm3-sft-math-tuned
  Hardware        : a10g-large
/Users/Shared/Public/Huggingface/fine_tuning_course/.venv/lib/python3.12/site-packages/huggingface_hub/utils/_experimental.py:60: UserWarning: 'HfApi.run_uv_job' is experimental and might be subject to breaking changes in the future without prior notice. You can disable this warning by setting `HF_HUB_DISABLE_EXPERIMENTAL_WARNING=1` as environment variable.
  warnings.warn(
Job started with ID: 68f4023a4f84313f47b7d8db
View at: https://huggingface.co/jobs/robbiemu/68f4023a4f84313f47b7d8db
...
Loading base model: HuggingFaceTB/SmolLM3-3B-Base
Fetching 2 files: 100%|██████████| 2/2 [00:05<00:00,  2.53s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00,  2.69it/s]
Loading adapter: robbiemu/smollm3-sft-math-tuned
Merging adapter weights...
Saving merged model locally to 'merged-model'
Uploading merged model to the Hub at robbiemu/smollm3-sft-math-tuned
Processing Files (3 / 3)      : 100%|██████████| 6.17GB / 6.17GB,  177MB/s
New Data Upload               : 100%|██████████| 5.62GB / 5.62GB,  177MB/s
  /merged-model/tokenizer.json: 100%|██████████| 17.2MB / 17.2MB
  ...0001-of-00002.safetensors: 100%|██████████| 4.97GB / 4.97GB
  ...0002-of-00002.safetensors: 100%|██████████| 1.18GB / 1.18GB
Job complete.
Merge job queued. Check your model repo 'robbiemu/smollm3-sft-math-tuned' for the merged files upon completion.
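
Numerically, the "Merging adapter weights..." step folds the LoRA update into each target module as W' = W + (alpha/r) · B · A; in practice peft's `merge_and_unload()` does this. A tiny pure-Python illustration with made-up 2×2 weights (only the alpha/r scaling ratio of 2.0 matches the training flags, lora_alpha 128 / lora_r 64):

```python
# LoRA merge arithmetic: W' = W + (alpha / r) * (B @ A)
r, alpha = 2, 4        # same alpha/r ratio (2.0) as lora_r 64 / lora_alpha 128
scale = alpha / r

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (made-up identity)
B = [[1.0, 0.0], [0.0, 1.0]]   # LoRA up-projection   (d x r)
A = [[0.1, 0.0], [0.0, 0.1]]   # LoRA down-projection (r x d)

def matmul(X, Y):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

delta = matmul(B, A)
W_merged = [[w + scale * dv for w, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
print(W_merged)  # [[1.2, 0.0], [0.0, 1.2]]
```

After the merge, the adapter is no longer needed at inference time, which is why the job uploads full safetensors shards rather than adapter files.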

step 4:

MODEL_NAME=smollm3-sft-math-tuned HUB_MODEL_REVISION=main ./evaluate.sh
Submitting evaluation job:
  Model          : robbiemu/smollm3-sft-math-tuned
  Results template : {org}/details_{org}__{model}_private
  Hardware       : a10g-large
  Task spec      : lighteval|gsm8k|0
/Users/Shared/Public/Huggingface/fine_tuning_course/.venv/lib/python3.12/site-packages/huggingface_hub/utils/_experimental.py:60: UserWarning: 'HfApi.run_uv_job' is experimental and might be subject to breaking changes in the future without prior notice. You can disable this warning by setting `HF_HUB_DISABLE_EXPERIMENTAL_WARNING=1` as environment variable.
  warnings.warn(
Job started with ID: 68f406b98243113ad33dfc01
View at: https://huggingface.co/jobs/robbiemu/68f406b98243113ad33dfc01
...
Evaluation job queued. Check your datasets under robbiemu/ for the 'details_*' entry once complete.
Recommendation: apply self-consistency by sampling multiple completions locally and majority-voting the final answer.
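
That self-consistency recommendation can be sketched in a few lines: sample several completions, extract each final answer (GSM8K solutions mark it with a trailing "#### <answer>" line), and majority-vote. The regex and function names here are illustrative, not from the submitted scripts:

```python
import re
from collections import Counter

def extract_answer(completion: str):
    """Pull the final numeric answer from a GSM8K-style '#### 42' marker."""
    m = re.search(r"####\s*(-?[\d,\.]+)", completion)
    return m.group(1).replace(",", "") if m else None

def majority_vote(completions):
    """Return the most common extracted answer across sampled completions."""
    answers = [a for a in (extract_answer(c) for c in completions) if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None

samples = [
    "Natalia sold 48 clips in April... #### 72",
    "Working step by step... #### 72",
    "Alternatively... #### 68",
]
print(majority_vote(samples))  # 72
```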
