SmolLM2-FT-MyDataset

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M on an unknown dataset. It achieves the following results on the evaluation set at the final training step (step 1000):

  • Loss: 3.5380

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 3
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 1000
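
The relationship between these values can be sketched in a few lines of plain Python. This is a minimal illustration, not the training script used for this model: it assumes a single device and zero warmup steps (the card lists neither), and the helper names `total_train_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 3
TRAINING_STEPS = 1000

# Assumptions (not stated in the card): one device, no warmup.
NUM_DEVICES = 1
WARMUP_STEPS = 0


def total_train_batch_size() -> int:
    """Number of examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 250, 500, 1000):
        print(step, linear_lr(step))
```

Under these assumptions the effective batch size is 1 × 3 = 3, matching total_train_batch_size, and the learning rate falls from 5e-05 at step 0 to zero at step 1000.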

Training results

Training Loss   Epoch     Step   Validation Loss
1.3859          1.1538    50     1.8652
1.2355          2.3077    100    1.9235
1.1499          3.4615    150    1.9949
1.0908          4.6154    200    2.0660
0.9575          5.7692    250    2.1495
0.8631          6.9231    300    2.2484
0.7775          8.0769    350    2.3646
0.6162          9.2308    400    2.5319
0.5879          10.3846   450    2.6362
0.5224          11.5385   500    2.7548
0.4339          12.6923   550    2.8590
0.4305          13.8462   600    2.9796
0.3826          15.0000   650    3.0848
0.3383          16.1538   700    3.2293
0.2905          17.3077   750    3.3046
0.2581          18.4615   800    3.3902
0.2367          19.6154   850    3.4507
0.2223          20.7692   900    3.4877
0.2136          21.9231   950    3.5117
0.2170          23.0769   1000   3.5380
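
The table above can be inspected programmatically, for example to find the evaluation step with the lowest validation loss. The sketch below uses only the numbers in the table; the helper names are hypothetical, and the training-set size it derives is an inference from the Epoch and Step columns that assumes 3 examples per optimizer step on a single device.

```python
# (step, epoch, training_loss, validation_loss) rows copied from the table above.
ROWS = [
    (50, 1.1538, 1.3859, 1.8652),
    (100, 2.3077, 1.2355, 1.9235),
    (150, 3.4615, 1.1499, 1.9949),
    (200, 4.6154, 1.0908, 2.0660),
    (250, 5.7692, 0.9575, 2.1495),
    (300, 6.9231, 0.8631, 2.2484),
    (350, 8.0769, 0.7775, 2.3646),
    (400, 9.2308, 0.6162, 2.5319),
    (450, 10.3846, 0.5879, 2.6362),
    (500, 11.5385, 0.5224, 2.7548),
    (550, 12.6923, 0.4339, 2.8590),
    (600, 13.8462, 0.4305, 2.9796),
    (650, 15.0, 0.3826, 3.0848),
    (700, 16.1538, 0.3383, 3.2293),
    (750, 17.3077, 0.2905, 3.3046),
    (800, 18.4615, 0.2581, 3.3902),
    (850, 19.6154, 0.2367, 3.4507),
    (900, 20.7692, 0.2223, 3.4877),
    (950, 21.9231, 0.2136, 3.5117),
    (1000, 23.0769, 0.217, 3.5380),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def implied_train_examples(row, total_batch_size=3):
    """Estimate training-set size from one row: examples seen divided by epochs elapsed."""
    step, epoch, _, _ = row
    return round(step * total_batch_size / epoch)


if __name__ == "__main__":
    step, epoch, train_loss, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss} at step {step} (epoch {epoch})")
    final = ROWS[-1]
    print(f"final train/validation gap: {final[3] - final[2]:.4f}")
    print("implied training examples:", implied_train_examples(ROWS[0]))
```

On these numbers the lowest validation loss is at the first evaluation (step 50), and validation loss rises at every later evaluation while training loss falls, so an earlier checkpoint may generalize better than the final one.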

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.8.0+cu126
  • Datasets 4.2.0
  • Tokenizers 0.19.1