---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-14B-Base
tags:
- generated_from_trainer
datasets:
- winglian/cuda-engineer-augment-v4-filtered
- axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered
model-index:
- name: outputs/out
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.10.0.dev0`
```yaml
base_model: Qwen/Qwen3-14B-Base

plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true

chat_template_jinja: "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set content = message.content %}\n {%- set reasoning_content = '' %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}"

datasets:
  - path: winglian/cuda-engineer-augment-v4-filtered
    type: chat_template
    split: train
    # split_thinking: true
    eot_tokens: ["<|im_end|>"]
  - path: axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered
    type: chat_template
    split: train
    # split_thinking: true
    eot_tokens: ["<|im_end|>"]

dataset_prepared_path: last_run_prepared
val_set_size: 0.005
output_dir: ./outputs/out
save_only_model: true

sequence_len: 16384
sample_packing: true
pad_to_sequence_len: true

wandb_project: qwen3-14b-grpo-triton
wandb_entity: axolotl-ai
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch_fused
max_grad_norm: 0.1
neftune_noise_alpha: 10
lr_scheduler: cosine
learning_rate: 1e-5
bf16: true
tf32: true

gradient_checkpointing: offload
gradient_checkpointing_kwargs:
  use_reentrant: false
logging_steps: 1
flash_attention: true

warmup_steps: 100
evals_per_epoch: 5
saves_per_epoch: 1
weight_decay: 0.01
deepspeed: deepspeed_configs/zero1.json
```

</details>
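The `chat_template_jinja` above bakes a ChatML-style, tool-aware prompt format into the trained tokenizer, with `eot_tokens` marking `<|im_end|>` as the end of each turn. As an illustration (not part of the generated card), the sketch below previews the rendered prompt; `outputs/out` mirrors the config's `output_dir` and is only a placeholder for wherever the checkpoint is actually published.

```python
# A minimal sketch, assuming the fine-tuned tokenizer (with the template
# above baked in) is available at the placeholder path "outputs/out".
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("outputs/out")

messages = [
    {"role": "user", "content": "Convert torch.softmax to a Triton kernel."},
]

# add_generation_prompt=True appends "<|im_start|>assistant\n" so the model
# begins an assistant turn; "<|im_end|>" then closes each completed turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```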

# outputs/out

This model is a fine-tuned version of [Qwen/Qwen3-14B-Base](https://huggingface.co/Qwen/Qwen3-14B-Base) on the winglian/cuda-engineer-augment-v4-filtered and the axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered datasets.
It achieves the following results on the evaluation set:
- Loss: 0.2262

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3.0

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4626        | 0.0056 | 1    | 0.4989          |
| 0.3018        | 0.2    | 36   | 0.3577          |
| 0.2528        | 0.4    | 72   | 0.2954          |
| 0.2273        | 0.6    | 108  | 0.2686          |
| 0.2238        | 0.8    | 144  | 0.2540          |
| 0.2143        | 1.0    | 180  | 0.2458          |
| 0.1964        | 1.2    | 216  | 0.2387          |
| 0.1913        | 1.4    | 252  | 0.2357          |
| 0.1809        | 1.6    | 288  | 0.2327          |
| 0.1814        | 1.8    | 324  | 0.2296          |
| 0.1769        | 2.0    | 360  | 0.2271          |
| 0.1638        | 2.2    | 396  | 0.2253          |
| 0.1594        | 2.4    | 432  | 0.2257          |
| 0.154         | 2.6    | 468  | 0.2262          |
| 0.1578        | 2.8    | 504  | 0.2262          |
| 0.1571        | 3.0    | 540  | 0.2262          |

### Framework versions

- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.1
- Tokenizers 0.21.1
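The card leaves usage unspecified; below is a minimal, hypothetical inference sketch, assuming the checkpoint is published under some repo id (shown here as the placeholder `outputs/out`), that the bf16 weights fit on the available GPUs, and the standard `transformers` text-generation API.

```python
# Hypothetical usage sketch, not from the original card; "outputs/out" is a
# placeholder for the published checkpoint id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "outputs/out"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # training ran in bf16 (see config above)
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a Triton kernel for elementwise ReLU."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=1024)

# Strip the prompt tokens and decode only the assistant's reply.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```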