Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_3_10_noise_btw_7_8_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 1 day ago • 37
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_noiselen_leq_5_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 2 days ago • 48
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_leven_1_1p0_0p5_1p0_grpo_42_targeted Text Generation • 2B • Updated 2 days ago • 70
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_leven_1_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 2 days ago • 59
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_leven_1_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 2 days ago • 65
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_1p0_1p0_1p0_grpo_42_uniform Updated 2 days ago
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_spell_backward_len_3_10_1p0_0p0_1p0_grpo_42_uniform Text Generation • 2B • Updated 2 days ago • 8
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_chain_sum_2048_1p0_0p0_1p0_grpo_42_uniform Text Generation • 2B • Updated 5 days ago • 37
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_chain_sum_correct_run_1p0_0p0_1p0_grpo_42_uniform Updated 5 days ago
Kazuki1450/Qwen2.5-1.5B-Instruct_rgym_chain_sum_1p0_0p0_1p0_grpo_42_uniform Text Generation • 2B • Updated 5 days ago • 57
Kazuki1450/Qwen2.5-1.5B-Instruct_lightr1_stage1_1p0_0p0_1p0_grpo_42_uniform Text Generation • 2B • Updated 6 days ago • 113
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_relerr0.1_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 14
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_relerr0.1_1p0_0p5_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 10
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_relerr0.1_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 14
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_format_0p5_0p5_1p0_grpo_42_format Text Generation • 2B • Updated 8 days ago • 60
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_1p0_1p0_1p0_grpo_42_format Text Generation • 2B • Updated 8 days ago • 12
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_format_1p0_1p0_1p0_grpo_42_format Updated 8 days ago
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_terms_eq_15_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 50
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_terms_eq_6_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 47
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_terms_leq_10_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 38
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_terms_eq_10_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 42
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_flip_terms_geq_11_0p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 8 days ago • 42
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_terms_geq_11_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 9 days ago • 44
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_terms_leq_10_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 9 days ago • 38
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_terms_eq_15_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 9 days ago • 51
Kazuki1450/Qwen3-1.7B-Base_rgym_chain_sum_6_15_6_15_terms_eq_10_1p0_1p0_1p0_grpo_42_targeted Text Generation • 2B • Updated 9 days ago • 3