Description

This repository contains the model weights for the paper "Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning".

Official Implementation

Citation

@article{kim2025meta,
  title={Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning},
  author={Kim, Yoonjeon and Jang, Doohyuk and Yang, Eunho},
  journal={arXiv preprint arXiv:2510.03259},
  year={2025}
}

Safetensors

Model size: 33B params

Tensor type: BF16
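A minimal loading sketch, assuming the checkpoint works with the standard Hugging Face `transformers` causal-LM classes, as Qwen2.5 base models do. This page does not document a prompt format, so `build_prompt` and `generate` are hypothetical helpers for illustration, not part of the official implementation.

```python
MODEL_ID = "jadohu/Qwen2.5-32B-MASA-efficient"  # repository name from this page


def build_prompt(question: str) -> str:
    # Hypothetical plain-text prompt; the format used during training is not stated here.
    return f"Question: {question}\nAnswer:"


def generate(question: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so the helpers above can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
        device_map="auto",           # requires the `accelerate` package
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 12 * 7?")` downloads the full checkpoint first. At 33B parameters and 2 bytes per BF16 value, that is roughly 66 GB of weights, so `device_map="auto"` is used here to spread the model across the available devices.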

Reinforcement Learning

Model tree for jadohu/Qwen2.5-32B-MASA-efficient

Base model: Qwen/Qwen2.5-32B

Finetuned (100): this model

Dataset used to train jadohu/Qwen2.5-32B-MASA-efficient

Collection including jadohu/Qwen2.5-32B-MASA-efficient

MASA: Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning (7 items)