Tina: Tiny Reasoning Models via LoRA
Tina is a family of tiny reasoning models created by post-training the DeepSeek-R1-Distill-Qwen-1.5B base model on open-source reasoning datasets, applying low-rank adaptation (LoRA) during reinforcement learning (RL).
- Paper: https://arxiv.org/abs/2504.15777
- Notion Blog: https://shangshangwang.notion.site/tina
- Code Repository: https://github.com/shangshang-wang/Tina
- Training Logs: https://wandb.ai/upup-ashton-wang-usc/Tina
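For readers unfamiliar with LoRA, the idea can be sketched in a few lines: the base weight matrix stays frozen, and only two small low-rank matrices are trained, whose scaled product is added to it. The snippet below is a minimal illustration in pure Python, not the Tina training code; the helper names, matrix sizes, rank, and `alpha` value are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Minimal LoRA sketch in pure Python (illustrative only; not the Tina training code).
# All helper names, sizes, and values here are hypothetical toy choices.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank.

    w: frozen base weight, shape (d_out, d_in)
    b: trainable matrix, shape (d_out, r)
    a: trainable matrix, shape (r, d_in)
    """
    r = len(a)                      # rank = number of rows of A
    scale = alpha / r
    delta = matmul(b, a)            # low-rank update, shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters LoRA adds for one (d_out x d_in) weight matrix."""
    return r * (d_in + d_out)


# Toy example: a 2x3 frozen weight with a rank-1 update.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]            # shape (d_out=2, r=1)
A = [[0.5, 0.5, 0.5]]         # shape (r=1, d_in=3)
W_eff = lora_effective_weight(W, A, B, alpha=2.0)

# Parameter comparison for a hypothetical 1024 x 1024 layer at rank 16.
full_params = 1024 * 1024
lora_params = lora_param_count(1024, 1024, 16)
print(W_eff)
print(lora_params, full_params, lora_params / full_params)
```

In this toy setting, the rank-16 adapter trains about 3% of the parameters of the full matrix, which is why only a small fraction of the model's weights is updated. In practice, B is commonly initialized to zero so that training starts from the unmodified base model.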
Tina's avatar was generated by GPT-4o, based on KYNE's girls and the following prompt:
Hi, I’m Tina — an INTJ who’s all about getting to the essence of things. I study reasoning models because I’m fascinated by how structured thinking and logic can emerge from data. Outside of that, I recharge with movies, music, and the occasional gaming session. I believe in strategic effort: minimal input, maximum impact — whether it’s in research or everyday life, I’m always looking for the most efficient path to meaningful results.
Models (18 total; 10 shown, all tagged Question Answering)
- Tina-Yi/R1-Distill-Qwen-1.5B-II-Thought-1.5B-Preview
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS3
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS2
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS1
- Tina-Yi/R1-Distill-Qwen-1.5B-DeepScaleR
- Tina-Yi/R1-Distill-Qwen-1.5B-STILL
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS3-long-completion
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS3-format-only
- Tina-Yi/R1-Distill-Qwen-1.5B-Open-RS3-DrGRPO
- Tina-Yi/R1-Distill-Qwen-1.5B-LIMR-64-LoRA-rank
Datasets: none public yet.