Overview

Newstar-Qwen3-0.6B is a finetuned version of the Qwen3-0.6B base model: Newstar's instruction tuning, performed on the ITP-v2 dataset, applied on top of Qwen3's pretrained weights.

This model is deliberately built without thinking capabilities: it avoids the reasoning mode that Qwen3 supports. Its primary purpose is to offer a Qwen3-based model focused on straightforward instruction following, without engaging in complex reasoning.

Model Details

  • Base model: Qwen3-0.6B (causal language model)
  • Finetuning: Instruction tuning by Newstar on the ITP-v2 dataset
  • Parameters: 0.6 billion
  • Layers: 28
  • Attention heads: 16 query, 8 key-value (GQA)
  • Context length: 32,768 tokens (verifiable from the published config; see the sketch below)
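
These architecture details can be checked directly against the checkpoint's config. A minimal sketch using transformers, assuming the standard Qwen3 config field names:

```python
from transformers import AutoConfig

# Load the config published with the checkpoint.
config = AutoConfig.from_pretrained("NewstaR/Newstar-Qwen3-0.6B")

print(config.num_hidden_layers)        # layers (the card lists 28)
print(config.num_attention_heads)      # query heads (the card lists 16)
print(config.num_key_value_heads)      # KV heads for GQA (the card lists 8)
print(config.max_position_embeddings)  # context window (the card lists 32,768 tokens)
```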

Intended Use

  • General instruction following without reasoning or "thinking"
  • Simple and efficient dialogue or text generation tasks
  • Scenarios where it is preferable to skip complex logical or mathematical reasoning

Performance and Limitations

  • Benchmarks: not yet evaluated
  • Informally observed to be less capable than Qwen3 Instruct on reasoning-heavy and complex tasks
  • Lacks the advanced thinking mode of the original Qwen3 model
  • Best used when reasoning is not required, or when the latency and token overhead of thinking mode is unwanted

How It Differs from Qwen3-0.6B

  • Qwen3 supports both thinking and non-thinking modes, toggled through its chat template
  • Newstar-Qwen3-0.6B disables thinking mode entirely (see the sketch below)
  • It is less capable at math, coding, and logic, but simpler for basic instruction following
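
For comparison, here is a minimal sketch of how the mode toggle surfaces in code, assuming Newstar-Qwen3-0.6B keeps Qwen3's chat template (the enable_thinking switch is part of the upstream Qwen3 template):

```python
from transformers import AutoTokenizer

messages = [{"role": "user", "content": "What is 17 * 23?"}]

# Base Qwen3 exposes the mode switch through its chat template.
qwen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
thinking_prompt = qwen_tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

# With this model, keep the switch off; the tuning targets direct answers
# rather than <think>...</think> traces.
newstar_tok = AutoTokenizer.from_pretrained("NewstaR/Newstar-Qwen3-0.6B")
direct_prompt = newstar_tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```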

Usage Notes

  • Thinking mode (enable_thinking=True) is disabled
  • The model does not generate <think>...</think> reasoning blocks
  • Recommended generation settings (non-thinking mode), used in the example below:
    • Temperature: 0.7
    • Top-p: 0.95
    • Top-k: 64
    • Repetition penalty: 1.03
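
A minimal end-to-end sketch with transformers, wiring in the settings above (assuming the repo id NewstaR/Newstar-Qwen3-0.6B and a Qwen3-style chat template; adjust max_new_tokens to taste):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NewstaR/Newstar-Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Summarize what a language model is in two sentences."}
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # keep the non-thinking path explicit
)
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,         # recommended settings from this card
    top_p=0.95,
    top_k=64,
    repetition_penalty=1.03,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```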

Citations

@misc{qwen3technicalreport,
  title={Qwen3 Technical Report}, 
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.09388}, 
}