Overview
Newstar-Qwen3-0.6B is a fine-tuned version of the Qwen3-0.6B base model: Newstar's instruction tuning applied on top of Qwen3's pretrained weights, using the ITP-v2 dataset.
This model is deliberately built without thinking capabilities and intentionally omits the reasoning mode that Qwen3 supports. Its primary purpose is to offer a Qwen3-based model focused on straightforward instruction following, without engaging in complex reasoning.
Model Details
- Base model: Qwen3-0.6B (causal language model)
- Finetuning: Instruction tuning by Newstar on ITP-v2 dataset
- Parameters: 0.6 billion
- Layers: 28
- Attention heads: 16 query, 8 key/value (grouped-query attention)
- Context length: 32,768 tokens
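
The architecture numbers above can be verified directly from the model config with `transformers`. A minimal sketch; the repo id `Newstar/Newstar-Qwen3-0.6B` is an assumption, so substitute the actual model path:

```python
from transformers import AutoConfig

# Repo id is an assumption; replace with the actual model path.
config = AutoConfig.from_pretrained("Newstar/Newstar-Qwen3-0.6B")

print(config.num_hidden_layers)    # layers: expected 28
print(config.num_attention_heads)  # query heads: expected 16
print(config.num_key_value_heads)  # key/value heads: expected 8 (GQA)
```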
Intended Use
- General instruction following without reasoning or "thinking"
- Simple and efficient dialogue or text generation tasks
- Scenarios where disabling complex logical or mathematical reasoning is preferred
Performance and Limitations
- Benchmarks: not yet evaluated
- Informally observed to be less capable than the instruction-tuned Qwen3 in reasoning and complex tasks
- Lacks the advanced thinking mode of the original Qwen3 model
- Best suited to cases where reasoning is not required, or where the overhead of thinking mode should be avoided
How It Differs from Qwen3-0.6B
- Qwen3 supports both thinking and non-thinking modes
- Newstar-Qwen3-0.6B disables thinking mode completely
- Less capable at math, coding, and logic, but simpler for basic instruction following
Usage Notes
- Thinking mode (`enable_thinking=True`) is disabled
- The model does not generate `<think>...</think>` reasoning blocks
- Recommended generation settings (non-thinking mode; a usage sketch follows this list):
  - Temperature: 0.7
  - Top-p: 0.95
  - Top-k: 64
  - Repetition penalty: 1.03
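
A minimal generation sketch with Hugging Face `transformers`, using the recommended settings above. The repo id `Newstar/Newstar-Qwen3-0.6B` and the example prompt are assumptions; passing `enable_thinking=False` to `apply_chat_template` follows the Qwen3 chat-template convention:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Newstar/Newstar-Qwen3-0.6B"  # assumed repo id; replace as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the water cycle in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # this model has no thinking mode; keep it off
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,        # recommended non-thinking settings
    top_p=0.95,
    top_k=64,
    repetition_penalty=1.03,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```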
Citations
```bibtex
@misc{qwen3technicalreport,
      title={Qwen3 Technical Report},
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388},
}
```