Timex Peachtree
TimexPeachtree
6 followers · 10 following
AI & ML interests
None yet
Recent Activity
liked a Space 2 days ago: codelion/pts-visualizer
reacted to codelion's post 2 days ago:
Introducing PTS Visualizer - an interactive tool for exploring how language models reason! Visualize pivotal tokens, thought anchors, and reasoning circuits. See which tokens and sentences significantly impact success probability, explore embedding clusters, and trace reasoning step-by-step.

Try it: https://huggingface.co/spaces/codelion/pts-visualizer

Explore PTS datasets:
- Qwen3-0.6B: https://huggingface.co/datasets/codelion/Qwen3-0.6B-pts
- DeepSeek-R1: https://huggingface.co/datasets/codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts

Or upload your own JSONL files!

GitHub: https://github.com/codelion/pts
reacted to codelion's post 2 days ago:
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:
- Depth beats width: 32 layers outperform 12 layers at the same parameter count
- Best-in-class factuality: 47.5% on TruthfulQA
- 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
- Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
Organizations
None yet
spaces
2
deep-enquiry-tw (Running)
deep-enquiry (Running)
models
0
None public yet
datasets
0
None public yet