Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 2025 • 56
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 22
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 151
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22, 2024 • 51
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 62
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published Jul 4, 2024 • 19
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2, 2024 • 27
SSMs Collection A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • 29
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8, 2024 • 9
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 7
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19, 2024 • 40