Ankit Shah

ankits0052

https://ankitshah009.github.io/

AI & ML interests

Artificial Intelligence, Deep Learning, Machine Perception, Machine Learning, Audio Processing

Recent Activity

liked a dataset 21 days ago

cais/hle

upvoted an article 28 days ago

We Got Claude to Fine-Tune an Open Source LLM

liked a model about 1 month ago

nvidia/Cosmos-Guardrail1

upvoted an article 28 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 30 days ago • 554

upvoted 2 papers 2 months ago

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Paper • 2504.11468 • Published Apr 10, 2025 • 30

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 123

upvoted a collection 4 months ago

Cosmos-Predict2

World Foundation Model for Future Prediction • 13 items • Updated 10 days ago • 33

upvoted a paper 4 months ago

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63

upvoted an article 5 months ago

Small Language Models (SLM): A Comprehensive Overview

Article • Feb 22, 2025 • 115

upvoted a paper 5 months ago

Wan: Open and Advanced Large-Scale Video Generative Models

Paper • 2503.20314 • Published Mar 26, 2025 • 56

upvoted 3 collections 5 months ago

Physical AI

Collection of open, commercial-grade datasets for physical AI developers • 23 items • Updated 10 days ago • 103

AceReason

Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 10 days ago • 20

Reward Models 06-2025

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 10 days ago • 23

upvoted 2 papers 5 months ago

Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms

Paper • 2310.07161 • Published Oct 11, 2023 • 1

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158

upvoted 3 collections 5 months ago

Qwen3

84 items • Updated 3 days ago • 1.53k

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 3 days ago • 126

OpenReasoning-Nemotron

Collection of models for OpenReasoning-Nemotron which are trained on 5M reasoning traces for Math, Code and Science. • 6 items • Updated 10 days ago • 46

upvoted 2 collections 7 months ago

Meta's Llama 3.3 models & evals

2 items • Updated Dec 13, 2024 • 85

Llama 4

Llama 4 release • 13 items • Updated Apr 29, 2025 • 677

upvoted an article 7 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 1.02k

upvoted a collection 8 months ago

Cosmos

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos-predict25 • 31 items • Updated 10 days ago • 299

upvoted an article 11 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4, 2025 • 1.31k