Dominique Mariko PRO

tiptales


AI & ML interests

None yet

Recent Activity

updated a collection about 2 months ago

upvoted a paper about 2 months ago

Flow-GRPO: Training Flow Matching Models via Online RL

upvoted a paper about 2 months ago

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

upvoted 7 papers about 2 months ago

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8 • 85

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28 • 46

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 123

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 224

A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25 • 344

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10 • 673

UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Paper • 2509.11543 • Published Sep 15 • 47

upvoted 2 collections 4 months ago

Releases July 4

25 items • Updated Jul 7 • 7

📐 FineMath

FineMath datasets and ablation models • 14 items • Updated May 5 • 24

upvoted a collection 5 months ago

🥂 FineWeb2

3 items • Updated Jun 27 • 21

upvoted a paper 5 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 75

upvoted an article 9 months ago

Open R1: Update #2

Article • Feb 10 • 218

upvoted a paper 9 months ago

Fully Autonomous AI Agents Should Not be Developed

Paper • 2502.02649 • Published Feb 4 • 35

upvoted 2 collections over 1 year ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 238

Probably function calling datasets

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 39