Sean McLeish PRO

smcleish

https://mcleish7.github.io/

AI & ML interests

None yet

Recent Activity

updated a model about 23 hours ago

Leon-Sean-Dev/0.6_4b_eos_causal_embed

updated a model about 23 hours ago

Leon-Sean-Dev/4_4b_eos_causal_embed

updated a model about 23 hours ago

Leon-Sean-Dev/0.6_4b_mean_causal_embed

upvoted a paper 15 days ago

Learning Unmasking Policies for Diffusion Language Models

Paper • 2512.09106 • Published 18 days ago • 8

upvoted 2 papers about 2 months ago

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

Paper • 2511.07384 • Published Nov 10 • 16

Efficient Parallel Samplers for Recurrent-Depth Models and Their Connection to Diffusion Language Models

Paper • 2510.14961 • Published Oct 16 • 7

upvoted 4 papers 3 months ago

Training Dynamics Impact Post-Training Quantization Robustness

Paper • 2510.06213 • Published Oct 7 • 3

Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models

Paper • 2510.02300 • Published Oct 2 • 6

Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs

Paper • 2509.18058 • Published Sep 22 • 12

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

Paper • 2509.09677 • Published Sep 11 • 34

upvoted 2 papers 4 months ago

FAST: Factorizable Attention for Speeding up Transformers

Paper • 2402.07901 • Published Feb 12, 2024 • 3

DynaGuard: A Dynamic Guardrail Model With User-Defined Policies

Paper • 2509.02563 • Published Sep 2 • 20

upvoted a paper 5 months ago

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published Jul 22 • 35

upvoted 8 papers 7 months ago

ARGUS: Hallucination and Omission Evaluation in Video-LLMs

Paper • 2506.07371 • Published Jun 9 • 8

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Paper • 2506.05523 • Published Jun 5 • 34

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning

Paper • 2506.04245 • Published May 29 • 4

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5 • 59

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Paper • 2505.22664 • Published May 28 • 7

How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning

Paper • 2505.24273 • Published May 30 • 5

Pitfalls in Evaluating Language Model Forecasters

Paper • 2506.00723 • Published May 31 • 3

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 27

upvoted 2 papers 8 months ago

ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations

Paper • 2505.02819 • Published May 5 • 26

The Leaderboard Illusion

Paper • 2504.20879 • Published Apr 29 • 72