Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols Paper • 2510.09462 • Published Oct 10 • 5
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs Paper • 2509.18058 • Published Sep 22 • 12
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs Paper • 2509.09677 • Published Sep 11 • 34
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents Paper • 2506.14866 • Published Jun 17 • 5
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Paper • 2402.04833 • Published Feb 7, 2024 • 5