Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs Paper • 2509.18058 • Published Sep 22, 2025
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models Paper • 2404.01318 • Published Mar 28, 2024
A Modern Look at the Relationship between Sharpness and Generalization Paper • 2302.07011 • Published Feb 14, 2023