This repository contains the TinyRecursiveModels (TRM) checkpoints for the ARC-AGI-1 and ARC-AGI-2 public evaluation sets that were trained for performance verification. They were trained using the code and recipe of the official TRM repository; we only had to adapt the environment setup, as detailed below. We provide these checkpoints for transparency and to facilitate further research. We did not contribute to the TRM research and do not maintain the TRM code. For any questions, please reach out to the TRM maintainers.
TRM writes checkpoints as torch state_dicts. The subdirectories arc_v1_public and arc_v2_public contain the final checkpoints step_<final-step>, which can be loaded via the load_checkpoint option, i.e. by passing the checkpoint path as load_checkpoint=path/to/checkpoint. For reference, see the PretrainConfig in pretrain.py.
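For example, a minimal sketch of pointing a run at one of the provided checkpoints via that override (the torchrun invocation mirrors the commands in the Training section below; the run name is illustrative and <final-step> stands for the actual step number in the checkpoint filename):
# hedged sketch: load a provided checkpoint via the load_checkpoint override
torchrun --nproc-per-node 8 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 pretrain.py \
arch=trm \
data_paths="[data/arc1concept-aug-1000]" \
arch.L_layers=2 \
arch.H_cycles=3 arch.L_cycles=4 \
+run_name=trm_arc_v1_public_from_checkpoint ema=True \
load_checkpoint=arc_v1_public/step_<final-step>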
Replication Results
Tiny Recursion Model (TRM) results on ARC-AGI
- ARC-AGI-1: 40%, $1.76/task
- ARC-AGI-2: 6.2%, $2.10/task
- Tweet: https://x.com/arcprize/status/1978872651180577060
- Leaderboard: https://arcprize.org/leaderboard
Environment Setup
# use uv for venv
sudo snap install astral-uv --classic
uv venv .venv -p 3.12
source .venv/bin/activate
# install python3-dev (needed to build adam-atan2)
sudo apt install python3-dev -y
# install torch
PYTORCH_INDEX_URL=https://download.pytorch.org/whl/cu128
uv pip install torch torchvision torchaudio --index-url $PYTORCH_INDEX_URL
# install build dependencies + adam-atan2
uv pip install packaging ninja wheel setuptools setuptools-scm
uv pip install --no-cache-dir --no-build-isolation adam-atan2
# test torch, CUDA and AdamATan2
python - <<'EOF'
import torch
from adam_atan2 import AdamATan2
t = torch.tensor([0, 1, 2]).to('cuda')
print(torch.__version__, t.device)
EOF
# install remaining dependencies
uv pip install -r requirements.txt
Dataset Preprocessing
The repository already contains the raw data, but it needs to be preprocessed. Run the following commands to build the preprocessed ARC-AGI-1 and ARC-AGI-2 datasets used for making predictions on the public evaluation sets.
ARC-AGI-1
python -m dataset.build_arc_dataset \
--input-file-prefix kaggle/combined/arc-agi \
--output-dir data/arc1concept-aug-1000 \
--subsets training evaluation concept \
--test-set-name evaluation
ARC-AGI-2
python -m dataset.build_arc_dataset \
--input-file-prefix kaggle/combined/arc-agi \
--output-dir data/arc2concept-aug-1000 \
--subsets training2 evaluation2 concept \
--test-set-name evaluation2
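A quick sanity check after preprocessing (assuming the datasets were written to the output directories passed above):
# the preprocessed, augmented datasets should now exist under data/
ls data/arc1concept-aug-1000
ls data/arc2concept-aug-1000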
Training
To reproduce the checkpoints, run the following two training runs, each on a single 8xH100 node. Each run takes roughly 20-30 hours. To speed this up, instructions for multi-node training are given below.
ARC-AGI-1
run_name="trm_arc_v1_public"
torchrun --nproc-per-node 8 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 pretrain.py \
arch=trm \
data_paths="[data/arc1concept-aug-1000]" \
arch.L_layers=2 \
arch.H_cycles=3 arch.L_cycles=4 \
+run_name=${run_name} ema=True
ARC-AGI-2
run_name="trm_arc_v2_public"
torchrun --nproc-per-node 8 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 pretrain.py \
arch=trm \
data_paths="[data/arc2concept-aug-1000]" \
arch.L_layers=2 \
arch.H_cycles=3 arch.L_cycles=4 \
+run_name=${run_name} ema=True
For multi-node training:
export MAIN_ADDR=<MAIN_IP>
export MAIN_PORT=29500
export NNODES=2
export GPUS_PER_NODE=8
export OMP_NUM_THREADS=8
export NCCL_PORT_RANGE=40000-40050
run_name="arc_v1_public_2_nodes"
# on each node, set NODE_RANK to that node's index (see the example after this block):
export NODE_RANK=0
torchrun \
--nnodes $NNODES \
--node_rank $NODE_RANK \
--nproc_per_node $GPUS_PER_NODE \
--rdzv_backend c10d \
--rdzv_endpoint $MAIN_ADDR:$MAIN_PORT \
pretrain.py \
arch=trm \
data_paths="[data/arc1concept-aug-1000]" \
arch.L_layers=2 \
arch.H_cycles=3 arch.L_cycles=4 \
+run_name=${run_name} ema=True \
eval_interval=50000
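For example, with NNODES=2 the same torchrun command is launched on both machines, each with its own rank (a sketch; hostnames and ranks depend on your cluster):
# on the main node (the one whose IP is MAIN_ADDR):
export NODE_RANK=0
# on the second node:
export NODE_RANK=1
# then run the identical torchrun command shown above on both nodes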