[NeurIPS'25] Bootstrapping Hierarchical Autoregressive Formal Reasoner with Chain-of-Proxy-Autoformalization

License: Apache 2.0 Lean 4 GitHub

Qi Liu, Xinhao Zheng, Renqiu Xia, Qinxiang Cao, Junchi Yan* (* indicates corresponding author)

School of Computer Science & School of Artificial Intelligence, Shanghai Jiao Tong University

Shanghai Innovation Institute

Please refer to the πŸ“ΊGitHub repo and πŸ“ƒPaper for more details.

πŸ“ˆ Performance

Cycle Method FormalMath500 MiniF2F-Solving
Solved% ↑ Budget ↓ Solved % ↑ Budget ↓
1 BFS 9.52% Β± 0.57% 28139 Β± 104 9.64% Β± 1.66% 27712 Β± 339
1 WG 18.78% Β± 0.22% 18456 Β± 92 24.95% Β± 1.09% 18853 Β± 454
1 WG $(K_W=16)$ 21.78% Β± 0.12% 35391 Β± 153 28.83% Β± 0.77% 35895 Β± 487
1 AR 34.39% Β± 0.78% 20860 Β± 233 44.41% Β± 0.34% 17814 Β± 140
1 H-BFS 13.32% Β± 1.47% 27777 Β± 360 12.25% Β± 1.47% 27479 Β± 203
1 H-WG (This Model) 36.77% Β± 0.86% 17039 Β± 461 47.30% Β± 1.01% 16303 Β± 388
1 H-WG $(K_W=16)$ (This Model) 40.04% Β± 0.50% 32134 Β± 226 51.08% Β± 0.44% 30357 Β± 212
1 HAR 43.65% Β± 1.35% 18432 Β± 344 55.50% Β± 0.46% 15069 Β± 143
2 HAR 44.09% Β± 0.54% 18273 Β± 116 56.58% Β± 0.92% 14754 Β± 269

βš™οΈ Usage

We recommend using vLLM to serve the model locally.

This model is SFTed for the following tasks with πŸ€—purewhite42/CoPA_Dataset.

  • Next-Proof-Step Prediction (next_proof_step_prediction): Input the current proof state; the model outputs the next proof step.
  • Whole-Solution Drafting (whole_solution_drafting): Input an informal problem and its initial solution state; the model outputs a formal solution draft.

πŸ“‹ Example

Next-Proof-Step Prediction

System Prompt

You are a Lean 4 expert.

User Prompt

Generate a tactic that can transform one step from the current tactic state to the 'no goals' tactic state.
Current tactic state:
```
R : Type u
ΞΉ : Type v
inst✝⁢ : CommRing R
L : Type w₁
M : ΞΉ β†’ Type w
inst✝⁡ : LieRing L
inst✝⁴ : LieAlgebra R L
inst✝³ : (i : ΞΉ) β†’ AddCommGroup (M i)
inst✝² : (i : ΞΉ) β†’ Module R (M i)
inst✝¹ : (i : ΞΉ) β†’ LieRingModule L (M i)
inst✝ : βˆ€ (i : ΞΉ), LieModule R L (M i)
x y : L
m : ⨁ (i : ι), M i
⊒ ⁅x + y, m⁆ = ⁅x, m⁆ + ⁅y, m⁆
```

Response

refine
  DFinsupp.ext fun _ =>
    ?_
      -- Porting note (https://github.com/leanprover-community/mathlib4/issues/11041): Originally `ext`

Whole-Solution Drafting

System Prompt

You are a Lean 4 expert.

User Prompt

Given a natural language math problem and the initial solution state, please generate a Lean 4 solution sketch.
You can use Lean 4 comments to conduct natural language reasoning.
Please only use forward reasoning; do not use tactics that modify the final goal.
For new hypotheses, please do not prove them and use `sorry` to close them.
Please assume the following header code has already been executed, and do not add any imports or openings.
```lean4
import Mathlib
```

# Problem
"""
Given $2\sin (Ο€-x)+1=0$, find the value of $\cos 2x$.
"""

# Initial Solution State
```
case h.mp
answer x : ℝ
h_eq : 2 * Real.sin (Real.pi - x) + 1 = 0
h_answer : Real.cos (2 * x) = answer
⊒ ?w
```

Response

# Formal Solution Draft
```lean4
-- The given equation can be simplified as follows: $2\sin (Ο€-x)+1=0$
have h_given_eq : 2 * Real.sin (Real.pi - x) + 1 = 0 := h_eq

-- Using the cofunction identity $\sin(Ο€-x)=\sin(x)$, we have: $2\sin(x)+1=0$
have h_cofunction : Real.sin (Real.pi - x) = Real.sin x := sorry
have h_simplified_eq : 2 * Real.sin x + 1 = 0 := sorry

-- Now, solve for $\sin(x)$: $\sin(x)=-\frac{1}{2}$
have h_sin_x : Real.sin x = -1/2 := sorry

-- Now, we need to find $\cos 2x$. We can use the double angle identity for cosine: $\cos 2x = 1 - 2\sin^2 x$
have h_double_angle : Real.cos (2 * x) = 1 - 2 * (Real.sin x)^2 := sorry

-- Substitute the value of $\sin x$: $\cos 2x = 1 - 2(-\frac{1}{2})^2$
have h_substituted : Real.cos (2 * x) = 1 - 2 * (-1/2)^2 := sorry

-- Simplify: $\cos 2x = 1 - 2(\frac{1}{4})$
have h_simplified : Real.cos (2 * x) = 1 - 2 * (1/4) := sorry

-- $\cos 2x = 1 - \frac{1}{2}$
have h_final_simplification : Real.cos (2 * x) = 1 - 1/2 := sorry

-- $\cos 2x = \frac{1}{2}$
have h_cos_2x : Real.cos (2 * x) = 1/2 := sorry

-- Thus, the value of $\cos 2x$ is $\boxed{\frac{1}{2}}$
have h_answer : answer = 1/2 := sorry

exact h_answer -- submit

```

πŸ“š Citation

If you find our work useful in your research, please cite

@inproceedings{
liu2025bootstrapping,
title={Bootstrapping Hierarchical Autoregressive Formal Reasoner with Chain-of-Proxy-Autoformalization},
author={Liu, Qi and Zheng, Xinhao and Xia, Renqiu and Cao, Qinxiang and Yan, Junchi},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=2Xn8h68mP3}
}

©️ License

This project is released under the Apache 2.0 license. Please see the LICENSE file for more information.

βœ‰οΈ Contact

Feel free to discuss the paper/data/code with us through issues/emails!

Downloads last month
21
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for purewhite42/HWG_CoPA_Cycle1

Base model

Qwen/Qwen2.5-7B
Finetuned
(939)
this model

Datasets used to train purewhite42/HWG_CoPA_Cycle1

Collection including purewhite42/HWG_CoPA_Cycle1