---
library_name: transformers
tags:
- hrm
- hierarchical-reasoning
- maze
- pathfinding
- reasoning
- navigation
license: apache-2.0
---

# HRM Maze 30x30 Hard

A Hierarchical Reasoning Model (HRM) trained to solve hard 30×30 maze navigation problems using hierarchical processing and adaptive computation.

## Model Details

### Model Description

This is a Hierarchical Reasoning Model checkpoint fine-tuned specifically for solving hard maze pathfinding problems on 30×30 grids. The model employs a two-level hierarchical architecture inspired by human cognition, with a high-level (H) module for abstract route planning and a low-level (L) module for detailed navigation decisions. It uses Adaptive Computation Time (ACT) with Q-learning-based halting to dynamically allocate computational resources.

The model processes maze grids of up to 30×30 cells (900 tokens) and predicts optimal navigation paths through complex maze environments.

- **Developed by:** Sapient Inc.
- **Model type:** Hierarchical Reasoning Model (HRM)
- **Language(s):** Symbolic reasoning (maze navigation symbols)
- **License:** Apache 2.0
- **Original checkpoint:** [sapientinc/HRM-checkpoint-maze-30x30-hard](https://huggingface.co/sapientinc/HRM-checkpoint-maze-30x30-hard)

### Model Sources

- **Repository:** [transformers](https://github.com/huggingface/transformers)
- **Paper:** [Hierarchical Reasoning Model](https://arxiv.org/abs/2506.21734)
- **Original Repository:** [HRM GitHub](https://github.com/sapientinc/HRM)

## Uses

### Direct Use

This model is designed for solving hard maze navigation problems. It can:

- Find optimal paths through complex 30×30 maze environments
- Navigate mazes with multiple obstacles and dead ends
- Process partial maze representations and predict navigation sequences
- Demonstrate hierarchical planning strategies for spatial reasoning tasks

### Downstream Use

The model can be used as:

- A component in game AI and procedural content generation
- A baseline for research in hierarchical spatial reasoning
- An example of applying neural networks to pathfinding and navigation problems
- A planning module in robotics and autonomous navigation research

### Recommendations

Users should be aware that:

- The model is specialized for maze pathfinding and should not be used for general spatial reasoning tasks
- Input must be formatted as a grid representation using the 6-token vocabulary
- Inference time may vary because of the adaptive computation mechanism
- The model is optimized for hard-difficulty mazes and may be over-engineered for simple ones

## How to Get Started with the Model

```python
import torch
from transformers import HrmForCausalLM

# Load the model
model = HrmForCausalLM.from_pretrained("zbloss/HRM-maze-30x30-hard")
model.eval()

# Prepare a maze grid (e.g., 20x20 = 400 tokens)
# Vocabulary: 0-5 representing different maze elements
# (e.g., 0=empty, 1=wall, 2=start, 3=goal, 4=path, 5=visited)
maze_grid = torch.randint(0, 6, (1, 400))  # Example 20x20 maze
puzzle_ids = torch.zeros(1, dtype=torch.long)

# Run inference
with torch.no_grad():
    outputs = model(input_ids=maze_grid, puzzle_identifiers=puzzle_ids)

# Get predictions
predictions = torch.argmax(outputs.logits, dim=-1)
print(f"Predicted navigation path: {predictions}")
print(f"Q-halt: {outputs.q_halt_logits[0]:.4f}")
print(f"Q-continue: {outputs.q_continue_logits[0]:.4f}")
```
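Because the logits score every cell of the flattened grid, the predicted sequence can be reshaped back into two dimensions for inspection. The snippet below continues from the example above; the glyph mapping is purely illustrative (it mirrors the hypothetical 0-5 encoding in the comments above), so substitute whatever encoding your data actually uses.

```python
# Hypothetical glyphs for the illustrative mapping above:
# 0=empty, 1=wall, 2=start, 3=goal, 4=path, 5=visited.
SYMBOLS = {0: ".", 1: "#", 2: "S", 3: "G", 4: "*", 5: "o"}

def render_grid(tokens: torch.Tensor, width: int) -> str:
    """Reshape a flat token sequence into rows and map token ids to glyphs."""
    rows = tokens.view(-1, width).tolist()
    return "\n".join("".join(SYMBOLS[t] for t in row) for row in rows)

# `predictions` from the snippet above has shape (1, 400) for the 20x20 example.
print(render_grid(predictions[0], width=20))
```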
## Training Details

### Training Data

The model was trained on a dataset of hard-difficulty 30×30 maze environments. These mazes feature:

- Complex layouts with multiple branching paths
- Dead ends requiring backtracking
- Long optimal paths requiring multi-step planning
- Variable start and goal positions

### Training Procedure

The model uses a hierarchical architecture with:

- **High-level (H) module:** 4 transformer layers for abstract route planning
- **Low-level (L) module:** 4 transformer layers for detailed navigation decisions
- **H-cycles:** 2 high-level reasoning cycles for strategic planning
- **L-cycles:** 2 low-level computation cycles per H-cycle for tactical moves
- **ACT mechanism:** Q-learning-based adaptive halting with a maximum of 16 steps

#### Training Hyperparameters

- **Training regime:** bfloat16 mixed precision
- **Architecture:** 4 H-layers, 4 L-layers, 8 attention heads
- **Hidden size:** 512
- **Intermediate size:** 1536
- **Max position embeddings:** 900 (supports grids up to 30×30)
- **Vocabulary size:** 6 (maze navigation symbols)

## Model Architecture

### Technical Specifications

| Component | Value |
|-----------|-------|
| **Total Parameters** | 27,270,658 (27.3M) |
| **Model Size** | 109.09 MB |
| **Vocabulary Size** | 6 |
| **Hidden Size** | 512 |
| **Intermediate Size** | 1536 |
| **H-level Layers** | 4 |
| **L-level Layers** | 4 |
| **Attention Heads** | 8 |
| **H-cycles** | 2 |
| **L-cycles** | 2 |
| **Max Halting Steps** | 16 |
| **Max Grid Size** | 30×30 (900 tokens) |
| **Position Encoding** | RoPE (Rotary Position Embeddings) |
| **Activation** | SwiGLU |

### Model Architecture and Objective

The Hierarchical Reasoning Model (HRM) features the following components; a schematic sketch of the resulting control flow appears after the list:

1. **Two-level Hierarchical Processing:**
   - **H-level (High-level):** Performs slow, abstract route planning and strategic navigation
   - **L-level (Low-level):** Executes fast, detailed navigation decisions and obstacle avoidance
2. **Adaptive Computation Time (ACT):**
   - Q-learning-based halting mechanism
   - Dynamically determines when sufficient computation has been performed
   - Allows variable computational depth based on maze complexity
   - More complex mazes with longer paths trigger more reasoning cycles
3. **Recurrent Carry State:**
   - Maintains H and L hidden states across reasoning cycles
   - Enables iterative refinement of navigation strategies
   - Supports backtracking and path correction
4. **Positional Encoding:**
   - RoPE (Rotary Position Embeddings) for position-aware attention
   - Critical for spatial reasoning in grid-based environments
   - Supports up to 900 positions (30×30 grids)
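The sketch below shows that control flow under this checkpoint's configuration (2 H-cycles, 2 L-cycles, up to 16 ACT steps). It is a minimal schematic, not the actual transformers implementation: `h_module`, `l_module`, and `q_head` are toy stand-ins for the real 4-layer transformer blocks and the learned Q-head.

```python
import torch

hidden = 512  # matches this checkpoint's hidden size

# Toy stand-ins so the sketch executes; in the real model these are
# 4-layer transformer blocks and a learned Q-head.
def l_module(z_l, z_h, x):   # fast, detailed update
    return torch.tanh(z_l + z_h + x)

def h_module(z_h, z_l):      # slow, abstract update
    return torch.tanh(z_h + z_l)

def q_head(z_h):             # scores halting vs. continuing
    return z_h.mean(), -z_h.mean()

def act_forward(z_h, z_l, x, max_steps=16, h_cycles=2, l_cycles=2):
    for _ in range(max_steps):
        for _ in range(h_cycles):
            for _ in range(l_cycles):
                z_l = l_module(z_l, z_h, x)  # L-level: tactical moves
            z_h = h_module(z_h, z_l)         # H-level: strategic plan
        q_halt, q_continue = q_head(z_h)     # ACT: Q-learning-based halting
        if q_halt > q_continue:              # stop once halting is preferred
            break
    return z_h, z_l

x = torch.randn(hidden)
z_h, z_l = act_forward(torch.zeros(hidden), torch.zeros(hidden), x)
```

Easy inputs can halt after a single ACT step, while harder mazes spend more of the 16-step budget; the carry (`z_h`, `z_l`) persists across steps, which is what supports iterative refinement and backtracking.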
### Compute Infrastructure

#### Software

- **Framework:** PyTorch with the transformers library
- **Precision:** bfloat16
- **Format:** Safetensors

## Performance

The model is designed to solve hard-difficulty mazes on 30×30 grids, demonstrating:

- Multi-step planning capabilities for long navigation sequences
- The ability to recognize and avoid dead ends
- Strategic backtracking when necessary
- Hierarchical decomposition of complex navigation problems

## Citation

**BibTeX:**

```bibtex
@article{wang2025hierarchical,
  title={Hierarchical Reasoning Model},
  author={Wang, Guan and Li, Jin and Sun, Yuhao and Chen, Xing and Liu, Changling and Wu, Yue and Lu, Meng and Song, Sen and Yadkori, Yasin Abbasi},
  journal={arXiv preprint arXiv:2506.21734},
  year={2025}
}
```

**APA:**

Wang, G., Li, J., Sun, Y., Chen, X., Liu, C., Wu, Y., Lu, M., Song, S., & Yadkori, Y. A. (2025). Hierarchical Reasoning Model. arXiv preprint arXiv:2506.21734.

## More Information

This checkpoint is a converted version of the original HRM checkpoint from [sapientinc/HRM-checkpoint-maze-30x30-hard](https://huggingface.co/sapientinc/HRM-checkpoint-maze-30x30-hard), formatted for use with the Hugging Face transformers library.

For more details about the HRM architecture and training methodology, see:

- **Paper:** https://arxiv.org/abs/2506.21734
- **Original Implementation:** https://github.com/sapientinc/HRM

### Example Use Cases

1. **Game AI:** Intelligent maze navigation in video games
2. **Path Planning Research:** Baseline for hierarchical planning algorithms
3. **Robotics:** Inspiration for hierarchical navigation strategies
4. **Education:** Demonstrating neural approaches to classic AI problems

## Model Card Contact

For questions or issues with this converted checkpoint, please open an issue in the [transformers repository](https://github.com/huggingface/transformers).