---
library_name: transformers
tags:
- hrm
- hierarchical-reasoning
- maze
- pathfinding
- reasoning
- navigation
license: apache-2.0
---

# HRM Maze 30x30 Hard

A Hierarchical Reasoning Model (HRM) trained to solve hard 30×30 maze navigation problems using hierarchical processing and adaptive computation.

## Model Details

### Model Description

This is a Hierarchical Reasoning Model checkpoint fine-tuned specifically for solving hard maze pathfinding problems on 30×30 grids. The model employs a two-level hierarchical architecture inspired by human cognition, with a high-level (H) module for abstract route planning and a low-level (L) module for detailed navigation decisions. It uses Adaptive Computation Time (ACT) with Q-learning-based halting to dynamically allocate computational resources.

The model processes maze grids of up to 30×30 cells (900 tokens) and predicts optimal navigation paths through complex maze environments.

- **Developed by:** Sapient Inc.
- **Model type:** Hierarchical Reasoning Model (HRM)
- **Language(s):** Symbolic reasoning (maze navigation symbols)
- **License:** Apache 2.0
- **Original checkpoint:** [sapientinc/HRM-checkpoint-maze-30x30-hard](https://huggingface.co/sapientinc/HRM-checkpoint-maze-30x30-hard)

### Model Sources

- **Repository:** [transformers](https://github.com/huggingface/transformers)
- **Paper:** [Hierarchical Reasoning Model](https://arxiv.org/abs/2506.21734)
- **Original Repository:** [HRM GitHub](https://github.com/sapientinc/HRM)

## Uses

### Direct Use

This model is designed for solving hard maze navigation problems. It can:

- Find optimal paths through complex 30×30 maze environments
- Navigate mazes with multiple obstacles and dead ends
- Process partial maze representations and predict navigation sequences
- Demonstrate hierarchical planning strategies for spatial reasoning tasks

### Downstream Use

The model can be used as:

- A component in game AI and procedural content generation
- A baseline for research in hierarchical spatial reasoning
- An example of applying neural networks to pathfinding and navigation problems
- A planning module in robotics and autonomous navigation research

### Recommendations

Users should be aware that:

- The model is specialized for maze pathfinding and should not be used for general spatial reasoning tasks
- Input must be formatted as a grid representation using the 6-token vocabulary
- Inference time may vary because of the adaptive computation mechanism
- The model is optimized for hard-difficulty mazes and may be over-engineered for simple ones

## How to Get Started with the Model

```python
import torch
from transformers import HrmForCausalLM

# Load the model
model = HrmForCausalLM.from_pretrained("zbloss/HRM-maze-30x30-hard")
model.eval()

# Prepare a maze grid (e.g., 20x20 = 400 tokens)
# Vocabulary: 0-5 representing different maze elements
# (e.g., 0=empty, 1=wall, 2=start, 3=goal, 4=path, 5=visited)
maze_grid = torch.randint(0, 6, (1, 400))  # Example 20x20 maze
puzzle_ids = torch.zeros(1, dtype=torch.long)

# Run inference
with torch.no_grad():
    outputs = model(input_ids=maze_grid, puzzle_identifiers=puzzle_ids)

# Get predictions
predictions = torch.argmax(outputs.logits, dim=-1)
print(f"Predicted navigation path: {predictions}")
print(f"Q-halt: {outputs.q_halt_logits[0]:.4f}")
print(f"Q-continue: {outputs.q_continue_logits[0]:.4f}")
```
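Because the logits score every cell of the flattened grid, the predicted sequence can be reshaped back into two dimensions for inspection. The snippet below continues from the example above; the glyph mapping is purely illustrative (it mirrors the hypothetical 0-5 encoding in the comments above), so substitute whatever encoding your data actually uses.

```python
# Hypothetical glyphs for the illustrative mapping above:
# 0=empty, 1=wall, 2=start, 3=goal, 4=path, 5=visited.
SYMBOLS = {0: ".", 1: "#", 2: "S", 3: "G", 4: "*", 5: "o"}

def render_grid(tokens: torch.Tensor, width: int) -> str:
    """Reshape a flat token sequence into rows and map token ids to glyphs."""
    rows = tokens.view(-1, width).tolist()
    return "\n".join("".join(SYMBOLS[t] for t in row) for row in rows)

# `predictions` from the snippet above has shape (1, 400) for the 20x20 example.
print(render_grid(predictions[0], width=20))
```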
## Training Details

### Training Data

The model was trained on a dataset of hard-difficulty 30×30 maze environments. These mazes feature:

- Complex layouts with multiple branching paths
- Dead ends requiring backtracking
- Long optimal paths requiring multi-step planning
- Variable start and goal positions

### Training Procedure

The model uses a hierarchical architecture with:

- **High-level (H) module:** 4 transformer layers for abstract route planning
- **Low-level (L) module:** 4 transformer layers for detailed navigation decisions
- **H-cycles:** 2 high-level reasoning cycles for strategic planning
- **L-cycles:** 2 low-level computation cycles per H-cycle for tactical moves
- **ACT mechanism:** Q-learning-based adaptive halting with a maximum of 16 steps

#### Training Hyperparameters

- **Training regime:** bfloat16 mixed precision
- **Architecture:** 4 H-layers, 4 L-layers, 8 attention heads
- **Hidden size:** 512
- **Intermediate size:** 1536
- **Max position embeddings:** 900 (supports grids up to 30×30)
- **Vocabulary size:** 6 (maze navigation symbols)

## Model Architecture

### Technical Specifications

| Component | Value |
|-----------|-------|
| **Total Parameters** | 27,270,658 (27.3M) |
| **Model Size** | 109.09 MB |
| **Vocabulary Size** | 6 |
| **Hidden Size** | 512 |
| **Intermediate Size** | 1536 |
| **H-level Layers** | 4 |
| **L-level Layers** | 4 |
| **Attention Heads** | 8 |
| **H-cycles** | 2 |
| **L-cycles** | 2 |
| **Max Halting Steps** | 16 |
| **Max Grid Size** | 30×30 (900 tokens) |
| **Position Encoding** | RoPE (Rotary Position Embeddings) |
| **Activation** | SwiGLU |

### Model Architecture and Objective

The Hierarchical Reasoning Model (HRM) features the following components; a schematic sketch of the resulting control flow appears after the list:

1. **Two-level Hierarchical Processing:**
   - **H-level (High-level):** Performs slow, abstract route planning and strategic navigation
   - **L-level (Low-level):** Executes fast, detailed navigation decisions and obstacle avoidance
2. **Adaptive Computation Time (ACT):**
   - Q-learning-based halting mechanism
   - Dynamically determines when sufficient computation has been performed
   - Allows variable computational depth based on maze complexity
   - More complex mazes with longer paths trigger more reasoning cycles
3. **Recurrent Carry State:**
   - Maintains H and L hidden states across reasoning cycles
   - Enables iterative refinement of navigation strategies
   - Supports backtracking and path correction
4. **Positional Encoding:**
   - RoPE (Rotary Position Embeddings) for position-aware attention
   - Critical for spatial reasoning in grid-based environments
   - Supports up to 900 positions (30×30 grids)
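The sketch below shows that control flow under this checkpoint's configuration (2 H-cycles, 2 L-cycles, up to 16 ACT steps). It is a minimal schematic, not the actual transformers implementation: `h_module`, `l_module`, and `q_head` are toy stand-ins for the real 4-layer transformer blocks and the learned Q-head.

```python
import torch

hidden = 512  # matches this checkpoint's hidden size

# Toy stand-ins so the sketch executes; in the real model these are
# 4-layer transformer blocks and a learned Q-head.
def l_module(z_l, z_h, x):   # fast, detailed update
    return torch.tanh(z_l + z_h + x)

def h_module(z_h, z_l):      # slow, abstract update
    return torch.tanh(z_h + z_l)

def q_head(z_h):             # scores halting vs. continuing
    return z_h.mean(), -z_h.mean()

def act_forward(z_h, z_l, x, max_steps=16, h_cycles=2, l_cycles=2):
    for _ in range(max_steps):
        for _ in range(h_cycles):
            for _ in range(l_cycles):
                z_l = l_module(z_l, z_h, x)  # L-level: tactical moves
            z_h = h_module(z_h, z_l)         # H-level: strategic plan
        q_halt, q_continue = q_head(z_h)     # ACT: Q-learning-based halting
        if q_halt > q_continue:              # stop once halting is preferred
            break
    return z_h, z_l

x = torch.randn(hidden)
z_h, z_l = act_forward(torch.zeros(hidden), torch.zeros(hidden), x)
```

Easy inputs can halt after a single ACT step, while harder mazes spend more of the 16-step budget; the carry (`z_h`, `z_l`) persists across steps, which is what supports iterative refinement and backtracking.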
### Compute Infrastructure

#### Software

- **Framework:** PyTorch with the transformers library
- **Precision:** bfloat16
- **Format:** Safetensors

## Performance

The model is designed to solve hard-difficulty mazes on 30×30 grids, demonstrating:

- Multi-step planning capabilities for long navigation sequences
- The ability to recognize and avoid dead ends
- Strategic backtracking when necessary
- Hierarchical decomposition of complex navigation problems

## Citation

**BibTeX:**

```bibtex
@article{wang2025hierarchical,
  title={Hierarchical Reasoning Model},
  author={Wang, Guan and Li, Jin and Sun, Yuhao and Chen, Xing and Liu, Changling and Wu, Yue and Lu, Meng and Song, Sen and Yadkori, Yasin Abbasi},
  journal={arXiv preprint arXiv:2506.21734},
  year={2025}
}
```

**APA:**

Wang, G., Li, J., Sun, Y., Chen, X., Liu, C., Wu, Y., Lu, M., Song, S., & Yadkori, Y. A. (2025). Hierarchical Reasoning Model. arXiv preprint arXiv:2506.21734.

## More Information

This checkpoint is a converted version of the original HRM checkpoint from [sapientinc/HRM-checkpoint-maze-30x30-hard](https://huggingface.co/sapientinc/HRM-checkpoint-maze-30x30-hard), formatted for use with the Hugging Face transformers library.

For more details about the HRM architecture and training methodology, see:

- **Paper:** https://arxiv.org/abs/2506.21734
- **Original Implementation:** https://github.com/sapientinc/HRM

### Example Use Cases

1. **Game AI:** Intelligent maze navigation in video games
2. **Path Planning Research:** Baseline for hierarchical planning algorithms
3. **Robotics:** Inspiration for hierarchical navigation strategies
4. **Education:** Demonstrating neural approaches to classic AI problems

## Model Card Contact

For questions or issues with this converted checkpoint, please open an issue in the [transformers repository](https://github.com/huggingface/transformers).