---
license: mit
language: en
tags:
- nethack
- reinforcement-learning
- variational-autoencoder
- representation-learning
- multimodal
- world-modeling
pipeline_tag: feature-extraction
---

# MultiModalHackVAE

A multi-modal variational autoencoder (VAE) trained on NetHack game states for representation learning.
## Model Description

MultiModalHackVAE learns compact latent representations of NetHack game states by jointly encoding several observation modalities:

- Game character grid (21 × 79)
- Color information for each grid cell
- Game statistics (`blstats`)
- Message text
- Bag of glyphs
- Hero information (role, race, gender, alignment)
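The modalities above map onto the model's input tensors. As a minimal sketch of that mapping, the snippet below converts a single observation dictionary into batched tensors; the dictionary keys and shapes (`chars`, `colors`, `blstats`, `message`) are assumptions modeled on the NetHack Learning Environment's observation space, not taken from this repository, and `obs_to_model_inputs` is a hypothetical helper:

```python
import numpy as np
import torch

# Synthetic stand-in for a single NetHack observation dict.
# Keys and shapes are assumptions, not confirmed by this repo.
obs = {
    "chars": np.random.randint(32, 127, size=(21, 79), dtype=np.uint8),
    "colors": np.random.randint(0, 16, size=(21, 79), dtype=np.uint8),
    "blstats": np.zeros(27, dtype=np.int64),
    "message": np.zeros(256, dtype=np.uint8),
}

def obs_to_model_inputs(obs, hero_info):
    """Convert one observation dict into batched model inputs (batch size 1)."""
    return {
        "glyph_chars": torch.from_numpy(obs["chars"]).long().unsqueeze(0),
        "glyph_colors": torch.from_numpy(obs["colors"]).long().unsqueeze(0),
        "blstats": torch.from_numpy(obs["blstats"]).float().unsqueeze(0),
        "msg_tokens": torch.from_numpy(obs["message"]).long().unsqueeze(0),
        "hero_info": torch.tensor(hero_info).long().unsqueeze(0),
    }

inputs = obs_to_model_inputs(obs, hero_info=[0, 0, 0, 0])
```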
## Model Details

- **Model Type**: Multi-modal variational autoencoder
- **Framework**: PyTorch
- **Dataset**: NetHack Learning Dataset
- **Latent Dimensions**: 96
- **Low-rank Dimensions**: 0
## Usage

```python
import torch

from train import load_model_from_huggingface

# Load the model and its configuration from the Hugging Face Hub
model, config = load_model_from_huggingface("CatkinChen/nethack-vae-hmm")
model.eval()

# Example usage with synthetic data shaped like NetHack observations
batch_size = 1
game_chars = torch.randint(32, 127, (batch_size, 21, 79))  # printable ASCII map characters
game_colors = torch.randint(0, 16, (batch_size, 21, 79))   # 16 terminal colors
blstats = torch.randn(batch_size, 27)                      # bottom-line statistics
msg_tokens = torch.randint(0, 128, (batch_size, 256))      # tokenized message text
hero_info = torch.randint(0, 10, (batch_size, 4))          # role, race, gender, alignment

with torch.no_grad():
    output = model(
        glyph_chars=game_chars,
        glyph_colors=game_colors,
        blstats=blstats,
        msg_tokens=msg_tokens,
        hero_info=hero_info,
    )

latent_mean = output['mu']
latent_logvar = output['logvar']
lowrank_factors = output['lowrank_factors']
```
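The `mu` and `logvar` outputs parameterize a diagonal Gaussian posterior over the 96-dimensional latent space. If you want a stochastic latent sample rather than the posterior mean, you can apply the standard VAE reparameterization trick. The sketch below uses synthetic `mu`/`logvar` tensors standing in for the encoder outputs; `reparameterize` is an illustrative helper, not a function exported by this repository:

```python
import torch

latent_dim = 96  # matches the model card's latent dimensionality

# Hypothetical encoder outputs (zeros give a standard normal posterior)
mu = torch.zeros(1, latent_dim)
logvar = torch.zeros(1, latent_dim)

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps, where sigma = exp(0.5 * logvar)."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps

z = reparameterize(mu, logvar)  # one latent sample of shape (1, 96)
```

For downstream feature extraction, using `mu` directly (the deterministic posterior mean) is the common choice; sampling is mainly useful when generating diverse reconstructions.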
## Training

The model was trained with an adaptive loss-weighting schedule:

- Embedding warm-up for quick convergence
- A gradual shift of emphasis toward raw reconstruction
- KL beta annealing for better latent structure
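KL beta annealing scales the KL-divergence term of the VAE loss by a coefficient that starts near zero and grows over training, so the model learns to reconstruct before the latent prior is enforced. The exact schedule used here is not documented; the function below is a generic linear-annealing sketch with assumed step counts:

```python
def kl_beta(step, warmup_steps=1000, anneal_steps=10000, beta_max=1.0):
    """Linear KL beta annealing (illustrative, not this model's exact schedule).

    Holds beta at 0 during warm-up, then ramps linearly to beta_max
    over anneal_steps, staying at beta_max afterwards.
    """
    if step < warmup_steps:
        return 0.0
    progress = (step - warmup_steps) / anneal_steps
    return min(beta_max, beta_max * progress)

# The annealed loss would then be: reconstruction_loss + kl_beta(step) * kl_loss
```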
## Citation

If you use this model, please consider citing:

```bibtex
@misc{nethack-vae,
  title={MultiModalHackVAE: Multi-modal Variational Autoencoder for NetHack},
  author={Xu Chen},
  year={2025},
  url={https://huggingface.co/CatkinChen/nethack-vae-hmm}
}
```