---
license: mit
tags:
- chess
- transformer
- reinforcement-learning
- game-playing
library_name: pytorch
---
# ChessFormer-SL

ChessFormer-SL is a transformer-based chess model trained with supervised learning on Stockfish evaluations. It explores training chess engines without Monte Carlo Tree Search (MCTS), using neural networks alone.
## Model Description
- Model type: Transformer for chess position evaluation and move prediction
- Language(s): Chess (FEN notation)
- License: MIT
- Parameters: 100.7M
## Architecture
ChessFormer uses a custom transformer architecture optimized for chess:
- Blocks: 20 transformer layers
- Hidden size: 640
- Attention heads: 8
- Intermediate size: 1728
- Features: RMSNorm, SwiGLU activation, custom FEN tokenizer
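A minimal sketch of one such layer, assuming a standard pre-norm layout (the class and attribute names are illustrative, not the actual `model.py`; `nn.RMSNorm` requires PyTorch ≥ 2.4):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Feed-forward block with SwiGLU activation (hidden 640 -> intermediate 1728)."""
    def __init__(self, hidden_size: int = 640, intermediate_size: int = 1728):
        super().__init__()
        self.gate = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU-gated branch multiplied elementwise with a linear branch
        return self.down(F.silu(self.gate(x)) * self.up(x))

class TransformerBlock(nn.Module):
    """One of the 20 pre-norm layers: RMSNorm -> attention -> RMSNorm -> SwiGLU."""
    def __init__(self, hidden_size: int = 640, num_heads: int = 8):
        super().__init__()
        self.norm1 = nn.RMSNorm(hidden_size)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm2 = nn.RMSNorm(hidden_size)
        self.mlp = SwiGLU(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Full (non-causal) self-attention: all 75 tokens attend to each other
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))
```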
## Input Format
The model processes FEN strings and repetition counts, tokenizing them into 75-token sequences representing:
- 64 board square tokens (pieces + positional embeddings)
- 9 metadata tokens (turn, castling, en passant, clocks, repetitions)
- 2 special tokens (action, value)
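For illustration, the 64 + 9 + 2 layout could be assembled with `python-chess` roughly as follows (the actual tokenizer in `model.py` likely differs; the piece, metadata, and special-token ids here are assumptions):

```python
import chess

def fen_to_tokens(fen: str, repetitions: int) -> list[int]:
    """Illustrative 75-token layout: 64 squares + 9 metadata + 2 special tokens."""
    board = chess.Board(fen)

    # 64 square tokens: 0 = empty, 1-12 = (color, piece type) pairs (assumed ids)
    piece_ids = {(c, t): i + 1 for i, (c, t) in enumerate(
        (c, t) for c in (chess.WHITE, chess.BLACK) for t in chess.PIECE_TYPES)}
    squares = []
    for sq in chess.SQUARES:
        piece = board.piece_at(sq)
        squares.append(0 if piece is None else piece_ids[(piece.color, piece.piece_type)])

    # 9 metadata tokens: turn, four castling rights, en passant, two clocks, repetitions
    meta = [
        int(board.turn),
        int(board.has_kingside_castling_rights(chess.WHITE)),
        int(board.has_queenside_castling_rights(chess.WHITE)),
        int(board.has_kingside_castling_rights(chess.BLACK)),
        int(board.has_queenside_castling_rights(chess.BLACK)),
        0 if board.ep_square is None else 1 + board.ep_square,
        board.halfmove_clock,
        board.fullmove_number,
        repetitions,
    ]

    ACTION, VALUE = 13, 14  # assumed special-token ids; their outputs feed the two heads
    tokens = squares + meta + [ACTION, VALUE]
    assert len(tokens) == 75
    return tokens
```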
## Output Format
- Policy head: Logits over 1,969 structurally valid chess moves
- Value head: Position evaluation from current player's perspective
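As a shape sanity check with stand-in tensors (the exact shapes and the [-1, 1] value range are assumptions; see Usage below for the real call):

```python
import torch

B = 4                                        # a batch of positions
move_logits = torch.randn(B, 1969)           # stand-in for the policy head output
position_value = torch.tanh(torch.randn(B))  # stand-in value; [-1, 1] range assumed

move_probs = torch.softmax(move_logits, dim=-1)  # distribution over the move vocabulary
assert move_probs.shape == (B, 1969)
```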
## Training Details

### Training Data
- Dataset: `kaupane/lichess-2023-01-stockfish-annotated` (depth18 split)
- Size: 56M positions with Stockfish evaluations
- Validation: depth27 split
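The splits should be loadable with 🤗 Datasets; streaming avoids materializing all 56M positions at once, and printing one row shows the actual schema rather than guessing column names:

```python
from datasets import load_dataset

# Stream the depth18 training split; depth27 serves as validation
train = load_dataset(
    "kaupane/lichess-2023-01-stockfish-annotated",
    split="depth18",
    streaming=True,
)
print(next(iter(train)).keys())  # inspect the real column names before writing a loader
```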
### Training Procedure
- Method: Supervised learning on Stockfish move recommendations and evaluations
- Objective: Cross-entropy loss (moves) + MSE loss (values) + invalid move penalty
- Hardware: NVIDIA RTX 4060 Ti (16 GB)
- Duration: ~2 weeks
- Checkpoints: 20 total; this model is the final checkpoint
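A sketch of how the three loss terms might combine (the equal weighting and the exact form of the invalid-move penalty are assumptions, not the actual training code):

```python
import torch
import torch.nn.functional as F

def training_loss(move_logits, value_pred, target_move, target_value, legal_mask,
                  invalid_weight: float = 1.0):
    """move_logits: (B, 1969); target_move: (B,) indices of Stockfish's choice;
    legal_mask: (B, 1969) bool, True where a move is legal in the position."""
    action_loss = F.cross_entropy(move_logits, target_move)
    value_loss = F.mse_loss(value_pred, target_value)
    # Penalize probability mass the policy puts on moves that are not legal here
    probs = torch.softmax(move_logits, dim=-1)
    invalid_loss = (probs * (~legal_mask)).sum(dim=-1).mean()
    return action_loss + value_loss + invalid_weight * invalid_loss
```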
### Training Metrics
- Action Loss: 1.6985
- Value Loss: 0.0407
- Invalid Loss: 0.0303
## Performance

### Capabilities
- ✅ Reasonable opening and endgame play
- ✅ Fast inference without search
- ✅ Stronger play than next-token-prediction chess models
- ✅ Occasionally defeats Stockfish when combined with search enhancement
### Limitations
- ❌ Frequent tactical blunders in the middlegame
- ❌ Estimated Elo of ~1500 (informal assessment)
- ❌ Struggles with complex tactical combinations
- ❌ Tends to give away pieces ("free captures")
## Usage

### Installation
```bash
pip install torch transformers huggingface_hub chess
# Download model.py from this repository
```
### Basic Usage
```python
import torch
from model import ChessFormerModel

# Load model
model = ChessFormerModel.from_pretrained("kaupane/ChessFormer-SL")
model.eval()

# Analyze a position
fens = ["rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"]
repetitions = torch.tensor([1])
with torch.no_grad():
    move_logits, position_value = model(fens, repetitions)

# Getting the best move requires additional processing for legal moves (see below)
print(f"Position value: {position_value.item():.3f}")
```
### With Chess Engine Interface
```python
import chess
from engine import Engine, ChessformerConfig

# Create engine
config = ChessformerConfig(
    chessformer=model,
    temperature=0.5,
    depth=2,  # enable search enhancement
)
engine = Engine(type="chessformer", chessformer_config=config)

# Play a move
board = chess.Board()
move_uci, value = engine.move(board)
print(f"Suggested move: {move_uci}, Value: {value:.3f}")
```
## Limitations and Bias

### Technical Limitations
- Tactical weakness: Prone to hanging pieces and missing simple tactics
- Computational inefficiency: FEN tokenization is a training bottleneck; pre-tokenizing the entire dataset before training should be beneficial
### Potential Biases
- Trained exclusively on Stockfish evaluations; may inherit that engine's biases
- May not generalize to unconventional openings or endgames
### Known Issues
- Piece embeddings have consistently lower norms than positional embeddings
- The model occasionally assigns a small amount of probability (~3%) to invalid moves despite the training penalty
- Performance degrades without search enhancement
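The norm imbalance can be checked with a one-off diagnostic along these lines (the embedding attribute names are guesses; inspect `model.py` for the real module paths):

```python
import torch

with torch.no_grad():
    piece_norms = model.piece_embedding.weight.norm(dim=-1)       # hypothetical attribute
    pos_norms = model.positional_embedding.weight.norm(dim=-1)    # hypothetical attribute
print(f"mean piece-embedding norm: {piece_norms.mean():.3f}")
print(f"mean positional-embedding norm: {pos_norms.mean():.3f}")
```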
## Ethical Considerations
This model is intended for:
- ✅ Educational purposes and chess learning
- ✅ Research into neural chess architectures
- ✅ Developing chess training tools

Not recommended for:

- ❌ Competitive chess tournaments
- ❌ Production chess engines without extensive testing
- ❌ Applications requiring reliable tactical calculation
## Additional Information
- Repository: GitHub link
- Demo: HuggingFace Space Demo
- Related: ChessFormer-RL (RL training experiment)