# Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning
Orion-MSP is a tabular foundation model for in-context learning. It uses multi-scale sparse attention and Perceiver-style memory to process tabular data at multiple granularities, capturing both local feature interactions and global dataset-level patterns.
## Key Features
- Multi-Scale Sparse Attention: Processes features at three levels (scales 1, 4, and 16) using windowed, global, and random attention patterns, reducing quadratic attention complexity to near-linear.
- Hierarchical Feature Understanding: Captures patterns from individual cells to feature groups through scale-aware attention.
- Perceiver-Style Memory: Cross-component memory that compresses dataset information for efficient reuse across samples.
- Memory-Efficient: Block-sparse masking enables efficient processing of large tabular datasets.
- Scikit-learn Compatible: Drop-in replacement with .fit() and .predict() methods.
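The windowed + global + random pattern behind the first bullet can be sketched with a boolean attention mask. This is a minimal illustration only; the window size, number of global tokens, and number of random links below are arbitrary choices, not the model's actual hyperparameters:

```python
import numpy as np

def sparse_attention_mask(n, window=16, n_global=4, n_random=4, seed=0):
    """Boolean mask over n x n attention pairs combining windowed,
    global, and random patterns (illustrative sizes only)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        mask[i, lo:hi] = True                    # local window around i
        mask[i, rng.choice(n, n_random)] = True  # a few random long-range links
    mask[:, :n_global] = True  # everyone attends to the global tokens
    mask[:n_global, :] = True  # global tokens attend to everyone
    return mask

m = sparse_attention_mask(1024)
density = m.mean()  # fraction of the full n^2 pairs that survive
```

Because each row keeps only O(window + n_random + n_global) entries, the kept fraction shrinks as n grows, which is what makes the overall cost near-linear rather than quadratic.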
## Architecture
Orion-MSP consists of four main components:
- Column-wise Embedding: Distribution-aware feature embeddings using Induced Set Attention Blocks (ISAB)
- Multi-Scale Row Interaction: Sparse attention with windowed, global, and random patterns across multiple scales
- Cross-Component Memory: Perceiver-style memory for efficient dataset-level context
- Dataset-wise ICL: Enhanced predictor leveraging enriched representations for few-shot tabular classification
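To illustrate the ISAB idea behind the column-wise embedding, here is a minimal numpy sketch (single head, no learned projections or layer norms; in the real model the inducing points are learned parameters, here they are random):

```python
import numpy as np

def attend(q, k, v):
    """Scaled dot-product attention (single head, no projections)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def isab(x, inducing):
    """Simplified Induced Set Attention Block: m inducing points
    summarize the n inputs, then the inputs read from that summary,
    costing O(n*m) instead of O(n^2)."""
    h = attend(inducing, x, x)  # (m, d) compressed summary of the set
    return attend(x, h, h)      # (n, d) inputs enriched by the summary

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 32))       # 500 values from one feature column
inducing = rng.normal(size=(8, 32))  # 8 inducing points (random stand-ins)
out = isab(x, inducing)
```

The same summarize-then-read pattern is what the Perceiver-style memory applies at the dataset level: a small set of latents compresses the context, and later components cross-attend to those latents instead of the full sample set.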
## Performance
| Models | All Rank | TALENT Rank | TALENT ACC | TALENT F1 | OpenML-CC18 Rank | OpenML-CC18 ACC | OpenML-CC18 F1 | TabZilla Rank | TabZilla ACC | TabZilla F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 6.70 | 6.02 | 0.8403 | 0.8360 | 5.89 | 0.8558 | 0.8537 | 6.07 | 0.8612 | 0.8326 |
| CatBoost | 6.43 | 5.57 | 0.8336 | 0.8259 | 6.25 | 0.8588 | 0.8520 | 7.13 | 0.8579 | 0.8384 |
| Random Forest | 7.38 | 6.15 | 0.8285 | 0.8209 | 6.36 | 0.8547 | 0.8497 | 8.42 | 0.8358 | 0.8399 |
| LightGBM | 6.78 | 6.11 | 0.8331 | 0.8245 | 6.18 | 0.8581 | 0.8493 | 5.25 | 0.8618 | 0.8211 |
| TabICL | 4.96 | 4.09 | 0.8471 | 0.8379 | 4.69 | 0.8667 | 0.8623 | 5.89 | 0.8734 | 0.8698 |
| OrionBiX | 5.37 | 4.59 | 0.8346 | 0.8260 | 4.98 | 0.8653 | 0.8596 | 4.89 | 0.8728 | 0.8628 |
| OrionMSP | 3.58 | 3.26 | 0.8461 | 0.8360 | 4.12 | 0.8722 | 0.8676 | 3.84 | 0.8821 | 0.8786 |
| TabPFN | 4.61 | 3.72 | 0.8514 | 0.8412 | 4.76 | 0.8714 | 0.8663 | 4.86 | 0.8752 | 0.8716 |
| Mitra | 11.77 | 10.38 | 0.3921 | 0.2868 | 10.52 | 0.3614 | 0.2522 | 11.21 | 0.3152 | 0.1830 |
| ContextTab | 9.70 | 9.84 | 0.5474 | 0.4596 | 6.28 | 0.8639 | 0.8581 | 7.13 | 0.8389 | 0.8334 |
| TabDPT | 5.42 | 5.19 | 0.8408 | 0.8318 | 4.64 | 0.8672 | 0.8625 | 3.94 | 0.8814 | 0.8775 |
Orion-MSP is the most consistent top performer across all three benchmarks, achieving the best overall rank.
- On TALENT, it ranks #1 overall, while TabPFN edges out the highest ACC/F1 by a narrow margin.
- On OpenML-CC18, Orion-MSP attains the top ACC/F1 (0.8722/0.8676), narrowly ahead of TabPFN and TabDPT.
- On TabZilla, it leads with the highest ACC/F1 and the best rank.
- Classical baselines (XGBoost/LightGBM/CatBoost/RF) trail noticeably, highlighting Orion-MSP’s robustness across diverse tabular tasks.
| Models | Small (<1K) Rank | Small ACC | Small F1 | Medium (1K–10K) Rank | Medium ACC | Medium F1 | Large (>10K) Rank | Large ACC | Large F1 |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 7.70 | 0.8168 | 0.7964 | 6.88 | 0.8363 | 0.8314 | 5.41 | 0.8969 | 0.8920 |
| CatBoost | 7.88 | 0.8124 | 0.7935 | 6.47 | 0.8340 | 0.8264 | 5.48 | 0.8797 | 0.8733 |
| Random Forest | 8.55 | 0.7988 | 0.8187 | 7.16 | 0.8285 | 0.8221 | 7.30 | 0.8694 | 0.8628 |
| LightGBM | 7.80 | 0.8143 | 0.7789 | 6.94 | 0.8314 | 0.8226 | 5.63 | 0.8827 | 0.8764 |
| TabICL | 6.04 | 0.8301 | 0.8338 | 4.77 | 0.8486 | 0.8398 | 4.61 | 0.8802 | 0.8743 |
| OrionBiX | 6.32 | 0.8330 | 0.8150 | 5.48 | 0.8348 | 0.8260 | 4.42 | 0.8729 | 0.8670 |
| OrionMSP | 5.93 | 0.8232 | 0.8194 | 3.70 | 0.8494 | 0.8402 | 3.04 | 0.8843 | 0.8768 |
| TabPFN | 6.50 | 0.8325 | 0.8131 | 3.81 | 0.8557 | 0.8462 | 5.73 | 0.8783 | 0.8713 |
| Mitra | 13.88 | 0.4334 | 0.3236 | 11.59 | 0.3600 | 0.2553 | 11.11 | 0.3837 | 0.2754 |
| ContextTab | 9.60 | 0.7578 | 0.7363 | 9.52 | 0.6210 | 0.5566 | 10.22 | 0.6388 | 0.5638 |
| TabDPT | 5.48 | 0.8333 | 0.8271 | 5.40 | 0.8424 | 0.8339 | 5.26 | 0.8831 | 0.8765 |
OrionMSP is the most consistent top-ranked model as data grows (especially Medium/Large), while TabPFN peaks on Medium and GBDTs (e.g., XGBoost) catch up in raw ACC/F1 on Large.
| Models | Narrow (<10) Rank | Narrow ACC | Narrow F1 | Medium (10–100) Rank | Medium ACC | Medium F1 | Wide (>100) Rank | Wide ACC | Wide F1 |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 6.77 | 0.8222 | 0.8159 | 6.90 | 0.8482 | 0.8410 | 4.79 | 0.9140 | 0.9039 |
| CatBoost | 5.63 | 0.8145 | 0.8067 | 6.88 | 0.8441 | 0.8344 | 5.50 | 0.9157 | 0.9084 |
| Random Forest | 7.15 | 0.8005 | 0.7044 | 7.44 | 0.8410 | 0.8235 | 7.52 | 0.9034 | 0.8936 |
| LightGBM | 6.15 | 0.8128 | 0.7907 | 6.92 | 0.8458 | 0.8326 | 7.47 | 0.8999 | 0.8908 |
| TabICL | 5.14 | 0.8208 | 0.8119 | 4.61 | 0.8627 | 0.8549 | 6.46 | 0.9101 | 0.8936 |
| OrionBiX | 4.64 | 0.8112 | 0.8043 | 5.46 | 0.8510 | 0.8417 | 6.73 | 0.8859 | 0.8849 |
| OrionMSP | 3.76 | 0.8394 | 0.8314 | 4.09 | 0.8572 | 0.8478 | 5.69 | 0.8860 | 0.8837 |
| TabPFN | 5.30 | 0.8187 | 0.8092 | 4.07 | 0.8676 | 0.8589 | 6.141 | 0.9129 | 0.9111 |
| Mitra | 11.25 | 0.3737 | 0.2683 | 11.84 | 0.3886 | 0.2781 | 13.03 | 0.2521 | 0.1497 |
| ContextTab | 9.52 | 0.6391 | 0.5719 | 9.59 | 0.6480 | 0.5843 | 10.97 | 0.6017 | 0.5651 |
| TabDPT | 4.66 | 0.8262 | 0.8189 | 5.45 | 0.8566 | 0.8483 | 7.23 | 0.8845 | 0.8820 |
OrionMSP excels on narrow feature spaces and remains competitive at medium width, where TabPFN posts the best ACC/F1, while GBDTs (XGBoost/CatBoost) shine on wide feature spaces.
## Usage
```python
from orion_msp.sklearn import OrionMSPClassifier

# Initialize, fit on the training split, and predict on the test split
clf = OrionMSPClassifier()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
```
This code will automatically download the pre-trained model from Hugging Face and use a GPU if available.
## Installation

### From source

Option 1: from a local clone

```bash
cd orion-msp
pip install -e .
```

Option 2: directly from the Git remote

```bash
pip install git+https://github.com/Lexsi-Labs/Orion-MSP.git
```
## Citation
If you use Orion-MSP, please cite our paper:
```bibtex
@article{bouadi25orionmsp,
  title={Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning},
  author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
  year={2025},
  eprint={2511.02818},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2511.02818},
}
```