Genos
Genos is a foundation model for human genomics, trained on hundreds of high-quality human genome references. It can contextually model human genome sequences of up to millions of base pairs. Through single-base-resolution learning, the model can identify hidden, deep sequence patterns and functional features within genomes, giving scientists a new research method that connects genetic information with life activities.
For instructions, details, and examples, please refer to the Genos GitHub repository.
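To make single-base resolution concrete, the sketch below shows nucleotide-level tokenization, where each base maps to its own token. The base-to-id mapping and pad id are purely illustrative assumptions, not the actual Genos vocabulary; the real tokenizer is defined in the Genos GitHub repository.

```python
# Illustrative sketch of single-base (nucleotide-level) tokenization.
# The token ids below are hypothetical placeholders; the actual Genos
# vocabulary (128/256 entries, padded) ships with the official code.

BASE_TO_ID = {"A": 4, "C": 5, "G": 6, "T": 7, "N": 8}  # hypothetical ids
PAD_ID = 0

def tokenize(sequence: str, max_len: int) -> list[int]:
    """Map each base to one token id, then pad to a fixed length."""
    ids = [BASE_TO_ID.get(base, BASE_TO_ID["N"]) for base in sequence.upper()]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

print(tokenize("ACGTN", max_len=8))  # -> [4, 5, 6, 7, 8, 0, 0, 0]
```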
The table below summarizes the training data volume and model parameters for each Genos variant.
| Model Specification | Genos 1.2B | Genos 10B |
|---|---|---|
| Model Scale | | |
| Total Parameters | 1.2B | 10B |
| Activated Parameters | 0.33B | 2.87B |
| Trained Tokens | 1600B | 2200B |
| Architecture | | |
| Architecture Type | MoE | MoE |
| Number of Experts | 8 | 8 |
| Selected Experts per Token | 2 | 2 |
| Number of Layers | 12 | 12 |
| Attention Hidden Dimension | 1024 | 4096 |
| Number of Attention Heads | 16 | 16 |
| MoE Hidden Dimension (per Expert) | 4096 | 8192 |
| Vocabulary Size | 128 (padded) | 256 (padded) |
| Context Length | up to 1M | up to 1M |
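The hyperparameters above can be collected into a single configuration object, as in the minimal sketch below. The field names and the `GenosConfig` dataclass are illustrative assumptions, not the actual config schema used by the Genos codebase; only the values are transcribed from the table.

```python
from dataclasses import dataclass

@dataclass
class GenosConfig:
    """Illustrative container for the hyperparameters in the table above.
    Field names are assumptions, not the Genos codebase's own schema."""
    total_params: str
    activated_params: str
    num_experts: int = 8
    experts_per_token: int = 2        # top-2 MoE routing
    num_layers: int = 12
    attn_hidden_dim: int = 1024
    num_attention_heads: int = 16
    moe_ffn_hidden_dim: int = 4096    # per expert
    vocab_size: int = 128             # padded
    max_context_length: int = 1_000_000

GENOS_1_2B = GenosConfig(total_params="1.2B", activated_params="0.33B")
GENOS_10B = GenosConfig(
    total_params="10B",
    activated_params="2.87B",
    attn_hidden_dim=4096,
    moe_ffn_hidden_dim=8192,
    vocab_size=256,  # padded
)
```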
Genos 1.2B and 10B checkpoints are available here:
We also provide checkpoints trained under the Megatron-LM framework:
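As a hedged sketch of how a checkpoint might be loaded, the snippet below uses the standard Hugging Face `transformers` loading API. The repository id is a placeholder, not a confirmed path, and `trust_remote_code=True` is an assumption based on the model using a custom architecture; substitute the actual checkpoint links above and consult the Genos GitHub for the supported loading procedure.

```python
# Hypothetical loading sketch using the Hugging Face transformers API.
# The repo id below is a placeholder; replace it with the real checkpoint
# path. trust_remote_code is assumed, since custom genomic architectures
# typically ship their own modeling code with the checkpoint.
from transformers import AutoModel, AutoTokenizer

repo_id = "GenosLab/Genos-1.2B"  # placeholder, not a confirmed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("ACGTACGT", return_tensors="pt")
outputs = model(**inputs)
```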