---
license: apache-2.0
datasets:
- mlfoundations/dclm-baseline-1.0-parquet
---
|
|
|
|
|
# Covenant72B |
|
|
|
|
|
**Covenant72B** is the largest language model trained entirely from scratch through permissionless collaboration, at the 72-billion-parameter scale.
|
|
|
|
|
It is being trained by 20+ globally distributed participants, coordinated through decentralized infrastructure on the Bittensor blockchain.
|
|
|
|
|
**Checkpoint-One** marks the first release, corresponding to **210 billion tokens processed**. Model files are available in the [Checkpoint-One branch](https://huggingface.co/tplr/Covenant72B/tree/Checkpoint-One). Future checkpoints will be published here.
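
The checkpoint branch can be loaded with Hugging Face `transformers`. The sketch below is a usage illustration under assumptions not stated in this card: it assumes the branch ships standard LLaMA-style weights compatible with `AutoModelForCausalLM`, and note that a 72B model in bf16 needs on the order of 145 GB of accelerator memory.

```python
# Sketch: load the Checkpoint-One branch of this repo.
# Assumes standard LLaMA-style weights loadable via AutoModelForCausalLM.
repo_id = "tplr/Covenant72B"
revision = "Checkpoint-One"  # branch holding this checkpoint's files

def load_checkpoint(repo_id: str = repo_id, revision: str = revision):
    """Download the tokenizer and sharded weights for the given branch."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        revision=revision,
        torch_dtype="auto",   # keep the checkpoint's native dtype
        device_map="auto",    # shard across available accelerators
    )
    return tokenizer, model
```

Passing `revision="Checkpoint-One"` pins the download to this checkpoint's branch rather than `main`, so future checkpoint pushes will not change what is loaded.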
|
|
|
|
|
|
|
|
|
|
--- |
|
|
|
|
|
## Training Details |
|
|
|
|
|
| Property | Value |
|-----------|--------|
| **Model size** | 72B |
| **Architecture** | LLaMA-style |
| **Target token budget** | 1.2T (210B for current checkpoint) |
| **Compute participants** | 20+ |
| **Minimum compute per participant** | 8×B200 or equivalent |
| **Dataset** | DCLM-baseline |
| **Optimizer** | SparseLoCo (communication-efficient optimizer) |
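
Communication-efficient optimizers of this kind save bandwidth by transmitting only a sparse subset of each update between participants. The toy sketch below illustrates generic top-k sparsification with error feedback; it is an illustration of the general technique, not the actual SparseLoCo algorithm.

```python
import numpy as np

def topk_with_error_feedback(grad, residual, k):
    """Keep the k largest-magnitude entries of grad + residual for
    transmission; everything else stays in the residual and is carried
    into the next step (error feedback)."""
    full = grad + residual
    idx = np.argsort(np.abs(full))[-k:]   # indices of top-k magnitudes
    sparse = np.zeros_like(full)
    sparse[idx] = full[idx]               # values actually transmitted
    new_residual = full - sparse          # untransmitted mass, kept locally
    return sparse, new_residual

grad = np.array([0.1, -2.0, 0.3, 1.5, -0.05])
residual = np.zeros_like(grad)
sparse, residual = topk_with_error_feedback(grad, residual, k=2)
# Only the two largest-magnitude entries (-2.0 and 1.5) are transmitted.
```

Entries below the top-k threshold are not discarded: the residual accumulates them across steps, so small but persistent gradient components are eventually transmitted.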
|
|
|
|
|
--- |
|
|
|
|
|
## Performance on Benchmarks |
|
|
_All results are 0-shot acc-norm (%)_ |
|
|
|
|
|
| Model | Compute Environment / Permissions | Size | Tokens | ARC-C | ARC-E | PIQA | OpenBookQA | HellaSwag | Winogrande | MMLU |
|:------|:----------------------------------|------:|--------:|------:|------:|------:|------------:|-----------:|-------------:|------:|
| **Intellect-1** | Over the internet / Whitelist | 10B | 1T | 44.8 | 71.6 | 77.7 | 43.6 | 70.5 | 63.1 | 32.7 |
| **Psyche Consilience-7Y9** | Over the internet / Whitelist | 40B | 1.2T | 31.1 | 55.8 | 76.1 | 34.8 | 63.7 | 57.0 | 24.2 |
| **Covenant72B – Checkpoint One** | Over the internet / Permissionless | 72B | 210B | 46.2 | 72.6 | 79.2 | 43.0 | 73.5 | 70.3 | 38.0 |
| **K2 Checkpoint 54** | Centralized Cluster | 65B | 210B | 41.8 | 69.5 | 80.1 | 42.4 | 74.9 | 68.9 | 33.7 |
|
|
|
|
|
--- |
|
|
|
|
|
For more details, refer to [Checkpoint One on Templar Research](https://templarresearch.substack.com/p/checkpoint-one). |
|
|
|