Update README.md
README.md CHANGED
@@ -17,7 +17,7 @@ library_name: transformers
 > FlexOlmo-7x7B-1T (without router training) is a Mixture-of-Experts with 33B total parameters, combining independently trained experts on public-mix, news, math, code, academic texts, creative writing, and Reddit data. The public-mix expert is trained on 1T tokens of public data while the other experts are branched from the public-mix expert and trained on 50B tokens of their respective data.
 
 This information and more can also be found:
-- **Paper**: https://allenai.org/papers/
+- **Paper**: https://allenai.org/papers/flexolmo
 - **Code**: https://github.com/allenai/FlexOlmo
 - **Blog**: https://allenai.org/blog/flexolmo
 - **Data and corresponding models**:

@@ -72,6 +72,6 @@ print(tokenizer.decode(out[0]))
   eprint={2507.00000},
   archivePrefix={arXiv},
   primaryClass={cs.CL},
-  url={https://allenai.org/papers/
+  url={https://allenai.org/papers/flexolmo},
 }
 ```
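The context line of the second hunk, `print(tokenizer.decode(out[0]))`, indicates that the README already contains a transformers-based usage snippet. For readers of this diff, a minimal sketch of such usage follows; the model id `allenai/FlexOlmo-7x7B-1T`, the prompt, and the generation settings are assumptions for illustration, not the repository's exact snippet.

```python
# Minimal usage sketch, not the README's own snippet.
# Assumptions: the checkpoint is published as "allenai/FlexOlmo-7x7B-1T" and loads
# through the standard transformers AutoModelForCausalLM / AutoTokenizer API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/FlexOlmo-7x7B-1T"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a prompt, generate a continuation, and decode it,
# ending with the same decode call the diff context references.
inputs = tokenizer("Mixture-of-Experts models combine", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0]))
```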