Created README
README.md
CHANGED
|
@@ -1,3 +1,37 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
datasets:
|
| 4 |
+
- HuggingFaceFW/fineweb
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
pipeline_tag: text-generation
|
| 8 |
+
|
| 9 |
+
widget:
|
| 10 |
+
- text: "He is a doctor. His main goal is"
|
| 11 |
+
example_title: " to help people."
|
| 12 |
+
- text: "My name is Merve and my favorite"
|
| 13 |
+
example_title: "activity is reading."
|
| 14 |
+
---
|
| 15 |
+
# GPT3
|
| 16 |
+
|
| 17 |
+
Welcome to the GPT3 repository! This project is an attempt to recreate the architecture and approach from the original OpenAI GPT-3 paper. The repository includes scripts for training, fine-tuning, and inference of a GPT-3-like model using PyTorch and the Hugging Face Transformers library.
|
| 18 |
+
This repository hosts the weights of development checkpoints of my models. To chat with one, download its folder and paste its path into inference.py.
|
| 19 |
+
|
| 20 |
+
# **You can find all code on [GitHub](https://github.com/krll-corp/GPT3)**
|
| 21 |
+
# Note: This model has 125 million parameters and was trained on 3.6B tokens. (It is, of course, heavily undertrained; it is meant as a technology demonstrator.)
|
| 22 |
+
# Note 2: This checkpoint was released on 06/12 2024 and has been trained for longer (batch size 12, 4 gradient accumulation steps, 512 tokens per sequence, 600,000 steps). It scores 27.65% on MMLU, which is only slightly above the 25% random-guess baseline.
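
As a rough sanity check on the training budget, the hyperparameters in Note 2 can be turned into a token count. The sketch below is illustrative only: the helper `tokens_seen` is hypothetical (it is not part of this repository), and it assumes that "512 tokens" is the sequence length and that every sequence is fully packed. The note does not say whether the 600,000 steps count micro-batches or optimizer updates, so both readings are shown.

```python
def tokens_seen(batch_size: int, seq_len: int, steps: int, grad_accum: int = 1) -> int:
    """Tokens processed after `steps` steps.

    Each step is assumed to consume `grad_accum` micro-batches of
    `batch_size` sequences, each `seq_len` tokens long (no padding).
    """
    return batch_size * seq_len * grad_accum * steps


# Reading 1 (assumption): 600,000 counts micro-batches of 12 x 512 tokens.
per_micro_batch = tokens_seen(batch_size=12, seq_len=512, steps=600_000)

# Reading 2 (assumption): 600,000 counts optimizer updates,
# each accumulating gradients over 4 micro-batches.
per_update = tokens_seen(batch_size=12, seq_len=512, steps=600_000, grad_accum=4)

print(f"micro-batch reading: {per_micro_batch / 1e9:.2f}B tokens")  # 3.69B
print(f"optimizer reading:   {per_update / 1e9:.2f}B tokens")       # 14.75B
```

Under the micro-batch reading the total is close to the 3.6Bn figure quoted in the first note. Under the optimizer-update reading it is roughly four times larger. Which one applies depends on how the training script counts steps.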
|
| 23 |
+
## Contributing
|
| 24 |
+
|
| 25 |
+
Contributions are welcome! I'm just a student who is interested in AI, so my code may be incorrect or contain logic errors. Please open an issue or submit a pull request for any improvements or bug fixes. I will be happy to receive them.
|
| 26 |
+
|
| 27 |
+
## License
|
| 28 |
+
|
| 29 |
+
This project is licensed under the MIT License. See the LICENSE file for details. Anyone may use and modify this code at their discretion.
|
| 30 |
+
|
| 31 |
+
## Acknowledgements
|
| 32 |
+
|
| 33 |
+
Thanks to OpenAI, Hugging Face, and PyTorch for making this project possible!
|
| 34 |
+
|
| 35 |
+
- [OpenAI GPT-3 Paper](https://arxiv.org/abs/2005.14165)
|
| 36 |
+
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
|
| 37 |
+
- [PyTorch](https://pytorch.org/)
|