---
license: mit
datasets:
- HuggingFaceFW/fineweb
language:
- en
pipeline_tag: text-generation

widget:
- text: "He is a doctor. His main goal is"
  example_title: " to help people."
- text: "My name is Merve and my favorite"
  example_title: "activity is reading."
---
# GPT3

Welcome to the GPT3 repository! This project is an attempt to recreate the architecture and approach from the original OpenAI GPT-3 paper. The repository includes scripts for training, fine-tuning, and inference of a GPT-3-like model using PyTorch and the Hugging Face Transformers library.
This repository hosts the weights of development checkpoints of my models. You can download a checkpoint folder, paste its path into inference.py, and chat with the model.
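Loading a downloaded checkpoint folder with the Transformers library could look roughly like the sketch below. This is not the repository's actual inference.py; the folder path, prompt, and generation settings are placeholders:

```python
# Hypothetical sketch: load a local GPT-3-style checkpoint folder and
# generate text with Hugging Face Transformers. The path is a placeholder;
# substitute the checkpoint folder you downloaded from this repo.
from pathlib import Path

from transformers import AutoModelForCausalLM, AutoTokenizer


def generate(checkpoint_dir: str, prompt: str, max_new_tokens: int = 40) -> str:
    """Generate a continuation of `prompt` using the model in `checkpoint_dir`."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
    model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.9
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    ckpt = "path/to/downloaded/checkpoint"  # placeholder path
    if Path(ckpt).is_dir():
        print(generate(ckpt, "He is a doctor. His main goal is"))
```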

# **You can find all code on [GitHub](https://github.com/krll-corp/GPT3)**
# Note: This model has 125 million parameters and was trained on 3.6B tokens. It is, of course, heavily undertrained, but it serves as a technology demonstrator.
# Note 2: This checkpoint, released on 06/12/2024, was trained for longer (batch size 12, gradient accumulation 4, 512-token context, 600,000 steps). It scores 27.65% on MMLU, slightly above the 25% random-guess baseline.
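A back-of-the-envelope token count for the Note 2 run, assuming those settings (batch size 12, gradient accumulation 4, 512-token context) held for all 600,000 steps:

```python
# Rough token-budget estimate for the Note 2 checkpoint, assuming the stated
# settings applied for the entire run (an assumption, not an official figure).
batch_size = 12
grad_accum = 4
seq_len = 512
steps = 600_000

tokens_per_step = batch_size * grad_accum * seq_len  # 24,576 tokens per optimizer step
total_tokens = tokens_per_step * steps               # ~14.7B tokens
print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")
```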

## Contributing

Contributions are welcome! I'm just a student who is interested in AI, so my code may contain mistakes or logical issues. Please open an issue or submit a pull request for any improvements or bug fixes; I'll be happy to review them.

## License

This project is licensed under the MIT License. See the LICENSE file for details. Everyone is free to use and modify this code at their discretion.

## Acknowledgements

Thanks to OpenAI, Hugging Face, and PyTorch for making this project possible!

- [OpenAI GPT-3 Paper](https://arxiv.org/abs/2005.14165)
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [PyTorch](https://pytorch.org/)