Spestly committed
Commit 23728a1 · verified · 1 Parent(s): 9e2e686

Update README.md

Files changed (1)
README.md +13 -16
README.md CHANGED
@@ -41,15 +41,13 @@ tags:
 - STEM
 - unsloth
 ---
-![Header](Maverick.png)
+# **Athena-3-7B Model Card**
 
-# **Maverick-1-7B Model Card**
-
-*Maverick generated this model card!*
+*Athena generated this model card!*
 
 ## **Model Overview**
 
-**Maverick-1-7B** is a 7.68-billion-parameter causal language model fine-tuned from Qwen2.5-Math-7B. This model is designed to excel in STEM reasoning, mathematics, and natural language processing tasks, offering advanced instruction-following and problem-solving capabilities.
+**Athena-3-7B** is a 7.68-billion-parameter causal language model fine-tuned from Qwen2.5-Math-7B. This model is designed to excel in STEM reasoning, mathematics, and natural language processing tasks, offering advanced instruction-following and problem-solving capabilities.
 
 ## **Model Details**
 
@@ -66,33 +64,33 @@ tags:
 
 ## **Training Details**
 
-Maverick-1-7B was fine-tuned using the Unsloth framework on a single NVIDIA A100 GPU. The fine-tuning process spanned approximately 90 minutes over 60 epochs, utilizing a curated dataset focused on instruction-following, problem-solving, and advanced mathematics. This approach enhances the models capabilities in academic and analytical tasks.
+Athena-3-7B was fine-tuned using the Unsloth framework on a single NVIDIA A100 GPU. The fine-tuning process spanned approximately 90 minutes over 60 epochs, utilizing a curated dataset focused on instruction-following, problem-solving, and advanced mathematics. This approach enhances the model's capabilities in academic and analytical tasks.
 
 ## **Intended Use**
 
-Maverick-1-7B is designed for a range of applications, including but not limited to:
+Athena-3-7B is designed for a range of applications, including but not limited to:
 
 - **STEM Reasoning:** Assisting with complex problem-solving and theoretical explanations.
 - **Academic Assistance:** Supporting tutoring, step-by-step math solutions, and scientific writing.
 - **General NLP Tasks:** Text generation, summarization, and question answering.
 - **Data Analysis:** Interpreting and explaining mathematical and statistical data.
 
-While Maverick-1-7B is a powerful tool for various applications, it is not intended for real-time, safety-critical systems or for processing sensitive personal information.
+While Athena-3-7B is a powerful tool for various applications, it is not intended for real-time, safety-critical systems or for processing sensitive personal information.
 
 ## **How to Use**
 
-To utilize Maverick-1-7B, ensure that you have the latest version of the `transformers` library installed:
+To utilize Athena-3-7B, ensure that you have the latest version of the `transformers` library installed:
 
 ```bash
 pip install transformers
 ```
 
-Here's an example of how to load the Maverick-1-7B model and generate a response:
+Here's an example of how to load the Athena-3-7B model and generate a response:
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "Spestly/Maverick-1-7B"
+model_name = "Spestly/Athena-3-7B"
 model = AutoModelForCausalLM.from_pretrained(
     model_name,
     torch_dtype="auto",
@@ -129,19 +127,18 @@ To use this model with Maverick Search, please refer to this [repository](https:
 
 Users should be aware of the following limitations:
 
-- **Biases:** Maverick-1-7B may exhibit biases present in its training data. Users should critically assess outputs, especially in sensitive contexts.
+- **Biases:** Athena-3-7B may exhibit biases present in its training data. Users should critically assess outputs, especially in sensitive contexts.
 - **Knowledge Cutoff:** The model's knowledge is current up to August 2024. It may not be aware of events or developments occurring after this date.
 - **Language Support:** While the model supports multiple languages, performance is strongest in English and technical content.
 
 ## **Acknowledgements**
 
-Maverick-1-7B builds upon the work of the Qwen team. Gratitude is also extended to the open-source AI community for their contributions to tools and frameworks that facilitated the development of Maverick-1-7B.
+Athena-3-7B builds upon the work of the Qwen team. Gratitude is also extended to the open-source AI community for their contributions to tools and frameworks that facilitated the development of Athena-3-7B.
 
 ## **License**
 
-Maverick-1-7B is released under the MIT License, permitting wide usage with proper attribution.
+Athena-3-7B is released under the MIT License, permitting wide usage with proper attribution.
 
 ## **Contact**
 
-- Email: [email protected]
-
+- Email: [email protected]
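
The second hunk cuts the README's Python example off at `torch_dtype="auto",`. As a reference for readers of this commit, below is a minimal load-and-generate sketch built on the standard `transformers` API; the `device_map` setting, the example prompt, and the generation parameters are illustrative assumptions rather than part of the README, and the chat-template call assumes the tokenizer provides one, as Qwen2.5-based tokenizers typically do.

```python
# Hypothetical completion of the README's truncated snippet. Only model_name,
# from_pretrained, and torch_dtype appear in the commit; everything else here
# (device_map, the prompt, max_new_tokens) is an illustrative assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Spestly/Athena-3-7B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # let transformers pick a dtype suited to the hardware
    device_map="auto",    # assumption: requires the accelerate package
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build the prompt via the chat template (Qwen2.5-based tokenizers ship one).
messages = [{"role": "user", "content": "Differentiate f(x) = x^2 * sin(x)."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Slicing the output at `input_ids.shape[-1]` keeps the echoed prompt out of the printed response.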