- STEM
- unsloth
---

# **Athena-3-7B Model Card**

*Athena generated this model card!*

## **Model Overview**

**Athena-3-7B** is a 7.68-billion-parameter causal language model fine-tuned from Qwen2.5-Math-7B. This model is designed to excel in STEM reasoning, mathematics, and natural language processing tasks, offering advanced instruction-following and problem-solving capabilities.
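For a rough sense of hardware requirements (a back-of-the-envelope estimate, not an official figure from this card): at half precision the weights alone take about 2 bytes per parameter.

```python
# Approximate GPU memory needed just to hold the weights in bf16/fp16.
# Excludes activations, the KV cache, and framework overhead.
params = 7.68e9        # parameter count stated above
bytes_per_param = 2    # bfloat16 / float16
weight_gib = params * bytes_per_param / 2**30
print(f"~{weight_gib:.1f} GiB")  # ~14.3 GiB
```

In practice this means a single 16 GB GPU is tight for full-precision inference, which is why quantized variants are commonly used on smaller cards.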

## **Model Details**

## **Training Details**

Athena-3-7B was fine-tuned using the Unsloth framework on a single NVIDIA A100 GPU. The fine-tuning process spanned approximately 90 minutes over 60 epochs, utilizing a curated dataset focused on instruction-following, problem-solving, and advanced mathematics. This approach enhances the model's capabilities in academic and analytical tasks.

## **Intended Use**

Athena-3-7B is designed for a range of applications, including but not limited to:

- **STEM Reasoning:** Assisting with complex problem-solving and theoretical explanations.
- **Academic Assistance:** Supporting tutoring, step-by-step math solutions, and scientific writing.
- **General NLP Tasks:** Text generation, summarization, and question answering.
- **Data Analysis:** Interpreting and explaining mathematical and statistical data.

While Athena-3-7B is a powerful tool for various applications, it is not intended for real-time, safety-critical systems or for processing sensitive personal information.

## **How to Use**

To utilize Athena-3-7B, ensure that you have the latest version of the `transformers` library installed:

```bash
pip install transformers
```

Here's an example of how to load the Athena-3-7B model and generate a response:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Spestly/Athena-3-7B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",  # requires the accelerate package
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("Solve for x: 2x + 5 = 17.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
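The example above sends plain text to the model. Because Athena-3-7B is built on a Qwen2.5 base, chat-style prompts normally go through `tokenizer.apply_chat_template`, which renders a ChatML-style layout, and math-tuned Qwen models conventionally return the final answer inside `\boxed{}`. A minimal sketch of both conventions (the system prompt and helper names here are illustrative assumptions, not part of this model card):

```python
import re

def build_chatml_prompt(question: str) -> str:
    # Illustrative ChatML layout; in practice prefer tokenizer.apply_chat_template.
    system = "Please reason step by step, and put your final answer within \\boxed{}."
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

def extract_boxed(response: str):
    # Return the content of the last \boxed{...} in a response, or None.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1] if matches else None

print(extract_boxed(r"2x = 12, so x = 6. \boxed{6}"))  # 6
```

Parsing the boxed answer is a simple way to score the model's math outputs automatically.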

To use this model with Maverick Search, please refer to this [repository](https:

## **Limitations**

Users should be aware of the following limitations:

- **Biases:** Athena-3-7B may exhibit biases present in its training data. Users should critically assess outputs, especially in sensitive contexts.
- **Knowledge Cutoff:** The model's knowledge is current up to August 2024. It may not be aware of events or developments occurring after this date.
- **Language Support:** While the model supports multiple languages, performance is strongest in English and technical content.

## **Acknowledgements**

Athena-3-7B builds upon the work of the Qwen team. Gratitude is also extended to the open-source AI community for their contributions to tools and frameworks that facilitated the development of Athena-3-7B.

## **License**

Athena-3-7B is released under the MIT License, permitting wide usage with proper attribution.

## **Contact**

- Email: [email protected]