3rd-Degree-Burn committed on
Commit b939b84 · verified · 1 Parent(s): 9e12ab6

Update README.md

Files changed (1)
  1. README.md +13 -1
README.md CHANGED
@@ -11,7 +11,19 @@ tags:
  
  # Llama-3.1-8B-Squareroot
  
- Llama-3.1-8B-Squareroot is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
+ This is a TIES merge that combines the performance of the following models:
  * [NousResearch/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3.1-8B-Instruct)
  * [EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta](https://huggingface.co/EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta)
  * [nvidia/OpenMath2-Llama3.1-8B](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6479f6dbed75e95d3e97bb4d/LpWI-ug9WZdpcrjBy44iw.png)
+
+ # Description
+
+ I observed that when a model is trained to do just math, it does badly on everything else. So my plan was to merge a “math” model with a strong reasoning/inference model and a general instruction-following model. The result should be a model that's steerable (able to follow instructions) and still good at math.
+
+ # Examples
+
+ # Benchmarks
+
+ Coming very soon!
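The README change describes the model as a TIES merge but does not show how the merge was configured. For readers unfamiliar with the method, the sketch below illustrates the general TIES idea on plain Python lists: compute each model's difference from a base, trim small changes, elect a sign per parameter, and average only the changes that agree with that sign. It is not the author's actual recipe and not mergekit's API; the function name `ties_merge` and the parameters `density` and `lam` are hypothetical, chosen for illustration only.

```python
def ties_merge(base, finetuned, density=0.5, lam=1.0):
    """Toy TIES-style merge of flat weight lists (illustrative, not mergekit)."""
    n = len(base)

    # 1. Task vectors: how far each fine-tuned model moved from the base.
    task_vectors = [[ft[i] - base[i] for i in range(n)] for ft in finetuned]

    # 2. Trim: keep only the largest-magnitude fraction of each task vector.
    keep = max(1, int(round(density * n)))
    trimmed = []
    for tv in task_vectors:
        order = sorted(range(n), key=lambda i: abs(tv[i]), reverse=True)
        top = set(order[:keep])
        trimmed.append([tv[i] if i in top else 0.0 for i in range(n)])

    merged = []
    for i in range(n):
        column = [tv[i] for tv in trimmed]

        # 3. Elect sign: the direction with the larger total mass wins.
        sign = 1.0 if sum(column) >= 0 else -1.0

        # 4. Disjoint mean: average only the values agreeing with that sign.
        agreeing = [v for v in column if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0

        # 5. Add the scaled merged change back onto the base weight.
        merged.append(base[i] + lam * delta)
    return merged


# Hypothetical toy weights: one base and two fine-tuned variants.
base = [0.0, 0.0, 0.0, 0.0]
math_model = [1.0, -2.0, 0.1, 0.0]
instruct_model = [3.0, 1.0, -0.2, 0.0]
print(ties_merge(base, [math_model, instruct_model], density=0.5))
# prints [2.0, -2.0, 0.0, 0.0]
```

In the toy run, the second parameter shows the sign election at work: the two models disagree (-2.0 versus 1.0), the negative direction has more mass, and only the -2.0 change survives. A real merge applies the same steps tensor by tensor, with per-model weights and densities set in the merge tool's configuration.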