dranreb1660 committed on
Commit
18fb67b
·
1 Parent(s): aa8f1bc

Added model card readme

Files changed (1)
  1. README.md +90 -0
README.md ADDED
@@ -0,0 +1,90 @@
+ ---
+ license: apache-2.0
+ language: en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - llama3
+ - medical
+ - rag
+ - finetuned
+ datasets:
+ - medquad
+ - icliniq
+ - NHS
+ - WEBMD
+ - NIH
+ model_creator: Bernard Kyei-Mensah
+ base_model: meta-llama/Meta-Llama-3-8B
+ inference: true
+ ---
+
+ # 🩺 MediMaven Llama-3 8B (fp16, v1.1)
+
+ **A domain-adapted Llama-3 fine-tuned on ~150k high-quality medical Q&A pairs, with the LoRA adapters merged into unquantized fp16 weights for maximum downstream flexibility.**
+
+ ---
+
+ # ✨ Key points
+ | | |
+ |---|---|
+ |**Base model**|Meta-Llama-3-8B|
+ |**Tuning method**|QLoRA (4-bit) → merged to fp16|
+ |**Training data**|Curated MedQuAD v2 plus scraped articles from Mayo Clinic, NIH, NHS, and WebMD|
+ |**Intended use**|Medical information retrieval, summarisation, chat|
+
+ > **Disclaimer:** Outputs are *informational* and do **not** constitute medical advice.
+
+ ---
+
+ # 🔥 Quick start
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tok = AutoTokenizer.from_pretrained("dranreb1660/medimaven-llama3-8b-fp16")
+ model = AutoModelForCausalLM.from_pretrained(
+     "dranreb1660/medimaven-llama3-8b-fp16",
+     torch_dtype=torch.float16,   # fp16 weights, as shipped
+     device_map="auto",           # shard across available devices
+ )
+
+ prompt = "Explain first-line treatment for GERD in two sentences."
+ inputs = tok(prompt, return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=64)
+ print(tok.decode(output[0], skip_special_tokens=True))
+ ```
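+
+ If the checkpoint ships a chat template (v1.1 notes a new tokenizer template), chat-style prompting can reuse `tok` and `model` from above; a minimal sketch, assuming the standard transformers chat-template API:
+
+ ```python
+ # Sketch: chat-style prompting via the tokenizer's chat template.
+ # Assumes the checkpoint ships a template (see the versioning notes below).
+ messages = [{"role": "user", "content": "What are common triggers for migraine?"}]
+ input_ids = tok.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+ out = model.generate(input_ids, max_new_tokens=64)
+ print(tok.decode(out[0], skip_special_tokens=True))
+ ```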
+ ---
+ # 📊 Evaluation
+ | Metric | Base Llama-3 8B | **MediMaven** |
+ | --------------------------- | --------------- | ------------- |
+ | Medical MC-QA (exact-match) | 78.4 | **89.7** |
+ | F1 (MedQA-RAG) † | 0.71 | **0.83** |
+
+
+ # 🛠️ How we trained
+ - Built the dataset from de-duplicated, source-attributed passages (MedQuAD, Mayo Clinic, iCliniq); [check the dataset card for more info](https://huggingface.co/datasets/dranreb1660/medimaven-qa-data).
+ - Applied QLoRA (32-bit → 4-bit) on an NVIDIA T4: 3 epochs, LR 3e-5, cosine schedule (a setup sketch follows this list).
+ - Merged the LoRA adapters into fp16, then ran AWQ (see the separate repo) for production inference; a merge sketch appears after the notebook link.
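+
+ A minimal sketch of the QLoRA setup under the stated hyperparameters; the LoRA rank, alpha, target modules, and batch settings below are illustrative assumptions, not values from this card:
+
+ ```python
+ # Hedged sketch: 4-bit QLoRA fine-tune with the card's stated hyperparameters
+ # (3 epochs, LR 3e-5, cosine schedule). Rank/alpha/target modules/batch
+ # settings are assumptions for illustration.
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
+ from peft import LoraConfig, get_peft_model
+
+ bnb = BitsAndBytesConfig(
+     load_in_4bit=True,                      # base weights quantized to 4-bit
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.float16,
+ )
+ base = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Meta-Llama-3-8B", quantization_config=bnb, device_map="auto"
+ )
+ lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM",
+                   target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
+ model = get_peft_model(base, lora)
+
+ args = TrainingArguments(
+     output_dir="qlora-out",
+     num_train_epochs=3,
+     learning_rate=3e-5,
+     lr_scheduler_type="cosine",
+     per_device_train_batch_size=1,          # T4-sized; assumption
+     gradient_accumulation_steps=16,         # assumption
+     fp16=True,
+ )
+ ```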
69
+
70
+ [Full training notebook](/training/notebooks/llama3_finetune.ipynb)
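+
+ And a minimal sketch of the merge step, assuming the adapters were saved to a local checkpoint (the `qlora-out` path is a placeholder):
+
+ ```python
+ # Hedged sketch: fold trained LoRA adapters back into fp16 base weights.
+ import torch
+ from transformers import AutoModelForCausalLM
+ from peft import PeftModel
+
+ base = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.float16
+ )
+ merged = PeftModel.from_pretrained(base, "qlora-out").merge_and_unload()
+ merged.save_pretrained("medimaven-llama3-8b-fp16")
+ ```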
+
+ # 🚦 Limitations & bias
+ * The Llama-3 license prohibits use in regulated "high-risk" settings.
+ * English-only; no guarantee of safe output in other languages.
+
+
+ # ⬆️ Versioning
+ * v1.1 = first public release (merged weights, new tokenizer template).
+ * For lighter deployment, see `medimaven-llama3-8b-awq` (a loading sketch follows this list).
+
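+ If `autoawq` is installed, recent transformers versions can load AWQ checkpoints directly; the namespaced repo id below is an assumption based on the note above:
+
+ ```python
+ # Hedged sketch: loading the AWQ variant. The repo id is assumed, not confirmed.
+ from transformers import AutoModelForCausalLM
+
+ awq_model = AutoModelForCausalLM.from_pretrained(
+     "dranreb1660/medimaven-llama3-8b-awq", device_map="auto"
+ )
+ ```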
+
+ # 📜 Citation
+ ```bibtex
+ @misc{medimaven2025llama3,
+   title = {MediMaven Llama-3 8B},
+   author = {Kyei-Mensah, Bernard},
+   year = {2025},
+   howpublished = {\url{https://huggingface.co/medimaven-ai/medimaven-llama3-8b-fp16}}
+ }
+ ```