---
license: apache-2.0
language: en
library_name: transformers
pipeline_tag: text-generation
tags:
- llama3
- medical
- rag
- finetuned
datasets:
- medquad
- icliniq
- NHS
- WEBMD
- NIH
model_creator: Bernard Kyei-Mensah
base_model: meta-llama/Meta-Llama-3-8B
inference: true
---

# 🩺 MediMaven Llama-3.1-8B (fp16, v1.1)

**A domain-adapted Llama-3 fine-tuned on ~150k high-quality Q&A pairs, merged to full-precision fp16 weights for maximum downstream flexibility.**

---

# ✨ Key points

| | |
|---|---|
|**Base model**|Meta-Llama-3-8B|
|**Tuning method**|QLoRA (4-bit) → merge to fp16|
|**Training data**|Curated MedQuAD v2; scraped articles from Mayo Clinic, NIH, NHS and WebMD|
|**Intended use**|Medical information retrieval, summarisation, chat|

> **Disclaimer** Outputs are *informational* and do **not** constitute medical advice.

---

# 🔥 Quick start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "dranreb1660/medimaven-llama3-8b-fp16"

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Explain first-line treatment for GERD in two sentences."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```

---

# 📊 Evaluation

| Metric | Base Llama-3 8B | **MediMaven** |
| --------------------------- | --------------- | ------------- |
| Medical MC-QA (exact-match) | 78.4 | **89.7** |
| F1 (MedQA-RAG) † | 0.71 | **0.83** |

# 🛠️ How we trained

- Built the dataset from de-duplicated, source-attributed passages (MedQuAD, Mayo Clinic, iCliniq); [check the dataset card for more info](https://huggingface.co/datasets/dranreb1660/medimaven-qa-data).
- Applied QLoRA (weights quantised to 4-bit) on an NVIDIA T4: 3 epochs, LR 3e-5, cosine schedule.
- Merged the LoRA adapters to fp16; ran AWQ (see separate repo) for production inference.
[Full training notebook](/training/notebooks/llama3_finetune.ipynb)

# 🚦 Limitations & bias

* Llama-3 license prohibits use in regulated "high-risk" settings.
* English-only; no guarantee of safe output in other languages.

# ⬆️ Versioning

* v1.1 = first public release (merged weights, new tokenizer template).
* For lighter deployment, see `medimaven-llama3-8b-awq`.

# 📜 Citation

```bibtex
@misc{medimaven2025llama3,
  title        = {MediMaven Llama-3.1-8B},
  author       = {Kyei-Mensah, Bernard},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/dranreb1660/medimaven-llama3-8b-fp16}}
}
```