---
base_model:
- grimjim/Magnolia-v3-medis-remix-12B
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0
base_model_relation: quantized
quanted_by: grimjim
---
# Magnolia-v3-medis-remix-12B-GGUF

These are GGUF quants of a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). In addition to Nemo Instruct being a major component, a medical fine-tune was incorporated as a "noise" component.

## Chat Template

The underlying Mistral Nemo 2407 model was tuned to work with Mistral's Tekken Instruct chat template, which is close to their Tokenizer V3.

```
{{ bos_token }}
{% for message in messages %}
    {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
        {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
    {% endif %}
    {% if message['role'] == 'user' %}
        {{ '[INST]' + message['content'] + '[/INST]' }}
    {% elif message['role'] == 'assistant' %}
        {{ message['content'] + eos_token }}
    {% else %}
        {{ raise_exception('Only user and assistant roles are supported!') }}
    {% endif %}
{% endfor %}
```

Refer to [Demystifying Mistral's Instruct Tokenization & Chat Templates](https://github.com/mistralai/cookbook/blob/main/concept-deep-dive/tokenization/chat_templates.md) for more details.

## Merge Details

### Merge Method

This model was merged using the [Task Arithmetic](https://arxiv.org/abs/2212.04089) merge method, using [grimjim/mistralai-Mistral-Nemo-Base-2407](https://huggingface.co/grimjim/mistralai-Mistral-Nemo-Base-2407) as the base.
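For intuition, the chat template's logic can be mirrored in plain Python. This is a hypothetical sketch (the `render_tekken_prompt` helper and the default `<s>`/`</s>` token strings are assumptions for illustration, not part of any library); it shows the prompt string the template produces from a message list.

```python
# Plain-Python mirror of the Jinja chat template above (hypothetical helper,
# not a library function): user turns are wrapped in [INST]...[/INST],
# assistant turns are terminated with the EOS token.
def render_tekken_prompt(messages, bos_token="<s>", eos_token="</s>"):
    parts = [bos_token]
    for i, message in enumerate(messages):
        # Roles must alternate, starting with a user turn (loop.index0 % 2 == 0).
        if (message["role"] == "user") != (i % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant/...")
        if message["role"] == "user":
            parts.append("[INST]" + message["content"] + "[/INST]")
        elif message["role"] == "assistant":
            parts.append(message["content"] + eos_token)
        else:
            raise ValueError("Only user and assistant roles are supported!")
    return "".join(parts)

print(render_tekken_prompt([
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "Thanks!"},
]))
# → <s>[INST]What is 2+2?[/INST]4</s>[INST]Thanks![/INST]
```

Note that the final user turn is left open (no EOS), which is where the model's generation begins.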
### Models Merged

The following models were included in the merge:

* [grimjim/magnum-consolidatum-v1-12b](https://huggingface.co/grimjim/magnum-consolidatum-v1-12b)
* [exafluence/EXF-Medistral-Nemo-12B](https://huggingface.co/exafluence/EXF-Medistral-Nemo-12B)
* [nbeerbower/Mistral-Nemo-Prism-12B](https://huggingface.co/nbeerbower/Mistral-Nemo-Prism-12B)
* [grimjim/magnum-twilight-12b](https://huggingface.co/grimjim/magnum-twilight-12b)
* [grimjim/mistralai-Mistral-Nemo-Instruct-2407](https://huggingface.co/grimjim/mistralai-Mistral-Nemo-Instruct-2407)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: grimjim/mistralai-Mistral-Nemo-Base-2407
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: true
slices:
- sources:
  - layer_range: [0, 40]
    model: grimjim/mistralai-Mistral-Nemo-Base-2407
  - layer_range: [0, 40]
    model: grimjim/mistralai-Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.9
  - layer_range: [0, 40]
    model: grimjim/magnum-consolidatum-v1-12b
    parameters:
      weight: 0.1
  - layer_range: [0, 40]
    model: grimjim/magnum-twilight-12b
    parameters:
      weight: 0.001
  - layer_range: [0, 40]
    model: exafluence/EXF-Medistral-Nemo-12B
    parameters:
      weight: 0.000001
  - layer_range: [0, 40]
    model: nbeerbower/Mistral-Nemo-Prism-12B
    parameters:
      weight: 0.05
```
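For intuition about what the configuration above computes, task arithmetic can be sketched on toy tensors. This is a hypothetical illustration, not mergekit's actual code; it assumes the merged weights are `base + Σ wᵢ · (modelᵢ − base)` per parameter, and that `normalize: true` divides the weights by their sum.

```python
# Toy sketch of task-arithmetic merging (assumption: merged parameters are
# base + sum_i w_i * (model_i - base), i.e. a weighted sum of "task vectors",
# with weights rescaled to sum to 1 when normalize is true).
def task_arithmetic(base, finetunes, weights, normalize=True):
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = list(base)
    for model, w in zip(finetunes, weights):
        for j, (b, p) in enumerate(zip(base, model)):
            merged[j] += w * (p - b)  # add this model's weighted task vector
    return merged

# Two toy "fine-tunes" of a two-parameter base model:
base = [1.0, 2.0]
finetunes = [[2.0, 2.0], [1.0, 4.0]]
print(task_arithmetic(base, finetunes, [0.5, 0.5]))
# → [1.5, 3.0]
```

The tiny weights in the configuration (down to `0.000001` for the medical fine-tune) make sense under this view: each model contributes its delta from the base in proportion to its weight, so a near-zero weight injects only a trace of that model's behavior.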