W4A16 (4-bit weight, 16-bit activation) GPTQ-quantized version of mistralai/Mistral-Large-Instruct-2407

Quantized with intel/auto-round, version v0.8.0.
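
To reproduce the quantization, install the matching release first (the PyPI package name auto-round is an assumption; check the intel/auto-round repository if it differs):

pip install auto-round==0.8.0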

Generation command line

auto-round --model mistral-large-2407 --scheme "W4A16" --format "auto_gptq" --dataset HuggingFaceH4/ultrachat_200k,claudy-chat-jk --output_dir "./mistral-large-2407-gptq"
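
A minimal loading sketch with transformers, assuming a GPTQ-capable backend such as gptqmodel or auto-gptq is installed; the local path matches the --output_dir above:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./mistral-large-2407-gptq"  # --output_dir from the command above
tokenizer = AutoTokenizer.from_pretrained(model_path)
# transformers dispatches GPTQ checkpoints to the installed GPTQ backend.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))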

Calibration dataset code

# Import paths below are assumed to match auto-round v0.8.0's calibration module.
from datasets import load_dataset
from auto_round.calib_dataset import apply_chat_template_to_samples, register_dataset
from auto_round.utils import logger


# Registered under both the Hub id and the short alias used in the command line above.
@register_dataset(["hell0ks/claudy-chat-JK-1k", "claudy-chat-jk"])
def get_claudy_dataset(
    tokenizer,
    seqlen,
    dataset_name="hell0ks/claudy-chat-JK-1k",
    split=None,
    seed=42,
    apply_chat_template=True,
    system_prompt=None,
):

    # Load the full calibration corpus and deterministically sample 1,000 conversations.
    dataset = load_dataset("hell0ks/claudy-chat-JK-1k", split="train", streaming=False, trust_remote_code=True)
    dataset = dataset.shuffle(seed=seed).take(1000)

    def is_instruct_tokenizer(tokenizer):
        # Probe the tokenizer: if it can render a chat template, treat the model as instruct/chat.
        try:
            out = tokenizer.apply_chat_template([{"role": "user", "content": "Hi"}])
            return bool(out and len(out) > 0)
        except Exception:
            return False

    is_instruct = is_instruct_tokenizer(tokenizer)

    # Reconcile the caller's apply_chat_template flag with what the tokenizer supports.
    if is_instruct and not apply_chat_template:
        logger.info("Tokenizer looks like an instruct/chat model, but apply_chat_template=False. Setting to True.")
        apply_chat_template = True
    elif not is_instruct and apply_chat_template:
        logger.info("Tokenizer is not an instruct/chat model, but apply_chat_template=True. Setting to False.")
        apply_chat_template = False

    def tokenize_example_batch(examples):
        if not apply_chat_template:
            # Plain-text path: concatenate each conversation's message contents and tokenize directly.
            texts = []
            for message_list in examples["messages"]:
                combined = "".join([msg["content"] for msg in message_list])
                texts.append(combined)
            return tokenizer(texts, truncation=True, max_length=seqlen)
        else:
            # Chat path: render each conversation through the tokenizer's chat template.
            return apply_chat_template_to_samples(examples["messages"], tokenizer, seqlen, system_prompt=system_prompt)

    dataset = dataset.map(tokenize_example_batch, batched=True)
    return dataset
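
For reference, the CLI run above corresponds roughly to this Python-side sketch; the AutoRound constructor arguments and save_quantized signature are assumed from auto-round's documented API and may differ slightly in v0.8.0:

from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model = AutoModelForCausalLM.from_pretrained("mistral-large-2407", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("mistral-large-2407")

# The short alias resolves through register_dataset once the module above is imported.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128,
                      dataset="HuggingFaceH4/ultrachat_200k,claudy-chat-jk")
autoround.quantize()
autoround.save_quantized("./mistral-large-2407-gptq", format="auto_gptq")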

Notice

This model is licensed by Mistral AI under the Mistral AI Research License. A copy of the license is available in LICENSE.md.
