Tags: Text Generation · Transformers · GGUF · conversational


QuantFactory/instinct-GGUF

This is a quantized version of continuedev/instinct, created using llama.cpp.

Original Model Card

Instinct, the State-of-the-Art Open Next Edit Model

This repo contains the model weights for Continue's state-of-the-art open Next Edit model, Instinct. Robustly fine-tuned from Qwen2.5-Coder-7B on a dataset of real-world code edits, Instinct predicts your next edit to keep you in flow.

Serving the model

Ollama: We've released a Q4_K_M GGUF quantization of Instinct for efficient local inference. Try it with Continue's Ollama integration, or just run ollama run nate/instinct.

You can also serve the model using either of the options below, then connect it to Continue.

SGLang: python3 -m sglang.launch_server --model-path continuedev/instinct --load-format safetensors
vLLM: vllm serve continuedev/instinct --served-model-name instinct --load-format safetensors
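Both SGLang and vLLM expose an OpenAI-compatible HTTP endpoint, so once the server is running you can query it with a plain chat-completions request. A minimal sketch, assuming vLLM's default port 8000 and the `--served-model-name instinct` from the command above (the port and prompt are illustrative, not from the model card):

```python
import json
import urllib.request

def build_chat_request(prompt, model="instinct",
                       base_url="http://localhost:8000/v1"):
    """Build an OpenAI-style chat-completions request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def send(req):
    """Send the request; requires the server from the commands above to be running."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

To use it, call `send(build_chat_request("..."))` with the server running; in the Continue extension itself you would instead point the model config at the same base URL.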

Learn more

For more information on the work behind Instinct, please refer to our blog.

Model details

Format: GGUF
Model size: 8B params
Architecture: qwen2
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit


Model tree for QuantFactory/instinct-GGUF

Base model: Qwen/Qwen2.5-7B (quantized)
