Tags: Text Generation · Transformers · GGUF · conversational


QuantFactory/instinct-GGUF

This is a quantized version of continuedev/instinct, created using llama.cpp.

Original Model Card

Instinct, the State-of-the-Art Open Next Edit Model

This repo contains the model weights for Continue's state-of-the-art open Next Edit model, Instinct. Robustly fine-tuned from Qwen2.5-Coder-7B on a dataset of real-world code edits, Instinct predicts your next edit to keep you in flow.

Serving the model

Ollama: We've released a Q4_K_M GGUF quantization of Instinct for efficient local inference. Try it with Continue's Ollama integration, or just run ollama run nate/instinct.

You can also serve the model using either of the options below, then connect it to Continue.

SGLang: python3 -m sglang.launch_server --model-path continuedev/instinct --load-format safetensors
vLLM: vllm serve continuedev/instinct --served-model-name instinct --load-format safetensors
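Both SGLang and vLLM expose an OpenAI-compatible HTTP endpoint, so once the server is running you can query it with a plain chat-completions request. A minimal sketch, assuming vLLM's default port 8000 and the `--served-model-name instinct` from the command above (the port and prompt are illustrative, not from the model card):

```python
import json
import urllib.request

def build_chat_request(prompt, model="instinct",
                       base_url="http://localhost:8000/v1"):
    """Build an OpenAI-style chat-completions request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def send(req):
    """Send the request; requires the server from the commands above to be running."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

To use it, call `send(build_chat_request("..."))` with the server running; in the Continue extension itself you would instead point the model config at the same base URL.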

Learn more

For more information on the work behind Instinct, please refer to our blog.

Model details

Format: GGUF
Model size: 8B params
Architecture: qwen2
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit


Model tree for QuantFactory/instinct-GGUF

Base model: Qwen/Qwen2.5-7B (quantized)
