Intel NPU Collection
Latest SOTA models supported on Intel NPU (6 items)
Run Llama-3.1-8B optimized for Intel NPUs with nexaSDK.

1. Install nexaSDK and create a free account at sdk.nexa.ai.
2. Activate your device with your access token:
   `nexa config set license '<access_token>'`
3. Run the model on the NPU in one line:
   `nexa infer NexaAI/llama-3.1-8B-intel-npu`
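The three steps above can be collected into a single script. This is a sketch using only the documented commands; `NEXA_TOKEN` is a hypothetical environment variable standing in for the access token from sdk.nexa.ai.

```shell
#!/bin/sh
# Quickstart sketch: activate once, then run the model on the Intel NPU.
# Assumes the `nexa` CLI from nexaSDK is on PATH; NEXA_TOKEN is a
# placeholder for your access token (assumption, not an official flow).
set -eu

MODEL="NexaAI/llama-3.1-8B-intel-npu"

if command -v nexa >/dev/null 2>&1; then
    nexa config set license "${NEXA_TOKEN:-}"   # one-time device activation
    nexa infer "$MODEL"                         # run on the Intel NPU
else
    echo "nexa CLI not found; install nexaSDK from sdk.nexa.ai"
fi
```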
Llama-3.1-8B is a mid-sized model in the Llama family, balancing strong reasoning and language understanding with efficient deployment. At 8B parameters, it offers significantly higher accuracy and fluency than smaller Llama models while remaining practical for fine-tuning and inference on modern accelerators.
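Back-of-envelope arithmetic makes the "practical to deploy" claim concrete: weight memory scales with parameter count times bytes per parameter. The sketch below uses 8e9 as a round approximation of the parameter count (the exact figure is slightly higher).

```python
# Approximate memory needed just to hold an 8B model's weights,
# at full 16-bit precision versus 4-bit quantization.

def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Weights-only memory footprint in GiB (ignores KV cache, activations)."""
    return n_params * bytes_per_param / 2**30

N = 8e9  # ~8 billion parameters (approximation)

fp16 = weight_memory_gib(N, 2.0)   # 16-bit floats: 2 bytes/param
int4 = weight_memory_gib(N, 0.5)   # 4-bit quantized: 0.5 bytes/param

print(f"FP16 weights: ~{fp16:.1f} GiB")  # → FP16 weights: ~14.9 GiB
print(f"INT4 weights: ~{int4:.1f} GiB")  # → INT4 weights: ~3.7 GiB
```

Quantizing to 4 bits is what brings a model of this size within reach of client NPUs with a few GiB of usable memory.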
Input: Text prompts—questions, instructions, or code snippets.
Output: Natural language responses including answers, explanations, structured outputs, or code.
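Under the hood, a runtime frames the text prompt with the model's chat template before inference. The sketch below follows Meta's published Llama 3.1 chat-template tokens; runtimes such as nexaSDK normally apply this formatting for you, so this is illustrative rather than something you need to do by hand.

```python
# Sketch of the Llama 3.1 chat prompt framing (Meta's published template).
# The special tokens mark message roles; generation continues after the
# final assistant header.

def format_llama31_prompt(user_msg: str,
                          system_msg: str = "You are a helpful assistant.") -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_prompt("Explain what an NPU is in one sentence.")
print(prompt)
```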