
Llama-3.1-8B

Run Llama-3.1-8B optimized for Intel NPUs with nexaSDK.

Quickstart

  1. Install nexaSDK and create a free account at sdk.nexa.ai

  2. Activate your device with your access token:

    nexa config set license '<access_token>'
    
  3. Run the model on NPU in one line:

    nexa infer NexaAI/llama-3.1-8B-intel-npu
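The three steps above can also be scripted. Below is a minimal sketch that shells out to the documented `nexa` commands from Python; the `NEXA_TOKEN` environment variable and the `run_llama_npu` helper are illustrative assumptions, not part of nexaSDK.

```python
"""Sketch: wrap the documented nexaSDK CLI quickstart in Python.

Assumptions (not part of nexaSDK): the `nexa` CLI is on PATH, and the
access token from sdk.nexa.ai is stored in the NEXA_TOKEN env variable.
"""
import os
import shutil
import subprocess

MODEL = "NexaAI/llama-3.1-8B-intel-npu"


def run_llama_npu() -> str:
    """Activate the device, then run the model on the NPU; returns a status string."""
    if shutil.which("nexa") is None:
        return "nexa CLI not found; install nexaSDK from sdk.nexa.ai"
    token = os.environ.get("NEXA_TOKEN")
    if not token:
        return "set NEXA_TOKEN to your sdk.nexa.ai access token"
    # Step 2: activate the device with the access token.
    subprocess.run(["nexa", "config", "set", "license", token], check=True)
    # Step 3: run the model on the Intel NPU in one line.
    subprocess.run(["nexa", "infer", MODEL], check=True)
    return "ok"
```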
    

Model Description

Llama-3.1-8B is a mid-sized model in the Llama 3.1 family, balancing strong reasoning and language understanding with efficient deployment.
At 8B parameters, it offers significantly higher accuracy and fluency than smaller Llama models while remaining practical for fine-tuning and inference on modern hardware, including the Intel NPUs this build targets.

Features

  • Balanced scale: 8B parameters provide a strong trade-off between performance and efficiency.
  • Instruction-tuned: Optimized for following prompts, Q&A, and detailed reasoning.
  • Multilingual capabilities: Broad support across global languages.
  • Developer-friendly: Available for fine-tuning, domain adaptation, and integration into custom applications.

Use Cases

  • Conversational AI and digital assistants requiring stronger reasoning.
  • Content generation, summarization, and analysis.
  • Coding help and structured problem solving.
  • Research and prototyping in environments where very large models are impractical.

Inputs and Outputs

Input: Text prompts such as questions, instructions, or code snippets.
Output: Natural language responses including answers, explanations, structured outputs, or code.
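Llama 3.1 instruct models expect prompts wrapped in Meta's chat template; most runtimes apply it automatically, but it is useful to see what the model actually receives. A minimal sketch follows; the helper name `format_llama31_prompt` is illustrative.

```python
# Sketch of Meta's Llama 3.1 chat template, built by hand for reference.
# Runtimes normally apply this automatically; the function name is illustrative.

def format_llama31_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user message in Llama 3.1 special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


print(format_llama31_prompt("You are a helpful assistant.", "Summarize Llama 3.1."))
```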

License

  • Licensed under the Meta Llama 3.1 Community License.
