Creativity ITI for LLaMA 3.1 8B Instruct (v2.0)

🔄 Major Update: Improved Training & Optimization

What's New

  • Corrected Training Method: Now extracts activations from complete code solutions (not just prompts)
  • New Optimal α: 0.1 (previously 0.4)
  • Efficient Design: Uses only the top 11 heads (previously 48) with similar performance
  • Better Signal: Trained on how the model perceives creativity in existing solutions

Key Improvements

Metric            Previous      Current
Alpha (α)         0.4           0.1
Active Heads      48            11
Training Method   Prompt-only   Full solutions

🚀 Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model with auto-ITI
model = AutoModelForCausalLM.from_pretrained(
    "syed-aliredha/llama-31-8b-creativity-iti-full",
    trust_remote_code=True,  # Enables automatic ITI
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "syed-aliredha/llama-31-8b-creativity-iti-full"
)

# Generate creative code (ITI automatically applied with α=0.1)
prompt = "Write a function to check if a number is prime"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

📊 Technical Details

Training Methodology

  1. Data: NeoCoder dataset with creativity labels
  2. Activations: Extracted from model processing complete solutions
  3. Labels: Based on whether a solution uses novel techniques relative to the human solutions
  4. Probes: Linear classifiers on each attention head
  5. Selection: Top 11 heads by AUC score
  6. Direction: Center-of-mass difference between creative and non-creative activations (see the sketch below)
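
A minimal sketch of steps 4-6 (per-head probes, AUC-based head selection, center-of-mass directions) is shown below. It assumes the per-head activations have already been collected into a NumPy array of shape (samples, layers, heads, head_dim); the function and variable names are illustrative, not the actual training script.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def select_heads_and_directions(head_acts, labels, top_k=11):
    # head_acts: (samples, layers, heads, head_dim) activations collected while
    # the model processes complete solutions; labels: 1 = creative, 0 = non-creative
    labels = np.asarray(labels)
    num_layers, num_heads = head_acts.shape[1], head_acts.shape[2]
    aucs, directions = {}, {}
    for layer in range(num_layers):
        for head in range(num_heads):
            X = head_acts[:, layer, head, :]
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, labels, test_size=0.2, random_state=0, stratify=labels)
            probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
            aucs[(layer, head)] = roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])
            # Center-of-mass direction: mean creative minus mean non-creative activation
            diff = X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)
            directions[(layer, head)] = diff / np.linalg.norm(diff)
    top_heads = sorted(aucs, key=aucs.get, reverse=True)[:top_k]
    return top_heads, {h: directions[h] for h in top_heads}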

Why Only 11 Heads?

  • Pareto principle: 80% of effect from 20% of heads
  • Reduces computational overhead significantly
  • Maintains creativity enhancement quality
  • Faster inference with minimal quality loss

📈 Performance

  • Steers only an efficient subset of the most predictive heads
  • ~4x faster intervention application
  • Maintains creativity enhancement effectiveness
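
Assuming the per-token intervention cost scales with the number of steered heads, the "~4x" figure follows directly from the head-count reduction:

heads_previous, heads_current = 48, 11
print(f"per-token intervention work reduced ~{heads_previous / heads_current:.1f}x")  # ~4.4x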

🔧 Custom Parameters

If you want to adjust parameters locally:

from huggingface_hub import hf_hub_download
import pickle

repo_id = "syed-aliredha/llama-31-8b-creativity-iti-full"

# Download the ITI components (selected head indices and per-head directions)
with open(hf_hub_download(repo_id, "iti_top_heads.pkl", repo_type="model"), "rb") as f:
    top_heads = pickle.load(f)
with open(hf_hub_download(repo_id, "iti_directions.pkl", repo_type="model"), "rb") as f:
    directions = pickle.load(f)

# Apply with a custom intervention strength
custom_alpha = 0.2  # your value; the model ships with 0.1
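
The trust_remote_code wrapper applies the intervention automatically; the sketch below shows one way to apply the downloaded components yourself with a custom alpha. It assumes top_heads is a list of (layer, head) pairs and directions maps each pair to a head_dim-sized vector; the hook logic is an illustrative assumption, not the packaged implementation.

import torch

def apply_iti(model, top_heads, directions, alpha=0.1, head_dim=128):
    # Shift each selected head's output by alpha * direction just before the
    # layer's output projection (o_proj), where per-head outputs are still separable.
    handles = []
    for layer, head in top_heads:
        vec = torch.as_tensor(directions[(layer, head)], dtype=torch.float32)

        def make_hook(head_idx, direction):
            def hook(module, args):
                hidden = args[0].clone()  # (batch, seq, num_heads * head_dim)
                start = head_idx * head_dim
                shift = alpha * direction.to(hidden.device, hidden.dtype)
                hidden[..., start:start + head_dim] += shift
                return (hidden,) + args[1:]
            return hook

        o_proj = model.model.layers[layer].self_attn.o_proj
        handles.append(o_proj.register_forward_pre_hook(make_hook(head, vec)))
    return handles

# Example: handles = apply_iti(model, top_heads, directions, alpha=custom_alpha)
# Call handle.remove() on each returned handle to disable the steering.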

📚 Citation

Based on: Li et al., "Inference-Time Intervention: Eliciting Truthful Answers from a Language Model" (NeurIPS 2023)

πŸ™ Acknowledgments

  • NSCC Singapore for compute resources
  • NeoCoder dataset creators
  • Meta AI for LLaMA 3.1