Unsloth Whisper Large V3 Turbo - Pruna 8bit Optimized

This model is a Pruna-optimized version of openai/whisper-large-v3-turbo with 8-bit quantization optimizations.

Optimizations Applied

  • Batcher Optimization: int8 enabled (whisper_s2t_int8: True)
  • Compiler: c_whisper
  • Batcher: whisper_s2t

Usage

Option 1: Standard Transformers (Recommended for most users)

from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

# Simple loading - no Pruna installation required
model = AutoModelForSpeechSeq2Seq.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-8bit")
processor = AutoProcessor.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-8bit")

# Transcribe a 16 kHz mono waveform (audio_array: 1-D float array)
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
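Whisper models process audio in 30-second windows, so longer recordings must be split before being handed to the processor. A minimal NumPy sketch of such chunking (the 30 s window and 16 kHz rate are Whisper's standard expectations; `chunk_waveform` is a hypothetical helper, not part of this repository):

```python
import numpy as np

def chunk_waveform(waveform, sample_rate=16000, window_s=30):
    """Split a 1-D waveform into consecutive windows of window_s seconds.

    The final chunk may be shorter; the Whisper processor pads it to 30 s.
    """
    window = sample_rate * window_s
    return [waveform[i:i + window] for i in range(0, len(waveform), window)]

# Example: 70 seconds of silence -> three chunks (30 s, 30 s, 10 s)
audio = np.zeros(70 * 16000, dtype=np.float32)
chunks = chunk_waveform(audio)
```

Each chunk can then be passed to the processor and `model.generate` individually, and the decoded texts concatenated.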

Option 2: With Pruna Optimization (Maximum Performance)

from pruna import smash, SmashConfig
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
import json

# Load model and processor
model = AutoModelForSpeechSeq2Seq.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-8bit")
processor = AutoProcessor.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-8bit")

# Load the smash_config.json shipped with this repository
with open("smash_config.json", "r") as f:
    config_dict = json.load(f)

# Recreate SmashConfig
smash_config = SmashConfig()
for key, value in config_dict.items():
    smash_config[key] = value

# Apply Pruna optimizations
smashed_model = smash(
    model=model,
    smash_config=smash_config
)

# Use the optimized model
result = smashed_model.inference(audio_input)
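To make the config round-trip above concrete, here is a self-contained sketch of the JSON load step using illustrative keys that mirror the optimizations listed earlier (`compiler: c_whisper`, `batcher: whisper_s2t`, `whisper_s2t_int8: True`); the actual smash_config.json in the repository is the source of truth, and its real keys may differ:

```python
import json
import os
import tempfile

# Illustrative config mirroring the optimizations this card lists;
# the repo's smash_config.json is authoritative.
config_dict = {
    "compiler": "c_whisper",
    "batcher": "whisper_s2t",
    "whisper_s2t_int8": True,
}

# Write and re-read the file the same way the snippet above does
path = os.path.join(tempfile.mkdtemp(), "smash_config.json")
with open(path, "w") as f:
    json.dump(config_dict, f)

with open(path) as f:
    loaded = json.load(f)
```

Each key/value pair from the loaded dict is then assigned onto a fresh `SmashConfig` via item assignment, as shown in the snippet above.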

Performance Benefits

  • Reduced memory usage from 8-bit weight quantization
  • Optimized inference pipeline with int8 batcher
  • Maintained audio transcription quality
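As a rough back-of-the-envelope check of the memory claim (assuming the 0.8B parameter count reported for this model, 2 bytes per parameter for F16 versus 1 byte for int8, and ignoring activations and runtime overhead):

```python
PARAMS = 0.8e9  # parameter count reported for this model

def weight_bytes(params, bytes_per_param):
    """Approximate weight memory only; activations and overhead ignored."""
    return params * bytes_per_param

fp16_gb = weight_bytes(PARAMS, 2) / 1e9  # ~1.6 GB
int8_gb = weight_bytes(PARAMS, 1) / 1e9  # ~0.8 GB
```

Actual memory use depends on batch size, sequence length, and which layers the quantization actually covers, so treat these as upper-level estimates for the weights alone.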

Base Model

This model is based on unsloth/whisper-large-v3-turbo, which itself is optimized from openai/whisper-large-v3-turbo. It retains all the capabilities of both base models while providing additional Pruna performance improvements.

Model Details

  • Format: Safetensors
  • Model size: 0.8B params
  • Tensor type: F16