---
license: mit
tags:
- text-classification
- cheese
- texture
- distilbert
- transformers
- fine-tuned
datasets:
- aslan-ng/cheese-text
metrics:
- accuracy
model-index:
- name: Cheese Texture Classifier (DistilBERT)
  results:
  - task:
      type: text-classification
      name: Cheese Texture Classification
    dataset:
      type: aslan-ng/cheese-text
      name: Cheese Text Dataset
    metrics:
    - type: accuracy
      value: 0.400
      name: Test Accuracy
---

# Cheese Texture Classifier (DistilBERT)

**Model Creator**: Rumi Loghmani (@rlogh)  
**Original Dataset**: aslan-ng/cheese-text (by Aslan Noorghasemi)

This model is a fine-tuned DistilBERT that classifies a cheese description into one of four texture classes.

## Model Description

- **Architecture**: DistilBERT-base-uncased fine-tuned for sequence classification
- **Task**: 4-class texture classification (hard, semi-hard, semi-soft, soft)
- **Input**: Cheese description text (up to 512 tokens)
- **Output**: 4-class probability distribution
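
To make the output format concrete, the sketch below turns four logits into a probability distribution and a predicted label. The logit values are made up for illustration (they are not real model output), and the label order is assumed to match the `class_names` list in the Usage section.

```python
import math

# Label order assumed to match the Usage section of this card.
CLASS_NAMES = ["hard", "semi-hard", "semi-soft", "soft"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits, not real model output.
example_logits = [0.2, 1.1, 0.4, -0.3]
probs = softmax(example_logits)
predicted_label = CLASS_NAMES[probs.index(max(probs))]

for name, p in zip(CLASS_NAMES, probs):
    print(f"{name:>10}: {p:.3f}")
print("Predicted texture:", predicted_label)
```

The predicted class is simply the index with the highest probability; the remaining probabilities indicate how close the other classes were.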

## Training Details

### Data
- **Dataset**: [aslan-ng/cheese-text](https://huggingface.co/datasets/aslan-ng/cheese-text) (original split: 100 samples)
- **Train/Val/Test Split**: 70/15/15 (stratified)
- **Text Source**: Cheese descriptions from the dataset
- **Labels**: Texture categories (hard, semi-hard, semi-soft, soft)
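
The splitting code itself is not shown in this card. As a rough, dependency-free sketch of what a stratified 70/15/15 split does, the function below splits each class separately so that every subset keeps roughly the same class proportions. The function name `stratified_split` and the balanced label list are hypothetical; in practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument is the usual choice.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=42):
    """Split sample indices into train/val/test, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_frac)
        n_test = round(len(indices) * test_frac)
        n_train = len(indices) - n_val - n_test
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Hypothetical labels: 100 samples, 25 per class (the real class counts may differ).
labels = ["hard", "semi-hard", "semi-soft", "soft"] * 25
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because rounding happens per class, the subset sizes can differ by a sample or two from an exact 70/15/15 split.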

### Preprocessing
- **Tokenization**: DistilBERT tokenizer with 512 max length
- **Padding**: Max length padding
- **Truncation**: Long descriptions truncated to 512 tokens
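
To illustrate what max-length padding and truncation do to a token sequence, here is a simplified sketch in plain Python. The helper name `pad_or_truncate` and the token IDs are hypothetical, and the padding ID of 0 is an assumption; the real tokenizer also handles special tokens during truncation, which this sketch ignores.

```python
def pad_or_truncate(token_ids, max_length=512, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    kept = token_ids[:max_length]          # truncation: drop tokens past the limit
    n_pad = max_length - len(kept)         # padding: fill up to the limit
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad  # 0 marks padding positions
    return input_ids, attention_mask


# Short (hypothetical) sequence, padded to a small max_length for readability.
short_ids, short_mask = pad_or_truncate([101, 2023, 2003, 102], max_length=8)
print(short_ids)
print(short_mask)

# Long (hypothetical) sequence of 600 tokens, truncated to 512.
long_ids, long_mask = pad_or_truncate(list(range(1, 601)))
print(len(long_ids), sum(long_mask))
```

Padding every input to 512 tokens keeps batch shapes fixed but spends compute on padding when descriptions are short.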

### Training Setup
- **Model**: distilbert-base-uncased
- **Epochs**: 10
- **Batch Size**: 8 (train/val)
- **Learning Rate**: 2e-5
- **Warmup Steps**: 10
- **Weight Decay**: 0.01
- **Optimizer**: AdamW
- **Scheduler**: Linear warmup + linear decay
- **Mixed Precision**: FP16 (if GPU available)
- **Seed**: 42 (for reproducibility)
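
As a sketch of how the linear warmup and linear decay schedule behaves with these settings, the function below computes the learning rate at a given optimizer step. The total step count is an assumption: it supposes about 70 training samples (70% of 100), no gradient accumulation, and a final partial batch that is kept. The function name `linear_schedule_lr` is hypothetical.

```python
import math


def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=10, total_steps=90):
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed step count: ~70 training samples, batch size 8, 10 epochs.
train_samples = 70
steps_per_epoch = math.ceil(train_samples / 8)
total_steps = steps_per_epoch * 10

for step in (0, 5, 10, 50, total_steps):
    lr = linear_schedule_lr(step, total_steps=total_steps)
    print(f"step {step:3d}: lr = {lr:.2e}")
```

Under these assumptions the 10 warmup steps cover roughly the first epoch, after which the rate decays linearly to zero by the last step.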

### Hardware/Compute
- **Training Device**: CPU
- **Training Time**: ~5-10 minutes on GPU
- **Model Size**: ~67M parameters
- **Memory Usage**: ~2-4GB GPU memory

## Performance

- **Test Accuracy**: 0.400 (6 of 15 test samples correct; chance level is roughly 0.25 for four near-balanced classes)
- **Test Loss**: 1.290

### Class-wise Performance
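
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| hard | 0.50 | 0.33 | 0.40 | 3 |
| semi-hard | 0.29 | 0.50 | 0.36 | 4 |
| semi-soft | 0.40 | 0.50 | 0.44 | 4 |
| soft | 1.00 | 0.25 | 0.40 | 4 |
| accuracy | | | 0.40 | 15 |
| macro avg | 0.55 | 0.40 | 0.40 | 15 |
| weighted avg | 0.55 | 0.40 | 0.40 | 15 |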
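
The sketch below shows how the per-class and averaged figures in the report above relate to each other. The integer counts are not published in this card; they are inferred from the rounded precision, recall, and support values, so treat them as an assumption.

```python
classes = ["hard", "semi-hard", "semi-soft", "soft"]
support = [3, 4, 4, 4]     # true samples per class (from the report)
predicted = [2, 7, 5, 1]   # inferred: how often each class was predicted
correct = [1, 2, 2, 1]     # inferred: correct predictions per class

precision = [c / p for c, p in zip(correct, predicted)]
recall = [c / s for c, s in zip(correct, support)]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]


def macro_avg(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_avg(values):
    """Mean over classes, weighted by class support."""
    return sum(v * s for v, s in zip(values, support)) / sum(support)


accuracy = sum(correct) / sum(support)

for name, p, r, f in zip(classes, precision, recall, f1):
    print(f"{name:>10}  P={p:.2f}  R={r:.2f}  F1={f:.2f}")
print(f"accuracy      = {accuracy:.2f}")
print(f"macro avg     P={macro_avg(precision):.2f}  F1={macro_avg(f1):.2f}")
print(f"weighted avg  P={weighted_avg(precision):.2f}  F1={weighted_avg(f1):.2f}")
```

With only 15 test samples, a single prediction changes accuracy by almost 7 percentage points, which is worth keeping in mind when reading these numbers.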


## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rlogh/cheese-texture-classifier-distilbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example prediction
text = "Feta is a crumbly, tangy Greek cheese with a salty bite and creamy undertones."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()

# Index-to-label mapping; the order must match the label ids used during training
class_names = ["hard", "semi-hard", "semi-soft", "soft"]
print(f"Predicted texture: {class_names[predicted_class]}")
```

## Class Definitions

- **Hard**: Firm, aged cheeses that are dense and can be grated (e.g., Parmesan, Cheddar)
- **Semi-hard**: Moderately firm cheeses with some flexibility (e.g., Gouda, Swiss)
- **Semi-soft**: Cheeses with some give but maintain shape (e.g., Mozzarella, Blue cheese)
- **Soft**: Creamy, spreadable cheeses (e.g., Brie, Camembert, Cottage cheese)

## Limitations and Ethics

### Limitations
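- **Small Dataset**: The full dataset has only 100 samples (about 70 for training under the 70/15/15 split), which limits generalization; the 15-sample test set also makes the reported metrics noisy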
- **Text Quality**: Performance depends on description quality and consistency
- **Subjective Labels**: Texture classification has inherent subjectivity
- **Domain Specific**: Only applicable to cheese texture classification
- **Language**: English-only model

### Ethical Considerations
- **Bias**: Model may reflect biases in the original dataset
- **Cultural Context**: Cheese descriptions may be culturally specific
- **Commercial Use**: Not intended for commercial cheese production decisions
- **Accuracy**: Should not be used for critical food safety applications

### Recommendations
- Use for educational/research purposes only
- Validate predictions with domain experts
- Consider cultural context when applying to different regions
- Retrain with larger, more diverse datasets for production use

## AI Usage Disclosure

This model was developed using:
- **Base Model**: DistilBERT (distilbert-base-uncased)
- **Training Framework**: Hugging Face Transformers
- **Fine-tuning**: Standard BERT fine-tuning techniques
- **AI Assistance**: An AI assistant was used as a collaborative partner throughout development, speeding up the coding workflow and providing guidance


## Citation

**Model Citation:**
```bibtex
@misc{rlogh/cheese-texture-classifier-distilbert,
  title={Cheese Texture Classifier (DistilBERT)},
  author={Rumi Loghmani},
  year={2024},
  url={https://huggingface.co/rlogh/cheese-texture-classifier-distilbert}
}
```

**Dataset Citation:**
```bibtex
@dataset{aslan-ng/cheese-text,
  title={Cheese Text Dataset},
  author={Aslan Noorghasemi},
  year={2024},
  url={https://huggingface.co/datasets/aslan-ng/cheese-text}
}
```

## License

MIT License - See LICENSE file for details.