---
license: mit
tags:
- vqvae
- image-generation
- unsupervised-learning
- pytorch
- imagenet
- generative-model
datasets:
- imagenet-200
library_name: pytorch
model-index:
- name: VQ-VAE-ImageNet200
results:
- task:
type: image-generation
name: Image Generation
dataset:
name: Tiny ImageNet (ImageNet-200)
type: image-classification
metrics:
- name: FID
type: frechet-inception-distance
value: 102.87
---
# VQ-VAE for Tiny ImageNet (ImageNet-200)
This repository contains a **Vector Quantized Variational Autoencoder (VQ-VAE)** trained on the Tiny ImageNet (ImageNet-200) dataset in PyTorch. It is part of a pipeline for image augmentation, representation learning, and unsupervised generative modeling.
---
## 🧠 Model Details
- **Model Type**: Vector Quantized Variational Autoencoder (VQ-VAE)
- **Dataset**: Tiny ImageNet (ImageNet-200)
- **Epochs**: 35
- **Latent Space**: Discrete codebook (vector quantization)
- **Input Size**: 64×64 RGB
- **Loss Function**: Mean Squared Error (MSE) + VQ commitment loss
- **Final Training Loss**: ~0.0292
- **FID Score**: ~102.87
- **Architecture**: 3-layer CNN Encoder & Decoder with quantization bottleneck (a minimal sketch follows below)
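
The exact architecture is defined in `models/vqvae/model.py` of the training repo (see Usage below). As an illustration of the components listed above, here is a minimal VQ-VAE sketch in PyTorch; the codebook size (512), embedding dimension (64), hidden width (128), and commitment weight (0.25) are assumed defaults, not necessarily the checkpoint's hyperparameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through gradient."""
    def __init__(self, num_embeddings=512, embedding_dim=64, beta=0.25):
        super().__init__()
        self.embedding = nn.Embedding(num_embeddings, embedding_dim)
        self.embedding.weight.data.uniform_(-1 / num_embeddings, 1 / num_embeddings)
        self.beta = beta  # commitment loss weight (assumed)

    def forward(self, z):
        # (B, C, H, W) -> (B*H*W, C) for a per-position codebook lookup
        z = z.permute(0, 2, 3, 1).contiguous()
        flat = z.view(-1, z.shape[-1])
        dists = (flat.pow(2).sum(1, keepdim=True)
                 - 2 * flat @ self.embedding.weight.t()
                 + self.embedding.weight.pow(2).sum(1))
        z_q = self.embedding(dists.argmin(1)).view(z.shape)
        # codebook loss pulls codes toward encoder outputs; commitment loss does the reverse
        vq_loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z_q.detach(), z)
        z_q = z + (z_q - z).detach()  # straight-through estimator
        return z_q.permute(0, 3, 1, 2).contiguous(), vq_loss

class VQVAE(nn.Module):
    def __init__(self, hidden=128, embedding_dim=64):
        super().__init__()
        # three stride-2 convolutions: 64x64 input -> 8x8 latent grid
        self.encoder = nn.Sequential(
            nn.Conv2d(3, hidden, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(hidden, embedding_dim, 4, 2, 1),
        )
        self.quantizer = VectorQuantizer(embedding_dim=embedding_dim)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(embedding_dim, hidden, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(hidden, 3, 4, 2, 1), nn.Sigmoid(),  # outputs in [0, 1]
        )

    def forward(self, x):
        z_q, vq_loss = self.quantizer(self.encoder(x))
        return self.decoder(z_q), vq_loss
```

A training step would then minimize `F.mse_loss(x_hat, x) + vq_loss`, matching the MSE + commitment objective listed above.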
---
## 📦 Files
- `generator.pt` — Trained VQ-VAE model weights
- `loss_curve.png` — Plot of training loss across 35 epochs
- `fid_score.json` — FID evaluation result on 1000 generated samples
- `fid_real/` — 1000 real Tiny ImageNet samples used for FID
- `fid_fake/` — 1000 VQ-VAE reconstructions used for FID (a sketch for recomputing the score follows this list)
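
The stored score can be re-derived from these two folders. Here is one way to do it, a sketch using the `torchmetrics` FID implementation (an assumption about tooling; `fid_score.json` may have been produced with a different FID implementation, which can shift the number slightly):

```python
import torch
from pathlib import Path
from PIL import Image
import torchvision.transforms as T
from torchmetrics.image.fid import FrechetInceptionDistance

def load_folder(folder):
    # Stack every image in a folder into one uint8 tensor of shape (N, 3, 64, 64)
    to_tensor = T.Compose([T.Resize((64, 64)), T.PILToTensor()])
    imgs = [to_tensor(Image.open(p).convert("RGB")) for p in sorted(Path(folder).iterdir())]
    return torch.stack(imgs)

fid = FrechetInceptionDistance(feature=2048)  # expects uint8 images by default
fid.update(load_folder("fid_real"), real=True)
fid.update(load_folder("fid_fake"), real=False)
print(f"FID: {fid.compute().item():.2f}")
```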
---
## 🔧 Usage
```python
import torch

# VQVAE is defined in the training repository (models/vqvae/model.py)
from models.vqvae.model import VQVAE

# Instantiate with the same hyperparameters used for training, then load weights
model = VQVAE()
model.load_state_dict(torch.load("generator.pt", map_location="cpu"))
model.eval()
```
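
To reconstruct an image with the loaded model, something like the following should work. This is a hypothetical example: the filename `example.jpg` is a placeholder, and it assumes `forward` returns a `(reconstruction, vq_loss)` pair as in the sketch above; the repo's actual `VQVAE` signature may differ.

```python
from PIL import Image
import torchvision.transforms as T

# Load and preprocess a single image to the model's 64x64 input size
transform = T.Compose([T.Resize((64, 64)), T.ToTensor()])
x = transform(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # (1, 3, 64, 64)

with torch.no_grad():
    x_hat, _ = model(x)  # assumed to return (reconstruction, vq_loss)

T.ToPILImage()(x_hat.squeeze(0).clamp(0, 1)).save("reconstruction.png")
```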