Create app.py
app.py
ADDED
@@ -0,0 +1,822 @@
"""
Smart Product Cataloger - Gradio App
Multimodal AI for E-commerce Product Analysis

Google Colab - https://colab.research.google.com/drive/1eFNaidx5TPEhXgzdY9hh7EhDcVZm4GMS?usp=sharing

This app analyzes product images and generates metadata for e-commerce
listings using CLIP and BLIP models.

OVERVIEW:
---------
This application combines zero-shot classification with visual question answering
to analyze product images for e-commerce. It uses:
- CLIP for zero-shot product category classification
- BLIP for image captioning and product description generation
- BLIP VQA for answering specific questions about product attributes

FEATURES:
---------
1. AI-powered product category classification using CLIP
2. Automatic product description generation using BLIP
3. Category-specific attribute extraction via visual Q&A
4. Upload and analyze your own product images
5. Professional e-commerce metadata generation

MODELS USED:
------------
- openai/clip-vit-base-patch32: Zero-shot image classification
- Salesforce/blip-image-captioning-base: Image captioning
- Salesforce/blip-vqa-base: Visual question answering

REQUIREMENTS:
-------------
- torch
- gradio
- transformers
- PIL (Pillow)
- requests

HOW TO RUN:
-----------
1. Install dependencies:
   pip install torch gradio transformers pillow requests

2. Run the application:
   python app.py

3. Open your browser and navigate to:
   http://localhost:7860

4. Follow the app instructions:
   - Click "Load Models" first (required)
   - Upload product images or use sample URLs
   - Get automatic category classification and metadata

USAGE EXAMPLES:
---------------
Product Categories Supported:
- "clothing" - shirts, dresses, pants, etc.
- "shoes" - sneakers, boots, dress shoes, etc.
- "electronics" - phones, laptops, gadgets, etc.
- "furniture" - chairs, tables, sofas, etc.
- "books" - novels, textbooks, magazines, etc.
- "toys" - games, dolls, educational toys, etc.

The app will automatically classify products and generate relevant
e-commerce metadata including descriptions and category-specific attributes.
"""

import warnings
from typing import Dict, List, Union

import gradio as gr
import requests
import torch
from PIL import Image
from transformers import (
    BlipForConditionalGeneration,
    BlipForQuestionAnswering,
    BlipProcessor,
    CLIPModel,
    CLIPProcessor,
    pipeline,
)

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")


class SmartProductCataloger:
    """
    Main class for analyzing product images and generating e-commerce metadata.

    This class integrates CLIP for classification and BLIP for captioning/VQA
    to create a complete product analysis pipeline for e-commerce applications.

    Attributes:
        device (str): Computing device ('cuda', 'mps', or 'cpu')
        dtype (torch.dtype): Data type for model optimization
        clip_model: CLIP model for zero-shot classification
        clip_processor: CLIP processor for input preprocessing
        blip_caption_model: BLIP model for image captioning
        blip_caption_processor: BLIP processor for captioning
        blip_vqa_model: BLIP model for visual question answering
        blip_vqa_processor: BLIP processor for VQA
        models_loaded (bool): Flag to track if models are loaded
    """

    def __init__(self):
        """Initialize the SmartProductCataloger with device setup and model placeholders."""
        # Automatically detect the best available device for AI computation
        self.device, self.dtype = self.setup_device()

        # Initialize model placeholders - models loaded separately for better UX
        self.clip_model = None  # CLIP classification model
        self.clip_processor = None  # CLIP input processor
        self.blip_caption_model = None  # BLIP captioning model
        self.blip_caption_processor = None  # BLIP captioning processor
        self.blip_vqa_model = None  # BLIP VQA model
        self.blip_vqa_processor = None  # BLIP VQA processor
        self.models_loaded = False  # Track model loading status

    def setup_device(self):
        """
        Setup the optimal computing device and data type for AI models.

        Priority order: CUDA GPU > Apple Silicon MPS > CPU
        Uses float16 for CUDA (memory efficiency) and float32 for others (stability).

        Returns:
            tuple: (device_name, torch_dtype) for model optimization
        """
        if torch.cuda.is_available():
            # NVIDIA GPU available - use CUDA with float16 for memory efficiency
            return "cuda", torch.float16
        elif torch.backends.mps.is_available():
            # Apple Silicon Mac - use Metal Performance Shaders with float32
            return "mps", torch.float32
        else:
            # Fallback to CPU with float32 for compatibility
            return "cpu", torch.float32

    def load_models(self):
        """
        Load all required AI models for product analysis.

        Downloads and initializes:
        1. CLIP for zero-shot product classification
        2. BLIP for image captioning and product descriptions
        3. BLIP VQA for answering specific product attribute questions

        Returns:
            str: Status message indicating success or failure
        """
        # Check if models are already loaded to avoid redundant loading
        if self.models_loaded:
            return "✅ Models already loaded!"

        try:
            print("📦 Loading models...")

            # Load CLIP model for zero-shot product classification
            # Model: openai/clip-vit-base-patch32 (versatile, well-trained model)
            print("📦 Loading CLIP model...")
            self.clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
            self.clip_processor = CLIPProcessor.from_pretrained(
                "openai/clip-vit-base-patch32"
            )

            # Load BLIP model for image captioning and product descriptions
            # Model: Salesforce/blip-image-captioning-base (specialized for descriptions)
            print("📦 Loading BLIP caption model...")
            self.blip_caption_model = BlipForConditionalGeneration.from_pretrained(
                "Salesforce/blip-image-captioning-base"
            )
            self.blip_caption_processor = BlipProcessor.from_pretrained(
                "Salesforce/blip-image-captioning-base"
            )

            # Load BLIP VQA model for answering specific product questions
            # Model: Salesforce/blip-vqa-base (specialized for visual Q&A)
            print("📦 Loading BLIP VQA model...")
            self.blip_vqa_model = BlipForQuestionAnswering.from_pretrained(
                "Salesforce/blip-vqa-base"
            )
            self.blip_vqa_processor = BlipProcessor.from_pretrained(
                "Salesforce/blip-vqa-base"
            )

            # Set models to evaluation mode for inference (disables dropout, etc.)
            self.blip_caption_model.eval()
            self.blip_vqa_model.eval()

            # Mark models as successfully loaded
            self.models_loaded = True
            return "✅ All models loaded successfully!"

        except Exception as e:
            return f"❌ Error loading models: {str(e)}"
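
    # Note (not in the original file): load_models() keeps every model on the CPU
    # even though __init__ detects self.device / self.dtype. A minimal sketch of
    # how the detected device could be used is shown below; this is an assumption
    # about one possible extension, and the inference methods would then also
    # need to move their tensors with `inputs.to(self.device)`:
    #
    #     self.clip_model = self.clip_model.to(self.device)
    #     self.blip_caption_model = self.blip_caption_model.to(self.device)
    #     self.blip_vqa_model = self.blip_vqa_model.to(self.device)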

    def load_image_from_url(self, url: str):
        """
        Load an image from a URL with error handling.

        Args:
            url (str): URL of the image to load

        Returns:
            PIL.Image or None: Loaded image in RGB format, or None if failed
        """
        try:
            # Use requests to fetch image data with streaming for efficiency
            response = requests.get(url, stream=True)
            response.raise_for_status()  # Raise exception for bad status codes

            # Create PIL Image from response and ensure RGB format
            image = Image.open(response.raw).convert("RGB")
            return image

        except Exception as e:
            print(f"❌ Error loading image: {e}")
            return None
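
    # Hedged usage sketch (not part of the original code): an image can be pulled
    # straight from one of the sample URLs listed at the bottom of the UI, e.g.
    #
    #     cataloger = SmartProductCataloger()
    #     img = cataloger.load_image_from_url(
    #         "https://images.unsplash.com/photo-1542291026-7eec264c27ff"
    #     )
    #     # img is a PIL.Image in RGB mode, or None if the download failed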

    def classify_product_image(self, image: Image.Image, candidate_labels: List[str]):
        """
        Classify product image using CLIP zero-shot classification.

        Args:
            image (PIL.Image): Product image to classify
            candidate_labels (List[str]): List of possible product categories

        Returns:
            List[Dict]: Classification results with labels and confidence scores
        """
        if not self.models_loaded:
            return [{"label": "error", "score": 0.0}]

        try:
            print("🔍 Classifying product category...")

            # Use our already-loaded CLIP model directly instead of pipeline
            # Process image and text labels through CLIP processor
            inputs = self.clip_processor(
                text=candidate_labels,  # List of category labels
                images=image,  # PIL Image
                return_tensors="pt",  # Return PyTorch tensors
                padding=True,  # Pad text inputs to same length
            )

            # Get predictions from CLIP model
            with torch.no_grad():  # Disable gradients for inference
                outputs = self.clip_model(**inputs)

            # Calculate probabilities using softmax on logits
            logits_per_image = (
                outputs.logits_per_image
            )  # Image-text similarity scores
            probs = torch.softmax(
                logits_per_image, dim=-1
            )  # Convert to probabilities

            # Format results to match pipeline output format
            results = []
            for i, label in enumerate(candidate_labels):
                results.append(
                    {
                        "label": label,
                        "score": float(probs[0][i]),  # Convert tensor to float
                    }
                )

            # Sort by confidence score (highest first) to match pipeline behavior
            results.sort(key=lambda x: x["score"], reverse=True)

            return results

        except Exception as e:
            print(f"❌ Classification error: {e}")
            return [{"label": "error", "score": 0.0}]
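
    # The block above reproduces by hand what the transformers `pipeline` helper
    # (imported above but unused) would do in a single call. A rough sketch of
    # that alternative, shown only for comparison and not part of the app:
    #
    #     classifier = pipeline(
    #         "zero-shot-image-classification",
    #         model="openai/clip-vit-base-patch32",
    #     )
    #     results = classifier(image, candidate_labels=candidate_labels)
    #
    # Loading the model once in load_models() and calling it directly avoids
    # re-creating the pipeline for every request.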

    def generate_product_caption(self, image: Image.Image):
        """
        Generate descriptive caption for product image using BLIP.

        Args:
            image (PIL.Image): Product image to describe

        Returns:
            str: Generated product description
        """
        if not self.models_loaded:
            return "❌ Models not loaded."

        try:
            print("📝 Generating product description...")

            # Process image through BLIP captioning processor
            inputs = self.blip_caption_processor(image, return_tensors="pt")

            # Generate caption using BLIP model with beam search for quality
            with torch.no_grad():  # Disable gradients for inference efficiency
                out = self.blip_caption_model.generate(
                    **inputs,
                    max_length=50,  # Maximum description length
                    num_beams=5,  # Beam search for better quality
                    early_stopping=True,  # Stop when end token is generated
                )

            # Decode generated tokens back to readable text
            caption = self.blip_caption_processor.decode(
                out[0], skip_special_tokens=True
            )
            return caption

        except Exception as e:
            return f"❌ Error generating caption: {str(e)}"

    def ask_about_product(self, image: Image.Image, question: str):
        """
        Answer specific questions about product using BLIP Visual Question Answering.

        Args:
            image (PIL.Image): Product image to analyze
            question (str): Question to ask about the product

        Returns:
            str: Answer to the question or error message
        """
        if not self.models_loaded:
            return "❌ Models not loaded."

        try:
            # Process both image and question together through BLIP VQA processor
            inputs = self.blip_vqa_processor(image, question, return_tensors="pt")

            # Generate answer using BLIP VQA model
            with torch.no_grad():  # Disable gradients for inference
                out = self.blip_vqa_model.generate(
                    **inputs,
                    max_length=20,  # Answers are typically short
                    num_beams=5,  # Beam search for better quality
                    early_stopping=True,  # Stop when end token is generated
                )

            # Decode generated tokens to get the final answer
            answer = self.blip_vqa_processor.decode(out[0], skip_special_tokens=True)
            return answer.strip()  # Remove extra whitespace

        except Exception as e:
            return f"❌ Error: {str(e)}"

    def get_category_questions(self, category: str):
        """
        Get relevant questions for specific product categories.

        Each category has tailored questions to extract the most useful
        e-commerce metadata and product attributes.

        Args:
            category (str): Product category name

        Returns:
            List[str]: List of relevant questions for the category
        """
        # Comprehensive mapping of categories to relevant e-commerce questions
        question_map = {
            "shoes": [
                "What color are these shoes?",
                "What type of shoes are these?",
                "What brand are these shoes?",
                "What material are these shoes made of?",
                "Are these sneakers?",
            ],
            "clothing": [
                "What color is this clothing?",
                "What type of clothing is this?",
                "What material is this clothing made of?",
                "What size is this clothing?",
                "Is this formal or casual wear?",
            ],
            "electronics": [
                "What type of device is this?",
                "What brand is this device?",
                "What color is this device?",
                "Is this a smartphone or tablet?",
                "Does this have a screen?",
            ],
            "furniture": [
                "What type of furniture is this?",
                "What color is this furniture?",
                "What material is this furniture made of?",
                "Is this indoor or outdoor furniture?",
                "How many people can use this?",
            ],
            "books": [
                "What type of book is this?",
                "What color is the book cover?",
                "Is this a hardcover or paperback?",
                "Does this book have text on the cover?",
                "Is this a fiction or non-fiction book?",
            ],
            "toys": [
                "What type of toy is this?",
                "What color is this toy?",
                "Is this toy for children or adults?",
                "What material is this toy made of?",
                "Is this an educational toy?",
            ],
        }

        # Return category-specific questions or default generic questions
        return question_map.get(
            category,
            [
                "What color is this?",
                "What type of item is this?",
                "What is this made of?",
            ],
        )

    def analyze_product_complete(self, image: Image.Image):
        """
        Complete end-to-end product analysis pipeline.

        This method combines all analysis steps:
        1. Classify product category using CLIP
        2. Generate product description using BLIP
        3. Ask category-specific questions using BLIP VQA
        4. Compile results into structured e-commerce metadata

        Args:
            image (PIL.Image): Product image to analyze

        Returns:
            Dict: Complete analysis results with category, description, and attributes
        """
        if not self.models_loaded:
            return {"error": "Models not loaded", "status": "failed"}

        if image is None:
            return {"error": "No image provided", "status": "failed"}

        try:
            print("🔍 Starting complete product analysis...")

            # Step 1: Classify product category using CLIP zero-shot classification
            product_categories = [
                "clothing",
                "shoes",
                "electronics",
                "furniture",
                "books",
                "toys",
            ]
            classification_results = self.classify_product_image(
                image, product_categories
            )

            if classification_results[0]["label"] == "error":
                return {"error": "Classification failed", "status": "failed"}

            top_category = classification_results[0]  # Highest confidence category

            # Step 2: Generate product description using BLIP captioning
            description = self.generate_product_caption(image)

            # Step 3: Get category-specific questions and ask them using VQA
            category = top_category["label"]
            questions = self.get_category_questions(category)

            # Ask each question and collect answers for product attributes
            qa_results = {}
            for question in questions:
                answer = self.ask_about_product(image, question)
                qa_results[question] = answer

            # Step 4: Compile everything into structured e-commerce metadata
            result = {
                "category": {"name": category, "confidence": top_category["score"]},
                "description": description,
                "attributes": qa_results,
                "all_categories": classification_results,  # Include all classification results
                "status": "success",
            }

            return result

        except Exception as e:
            return {"error": str(e), "status": "failed"}
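
    # Illustrative shape of the returned metadata (the values here are invented
    # purely for illustration, not produced by the models):
    #
    #     {
    #         "category": {"name": "shoes", "confidence": 0.91},
    #         "description": "a pair of red running shoes on a white background",
    #         "attributes": {"What color are these shoes?": "red", ...},
    #         "all_categories": [{"label": "shoes", "score": 0.91}, ...],
    #         "status": "success",
    #     }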


# Initialize the main product cataloger instance
# This creates a single instance used throughout the app
product_cataloger = SmartProductCataloger()

# Define Gradio interface wrapper functions
# These functions adapt the class methods for use with Gradio components


def load_models_interface():
    """
    Gradio interface wrapper for loading AI models.

    Returns:
        str: Status message from model loading process
    """
    return product_cataloger.load_models()


def analyze_upload_interface(image):
    """
    Gradio interface wrapper for analyzing directly uploaded product images.

    Args:
        image (PIL.Image or None): Image uploaded through Gradio interface

    Returns:
        tuple: (image, analysis_text, category_text, attributes_text) for Gradio outputs
    """
    # Validate image input from Gradio component
    if image is None:
        error_msg = "❌ Please upload a product image."
        return None, error_msg, error_msg, error_msg

    # Run complete analysis pipeline on the uploaded image
    result = product_cataloger.analyze_product_complete(image)

    if result.get("status") == "failed":
        error_msg = f"❌ Analysis failed: {result.get('error', 'Unknown error')}"
        return image, error_msg, error_msg, error_msg

    # Format results for display in Gradio interface
    # Main analysis summary
    analysis_text = f"""📋 PRODUCT ANALYSIS COMPLETE

📝 Description: {result['description']}

🏷️ Category: {result['category']['name']} (confidence: {result['category']['confidence']:.3f})

✅ Analysis Status: {result['status']}"""

    # Category classification results
    category_text = "🏷️ CATEGORY CLASSIFICATION\n\n"
    for cat in result["all_categories"]:
        category_text += f"• {cat['label']}: {cat['score']:.3f}\n"

    # Product attributes from VQA
    attributes_text = "🔍 PRODUCT ATTRIBUTES\n\n"
    for question, answer in result["attributes"].items():
        attributes_text += f"❓ {question}\n💡 {answer}\n\n"

    # Return the same image for display along with analysis results
    return image, analysis_text, category_text, attributes_text
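
# analyze_upload_interface above and analyze_url_interface below format their
# results identically. A shared helper could remove that duplication; a sketch
# only (the name `format_analysis_outputs` is hypothetical, not part of the app):
#
#     def format_analysis_outputs(result):
#         """Return (summary, categories, attributes) strings for a result dict."""
#         ...
#
# The two wrappers would then differ only in how they obtain the PIL image.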


def analyze_url_interface(url):
    """
    Gradio interface wrapper for analyzing product from URL.

    Args:
        url (str): Image URL from Gradio textbox

    Returns:
        tuple: (image, analysis_text, category_text, attributes_text) for Gradio outputs
    """
    # Validate URL input
    if not url or not url.strip():
        error_msg = "❌ Please provide an image URL."
        return None, error_msg, error_msg, error_msg

    # Load image from URL
    image = product_cataloger.load_image_from_url(url.strip())
    if image is None:
        error_msg = "❌ Failed to load image from URL. Please check the URL."
        return None, error_msg, error_msg, error_msg

    # Run complete analysis pipeline on the loaded image
    result = product_cataloger.analyze_product_complete(image)

    if result.get("status") == "failed":
        error_msg = f"❌ Analysis failed: {result.get('error', 'Unknown error')}"
        return image, error_msg, error_msg, error_msg

    # Format results for display in Gradio interface
    # Main analysis summary
    analysis_text = f"""📋 PRODUCT ANALYSIS COMPLETE

📝 Description: {result['description']}

🏷️ Category: {result['category']['name']} (confidence: {result['category']['confidence']:.3f})

✅ Analysis Status: {result['status']}"""

    # Category classification results
    category_text = "🏷️ CATEGORY CLASSIFICATION\n\n"
    for cat in result["all_categories"]:
        category_text += f"• {cat['label']}: {cat['score']:.3f}\n"

    # Product attributes from VQA
    attributes_text = "🔍 PRODUCT ATTRIBUTES\n\n"
    for question, answer in result["attributes"].items():
        attributes_text += f"❓ {question}\n💡 {answer}\n\n"

    return image, analysis_text, category_text, attributes_text


# Create Gradio interface using Blocks for custom layout
# gr.Blocks: Allows custom layout with rows, columns, and advanced components
# title: Sets the browser tab title for the web interface
with gr.Blocks(title="Smart Product Cataloger") as app:
    # gr.Markdown: Renders markdown text with formatting, emojis, and styling
    # Supports HTML-like formatting for headers, lists, bold text, etc.
    gr.Markdown(
        """
    # 🛍️ Smart Product Cataloger

    **Multimodal AI for E-commerce Product Analysis**

    This app analyzes product images and generates metadata for e-commerce listings
    using CLIP for classification and BLIP for captioning and visual question answering.

    ## 🚀 How to use:
    1. **Load Models** - Click to load the AI models (required first step)
    2. **Upload Image** - Upload a product image directly for analysis
    3. **URL Analysis** - Analyze products from image URLs
    """
    )

    # Model loading section
    # gr.Row: Creates horizontal layout container for organizing components side by side
    with gr.Row():
        # gr.Column: Creates vertical layout container within the row
        with gr.Column():
            # Markdown for section header with emoji and formatting
            gr.Markdown("### 📦 Step 1: Load Models")

            # gr.Button: Interactive button component
            # variant="primary": Makes button blue/prominent (primary action)
            # size="lg": Large button size for better visibility
            load_btn = gr.Button("🚀 Load Models", variant="primary", size="lg")

            # gr.Textbox: Text input/output component
            # label: Display label above the textbox
            # interactive=False: Makes textbox read-only (output only)
            load_status = gr.Textbox(label="Status", interactive=False)

            # Event handler: Connects button click to function
            # fn: Function to call when button is clicked
            # outputs: Which component(s) receive the function's return value
            load_btn.click(
                fn=load_models_interface,  # Function to execute
                outputs=load_status,  # Component to update with result
            )

    # Markdown horizontal rule for visual separation between sections
    gr.Markdown("---")

    # Direct image upload section
    with gr.Row():
        # Left column for image upload and controls
        # scale=1: Equal width columns (both columns take same space)
        with gr.Column(scale=1):
            gr.Markdown("### 📸 Step 2: Upload Product Image")

            # gr.Image for file upload functionality
            # When no image is provided, shows upload interface
            # label: Text shown above upload area
            # height: Fixed pixel height for consistent layout
            uploaded_image = gr.Image(label="Upload Product Image", height=400)

            # Primary action button for direct image analysis
            # variant="primary": Blue/prominent styling for main action
            upload_analyze_btn = gr.Button(
                "🔍 Analyze Uploaded Image", variant="primary"
            )

        # Right column for displaying the uploaded image
        with gr.Column(scale=1):
            # gr.Image: Component for displaying the uploaded image
            # label: Caption shown above image
            # height: Consistent sizing with upload area
            upload_image_display = gr.Image(label="Uploaded Image", height=400)

    # Upload analysis results section with three columns for different result types
    with gr.Row():
        # Column for main analysis summary
        with gr.Column(scale=1):
            # Multi-line textbox for displaying main analysis results
            # lines=8: Adequate height for analysis summary
            # interactive=False: Read-only output field
            upload_analysis_output = gr.Textbox(
                label="📋 Analysis Summary", lines=8, interactive=False
            )

        # Column for category classification results
        with gr.Column(scale=1):
            # Output textbox for category classification scores
            upload_category_output = gr.Textbox(
                label="🏷️ Category Classification", lines=8, interactive=False
            )

        # Column for product attributes from VQA
        with gr.Column(scale=1):
            # Output textbox for detailed product attributes
            upload_attributes_output = gr.Textbox(
                label="🔍 Product Attributes", lines=8, interactive=False
            )

    # Event handler for upload analyze button
    # inputs: Component whose value is passed to function
    # outputs: Components that receive function return values (order matters)
    upload_analyze_btn.click(
        fn=analyze_upload_interface,  # Function to call
        inputs=uploaded_image,  # Input component
        outputs=[
            upload_image_display,
            upload_analysis_output,
            upload_category_output,
            upload_attributes_output,
        ],  # Output components
    )

    # Visual separator between sections
    gr.Markdown("---")

    # URL analysis section for analyzing products from web URLs
    with gr.Row():
        # Left column for URL input
        with gr.Column(scale=1):
            gr.Markdown("### 🌐 Step 3: Analyze from URL")

            # gr.Textbox for URL input
            # label: Text shown above input field
            # placeholder: Hint text shown when field is empty
            # lines=1: Single line input for URLs
            url_input = gr.Textbox(
                label="Product Image URL",
                placeholder="https://example.com/product-image.jpg",
                lines=1,
            )

            # Secondary action button for URL analysis
            # variant="secondary": Gray/muted styling (less prominent than primary)
            url_analyze_btn = gr.Button("🌐 Analyze from URL", variant="secondary")

        # Right column for URL-loaded image display
        with gr.Column(scale=1):
            # Image component to show the loaded image from URL
            url_image_display = gr.Image(label="Loaded Image", height=400)

    # URL analysis results section with three columns for different result types
    with gr.Row():
        # Three columns for different types of analysis results
        with gr.Column(scale=1):
            # Main analysis results for URL-loaded image
            url_analysis_output = gr.Textbox(
                label="📋 Analysis Summary", lines=8, interactive=False
            )

        with gr.Column(scale=1):
            # Category classification for URL-loaded image
            url_category_output = gr.Textbox(
                label="🏷️ Category Classification", lines=8, interactive=False
            )

        with gr.Column(scale=1):
            # Product attributes for URL-loaded image
            url_attributes_output = gr.Textbox(
                label="🔍 Product Attributes", lines=8, interactive=False
            )

    # Event handler for URL analysis button
    # inputs: URL textbox component
    # outputs: All four components (image + three analysis results)
    url_analyze_btn.click(
        fn=analyze_url_interface,  # Function to execute
        inputs=url_input,  # Input component
        outputs=[
            url_image_display,
            url_analysis_output,
            url_category_output,
            url_attributes_output,
        ],  # Output components
    )

    # Final section with examples and usage tips
    # Triple-quoted string allows multi-line markdown content
    gr.Markdown(
        """
    ---
    ### 📝 Example Product Categories:
    - **Clothing**: shirts, dresses, pants, jackets
    - **Shoes**: sneakers, boots, dress shoes, sandals
    - **Electronics**: phones, laptops, headphones, tablets
    - **Furniture**: chairs, tables, sofas, desks
    - **Books**: novels, textbooks, magazines, comics
    - **Toys**: games, dolls, educational toys, puzzles

    ### 🔗 Sample Product URLs:
    - Shoes: https://images.unsplash.com/photo-1542291026-7eec264c27ff
    - Electronics: https://images.unsplash.com/photo-1511707171634-5f897ff02aa9
    - Clothing: https://images.unsplash.com/photo-1521572163474-6864f9cf17ab

    """
    )

if __name__ == "__main__":
    """
    Launch the Gradio app when script is run directly.

    Configuration:
    server_name="0.0.0.0": Allow access from any IP address
    server_port=7860: Use port 7860 (Gradio default)
    share=True: Create public Gradio link for sharing
    debug=True: Enable debug mode for development
    """
    app.launch(
        server_name="0.0.0.0",  # Listen on all network interfaces
        server_port=7860,  # Standard Gradio port
        share=True,  # Generate shareable public link
        debug=True,  # Enable debug logging
    )
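
# Hedged note (not part of the original file): the cataloger can also be used
# headlessly, without the Gradio UI, roughly as sketched below. The local file
# name "product.jpg" is hypothetical.
#
#     cataloger = SmartProductCataloger()
#     cataloger.load_models()
#     image = Image.open("product.jpg").convert("RGB")
#     print(cataloger.analyze_product_complete(image))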