harismlnaslm committed on
Commit 298d6c1 · 1 Parent(s): 4669d04

Initial commit: Textilindo AI Assistant for HF Spaces

.gitattributes ADDED
@@ -0,0 +1,11 @@
+ # Large model files
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.hdf5 filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -1,8 +1,3 @@
- # Virtual Environment
- venv/
- env/
- ENV/
-
  # Python
  __pycache__/
  *.py[cod]
@@ -24,63 +19,60 @@ wheels/
  *.egg-info/
  .installed.cfg
  *.egg
- MANIFEST
-
- # Model files (too large for git)
- models/
- *.bin
- *.safetensors
- *.ckpt
- *.pt
- *.pth

- # Data files
- data/*.jsonl
- data/*.json
- data/*.csv
- data/*.txt
-
- # Logs
- logs/
- *.log
- *.out
-
- # Environment variables
- .env
- .env.local
- .env.production

  # IDE
  .vscode/
  .idea/
  *.swp
  *.swo
- *~

  # OS
  .DS_Store
  Thumbs.db

- # Jupyter Notebook
- .ipynb_checkpoints
-
- # PyTorch
- *.pkl
- *.pickle

- # HuggingFace
  .cache/
- huggingface/
-
- # Docker
- .dockerignore

- # Temporary files
- tmp/
- temp/
- *.tmp
- *.temp

  # Python
  __pycache__/
  *.py[cod]
  *.egg-info/
  .installed.cfg
  *.egg

+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ .venv/

  # IDE
  .vscode/
  .idea/
  *.swp
  *.swo
+ *.sublime-*

  # OS
  .DS_Store
  Thumbs.db
+ *.tmp
+ *.temp

+ # Model files (use LFS for these)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ *.ckpt

+ # Cache
  .cache/
+ __pycache__/
+ transformers_cache/
+ .huggingface/

+ # Logs
+ *.log
+ logs/
+ wandb/

+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ .env.staging

+ # Jupyter
+ .ipynb_checkpoints/

+ # PyTorch
+ *.pth
+ *.pt

+ # Data files (use LFS for large ones)
+ data/*.jsonl
+ data/*.json
+ data/*.csv
+ data/*.parquet
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,153 @@
+ # 🚀 Hugging Face Spaces Deployment Guide
+
+ ## Quick Start
+
+ ### Option 1: Automated Setup (Recommended)
+ ```bash
+ # 1. Set up the repository
+ python setup_hf_space.py
+
+ # 2. Push to Hugging Face Spaces
+ python push_to_hf_space.py
+ ```
+
+ ### Option 2: Manual Setup
+ Follow the steps below for manual deployment.
+
+ ## 📋 Prerequisites
+
+ 1. **Git installed** on your system
+ 2. **Hugging Face account** (free)
+ 3. **Python 3.8+** installed
+ 4. **All project files** in the current directory
+
+ ## 🌐 Step 1: Create Hugging Face Space
+
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click **"Create new Space"**
+ 3. Fill in the details:
+    - **Name**: `textilindo-ai-assistant`
+    - **SDK**: **Docker**
+    - **Hardware**: **GPU Basic** (recommended) or **CPU Basic**
+    - **Visibility**: Public or Private
+ 4. Click **"Create Space"**
+ 5. **Copy the repository URL** (e.g., `https://huggingface.co/spaces/your-username/textilindo-ai-assistant`)
+
+ ## 🔧 Step 2: Set Up the Git Repository
+
+ ```bash
+ # Initialize git repository (if not already done)
+ git init
+
+ # Add your Hugging Face Space as remote
+ git remote add origin https://huggingface.co/spaces/your-username/textilindo-ai-assistant
+
+ # Verify remote
+ git remote -v
+ ```
+
+ ## 📁 Step 3: Prepare Files
+
+ The following files are already prepared for you:
+
+ - ✅ `README.md` - Space configuration
+ - ✅ `Dockerfile` - Optimized for HF Spaces
+ - ✅ `requirements.txt` - Fixed dependencies
+ - ✅ `app.py` - Main entry point
+ - ✅ `app_hf_spaces.py` - Web interface
+ - ✅ `health_check.py` - Health monitoring
+ - ✅ All scripts in `scripts/` directory
+
+ ## 🚀 Step 4: Deploy to Hugging Face Spaces
+
+ ```bash
+ # Add all files to git
+ git add .
+
+ # Commit changes
+ git commit -m "Initial commit: Textilindo AI Assistant"
+
+ # Push to Hugging Face Spaces
+ git push origin main
+ ```
+
+ ## ⏳ Step 5: Monitor Build
+
+ 1. Go to your Hugging Face Space
+ 2. Check the **"Logs"** tab
+ 3. Wait for the build to complete (5-10 minutes)
+ 4. The Space will automatically start when ready
+
+ ## 🎯 Step 6: Test Your Space
+
+ 1. **Visit your Space URL**: `https://huggingface.co/spaces/your-username/textilindo-ai-assistant`
+ 2. **Check Health**: Visit the `/health` endpoint
+ 3. **Test Interface**: Use the web interface to run scripts
+ 4. **Monitor Logs**: Check for any errors
+
+ ## ⚙️ Step 7: Configure Environment Variables (Optional)
+
+ In your Space settings, add:
+
+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (for model downloads)
+ - `NOVITA_API_KEY`: Your Novita AI API key (for external training)
+
+ ## 🔍 Troubleshooting
+
+ ### Build Failures
+ - Check the build logs in your Space
+ - Verify all files are present
+ - Ensure the Dockerfile is in the root directory
+
+ ### Runtime Errors
+ - Check the Space logs
+ - Verify environment variables
+ - Test individual scripts
+
+ ### Memory Issues
+ - Use GPU Basic or higher hardware
+ - Consider using smaller models
+ - Check resource usage in logs
+
+ ## 📊 Expected Results
+
+ After successful deployment:
+
+ ✅ **Space builds** without errors
+ ✅ **Web interface** accessible
+ ✅ **Health endpoint** returns healthy status
+ ✅ **All scripts** executable via interface
+ ✅ **Training process** can be initiated
+
+ ## 🎉 Success!
+
+ Your Textilindo AI Assistant is now deployed on Hugging Face Spaces!
+
+ ### Features Available:
+ - 🤖 **AI Model Training** with LoRA
+ - 📊 **Dataset Creation** and management
+ - 🧪 **Model Testing** and inference
+ - 🔗 **External Service** integration
+ - 📱 **Web Interface** for all operations
+
+ ### Next Steps:
+ 1. **Test the interface** with sample data
+ 2. **Train your first model** using the web interface
+ 3. **Share your Space** with others
+ 4. **Monitor performance** and logs
+
+ ## 📞 Support
+
+ If you encounter issues:
+ 1. Check the Space logs
+ 2. Verify all files are present
+ 3. Test locally with `python test_build.py`
+ 4. Review the troubleshooting section above
+
+ ## 🔄 Updates
+
+ To update your Space:
+ 1. Make changes to your local files
+ 2. Commit and push: `git push origin main`
+ 3. The Space will automatically rebuild
+ 4. Check build logs for any issues
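Step 6 above checks the `/health` endpoint. A minimal sketch of interpreting its JSON payload follows; the `status` field matches what the Flask health endpoint in this commit's removed `app.py` returns, but treat the exact field set as an assumption:

```python
import json

def is_healthy(payload: str) -> bool:
    """Return True when a /health JSON payload reports a healthy status."""
    data = json.loads(payload)
    return data.get("status") == "healthy"

# Payload shaped like the removed Flask app's response (assumed fields):
sample = '{"status": "healthy", "dataset_loaded": true, "dataset_size": 182}'
```

Calling `is_healthy(sample)` on the payload above returns `True`; a payload whose `status` is anything else yields `False`.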
Dockerfile CHANGED
@@ -1,31 +1,55 @@
- FROM python:3.9
-
- # Create user
- RUN useradd -m -u 1000 user

  # Set working directory
  WORKDIR /app

- # Set Gradio server name to bind to 0.0.0.0 for external access
- ENV GRADIO_SERVER_NAME="0.0.0.0"

  # Copy requirements first for better caching
- COPY --chown=user ./requirements.txt requirements.txt

- # Install dependencies
- RUN pip install --no-cache-dir --upgrade -r requirements.txt

  # Create necessary directories
- RUN mkdir -p /app/data /app/templates /app/configs

- # Copy application files
- COPY --chown=user . /app

- # Switch to user
- USER user

  # Expose port
  EXPOSE 7860

  # Run the application
- CMD ["python", "app_gradio.py"]

+ FROM python:3.10-slim

  # Set working directory
  WORKDIR /app

+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     git-lfs \
+     curl \
+     build-essential \
+     cmake \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     libsm6 \
+     libxext6 \
+     libxrender-dev \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*

+ # Initialize git lfs
+ RUN git lfs install

  # Copy requirements first for better caching
+ COPY requirements_fixed.txt requirements.txt

+ # Install Python dependencies with specific versions to avoid conflicts
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt

+ # Copy application files
+ COPY . .

  # Create necessary directories
+ RUN mkdir -p data configs models scripts logs

+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV TRANSFORMERS_CACHE=/app/.cache/transformers
+ ENV HF_HOME=/app/.cache/huggingface
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860

+ # Make scripts executable
+ RUN chmod +x scripts/*.py

  # Expose port
  EXPOSE 7860

+ # Health check
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/health || exit 1

  # Run the application
+ CMD ["python", "app.py"]
Dockerfile_hf_spaces ADDED
@@ -0,0 +1,55 @@
+ FROM python:3.10-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     git-lfs \
+     curl \
+     build-essential \
+     cmake \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     libsm6 \
+     libxext6 \
+     libxrender-dev \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Initialize git lfs
+ RUN git lfs install
+
+ # Copy requirements first for better caching
+ COPY requirements_fixed.txt requirements.txt
+
+ # Install Python dependencies with specific versions to avoid conflicts
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt
+
+ # Copy application files
+ COPY . .
+
+ # Create necessary directories
+ RUN mkdir -p data configs models scripts logs
+
+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV TRANSFORMERS_CACHE=/app/.cache/transformers
+ ENV HF_HOME=/app/.cache/huggingface
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860
+
+ # Make scripts executable
+ RUN chmod +x scripts/*.py
+
+ # Expose port
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/health || exit 1
+
+ # Run the application
+ CMD ["python", "app.py"]
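The `HEALTHCHECK` above shells out to `curl`. The same probe can be sketched in Python with only the standard library (a hypothetical helper, not part of this commit):

```python
import sys
from urllib.error import URLError
from urllib.request import urlopen

def probe(url: str = "http://localhost:7860/health") -> int:
    """Exit-code-style status: 0 when /health answers 200, 1 otherwise,
    mirroring `curl -f ... || exit 1` in the HEALTHCHECK instruction."""
    try:
        with urlopen(url, timeout=5) as resp:
            return 0 if resp.status == 200 else 1
    except (URLError, OSError):
        return 1

if __name__ == "__main__":
    sys.exit(probe())
```

Swapping this in would also drop the image's runtime dependency on `curl`, at the cost of an extra script to copy in.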
HF_SPACES_FIX_SUMMARY.md ADDED
@@ -0,0 +1,133 @@
+ # Hugging Face Spaces Fix Summary
+
+ ## 🚨 Problem
+ The Hugging Face Space build was failing due to dependency conflicts, specifically conflicting `huggingface-hub` version requirements between packages.
+
+ ## ✅ Solution
+ Created a comprehensive fix that resolves all dependency conflicts and provides a complete Hugging Face Spaces deployment.
+
+ ## 📁 Files Created/Modified
+
+ ### New Files:
+ 1. **`requirements_fixed.txt`** - Fixed dependency versions
+ 2. **`Dockerfile_hf_spaces`** - Optimized Dockerfile for HF Spaces
+ 3. **`app_hf_spaces.py`** - Main Gradio interface for HF Spaces
+ 4. **`app.py`** - Main entry point
+ 5. **`health_check.py`** - Health check endpoint
+ 6. **`deploy_to_hf_space.py`** - Deployment helper script
+ 7. **`test_build.py`** - Build verification script
+ 8. **`README_HF_SPACES.md`** - HF Spaces specific documentation
+
+ ### Modified Files:
+ 1. **`requirements.txt`** - Updated with compatible versions
+
+ ## 🔧 Key Changes
+
+ ### 1. Dependency Resolution
+ - **Fixed `huggingface-hub` version**: `>=0.16.4,<0.19.0`
+ - **Compatible tokenizers**: `>=0.14.0,<0.15.0`
+ - **Added missing dependencies**: `aiofiles`, `fastapi`, `uvicorn`, etc.
+
+ ### 2. Dockerfile Optimization
+ - **Base image**: `python:3.10-slim`
+ - **System dependencies**: Added all required packages
+ - **Script permissions**: Made all scripts executable
+ - **Health check**: Added health check endpoint
+ - **Environment variables**: Set proper paths and ports
+
+ ### 3. Application Structure
+ - **Main entry point**: `app.py` detects HF Space vs local
+ - **Gradio interface**: `app_hf_spaces.py` provides web UI
+ - **Health endpoint**: `/health` for monitoring
+ - **Script runner**: All scripts accessible via web interface
+
+ ## 🚀 Deployment Steps
+
+ ### 1. Prepare Repository
+ ```bash
+ # Run deployment preparation
+ python deploy_to_hf_space.py
+
+ # Test the build
+ python test_build.py
+ ```
+
+ ### 2. Create Hugging Face Space
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click "Create new Space"
+ 3. Choose **Docker** SDK
+ 4. Set hardware to **GPU Basic** or higher
+ 5. Connect your repository
+
+ ### 3. Configure Space
+ - **Dockerfile**: Use `Dockerfile_hf_spaces`
+ - **Requirements**: Use `requirements.txt` (already fixed)
+ - **Environment Variables** (optional):
+   - `HUGGINGFACE_TOKEN`: Your HF token
+   - `NOVITA_API_KEY`: Your Novita AI key
+
+ ### 4. Deploy
+ - Push your code to the repository
+ - The Space will automatically build
+ - Monitor build logs for any issues
+
+ ## 🎯 Features Available
+
+ ### Web Interface
+ - **Setup & Training Tab**: All training scripts
+ - **External Services Tab**: Novita AI integration
+ - **Scripts Info Tab**: List all available scripts
+
+ ### Available Scripts
+ 1. **`check_training_ready.py`** - Verify setup
+ 2. **`create_sample_dataset.py`** - Generate training data
+ 3. **`setup_textilindo_training.py`** - Download models
+ 4. **`train_textilindo_ai.py`** - Train the model
+ 5. **`test_textilindo_ai.py`** - Test the model
+ 6. **`test_novita_connection.py`** - Test external services
+
+ ### Health Monitoring
+ - **Endpoint**: `/health`
+ - **Status**: System health, script count, directories
+ - **Logs**: Available in Space logs
+
+ ## 🔍 Troubleshooting
+
+ ### Common Issues:
+ 1. **Build Failures**: Check dependency versions
+ 2. **Memory Issues**: Use GPU Basic or higher
+ 3. **Script Errors**: Check Space logs
+ 4. **Model Download**: Ensure HF token is set
+
+ ### Debug Steps:
+ 1. Check `/health` endpoint
+ 2. Review Space logs
+ 3. Test individual scripts
+ 4. Verify environment variables
+
+ ## 📊 Expected Results
+
+ After successful deployment:
+ - ✅ All dependencies installed without conflicts
+ - ✅ Web interface accessible at Space URL
+ - ✅ All scripts executable via interface
+ - ✅ Health check endpoint working
+ - ✅ Ready for AI model training
+
+ ## 🎉 Success Criteria
+
+ The fix is successful when:
+ 1. **Build completes** without dependency errors
+ 2. **Space starts** and shows web interface
+ 3. **Health check** returns healthy status
+ 4. **Scripts can be executed** via web interface
+ 5. **Training process** can be initiated
+
+ ## 📞 Support
+
+ If issues persist:
+ 1. Check Space build logs
+ 2. Verify all files are present
+ 3. Test locally with `test_build.py`
+ 4. Review dependency versions
+ 5. Check Hugging Face documentation
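The summary above says `app.py` "detects HF Space vs local". One common way to do this (a sketch under that assumption, not necessarily what this repository's `app.py` does) is to check for the `SPACE_ID` variable that Hugging Face Spaces injects into the container environment:

```python
import os

def running_in_hf_space() -> bool:
    """Hugging Face Spaces sets SPACE_ID (e.g. "user/space-name") in the
    container environment; on a local machine it is normally unset."""
    return os.getenv("SPACE_ID") is not None
```

An entry point could then branch on this flag, e.g. launching the Gradio UI from `app_hf_spaces.py` inside a Space and a plain local server otherwise.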
QUICK_DEPLOY.md ADDED
@@ -0,0 +1,60 @@
+ # 🚀 Quick Deploy to Hugging Face Spaces
+
+ ## Prerequisites
+ - Python 3.8+
+ - Git installed
+ - Hugging Face account
+
+ ## Step 1: Setup
+ ```bash
+ # Install requirements
+ pip install -r requirements.txt
+
+ # Run setup script
+ python setup_hf_space.py
+ ```
+
+ ## Step 2: Deploy
+ ```bash
+ # Option A: Automated deployment
+ python deploy_final.py
+
+ # Option B: Manual deployment
+ huggingface-cli login
+ huggingface-cli repo create textilindo-ai-assistant --type space --sdk gradio
+ ```
+
+ ## Step 3: Manual Upload (if automated fails)
+ 1. Go to https://huggingface.co/spaces/[your-username]/textilindo-ai-assistant
+ 2. Upload these files:
+    - `app.py`
+    - `requirements.txt`
+    - `README.md`
+    - `configs/system_prompt.md`
+    - `data/textilindo_training_data.jsonl`
+
+ ## Step 4: Test
+ - Wait for the build to complete (2-5 minutes)
+ - Test your application
+ - Share the link!
+
+ ## File Structure
+ ```
+ textilindo-ai-assistant/
+ ├── app.py                 # Main Gradio application
+ ├── requirements.txt       # Dependencies
+ ├── README.md              # Space configuration
+ ├── configs/
+ │   └── system_prompt.md   # System prompt
+ └── data/
+     └── textilindo_training_data.jsonl
+ ```
+
+ ## Troubleshooting
+ - **Build fails**: Check requirements.txt versions
+ - **App doesn't start**: Check app.py for errors
+ - **Data not loading**: Verify data files are uploaded
+ - **Memory issues**: Use a smaller dataset or optimize code
+
+ ## Support
+ See DEPLOYMENT_GUIDE.md for detailed instructions.
README.md CHANGED
@@ -6,50 +6,43 @@ colorTo: purple
  sdk: docker
  pinned: false
  license: mit
- app_port: 8080
  ---

  # Textilindo AI Assistant

- AI-powered customer service assistant for Textilindo textile company.

  ## Features

- - 🤖 **Smart AI Assistant**: Answers customer questions about products, shipping, and company policies
- - 📚 **Knowledge Base**: Uses 182+ training examples for context-aware responses
- - 🇮🇩 **Indonesian Language**: Responds in friendly Indonesian language
- - 🛍️ **Sales Focus**: Helps customers with product recommendations and ordering

- ## API Endpoints

- - `GET /` - API documentation
- - `GET /health` - Health check
- - `POST /chat` - Chat with AI
- - `GET /stats` - Dataset statistics

- ## Usage

- Send a POST request to `/chat` with your message:

- ```json
- {
-   "message": "dimana lokasi textilindo?",
-   "max_tokens": 300,
-   "temperature": 0.7
- }
- ```

- ## Dataset

- The assistant is trained on 182+ examples covering:
- - Company location and hours
- - Product information
- - Shipping and payment policies
- - Customer service scenarios

- ## Technology

- - **Backend**: Flask (Python)
- - **AI**: Hugging Face Transformers
- - **Data**: JSONL format with RAG (Retrieval-Augmented Generation)
- - **Deployment**: Hugging Face Spaces

  sdk: docker
  pinned: false
  license: mit
+ app_port: 7860
+ hardware: gpu-basic
  ---

  # Textilindo AI Assistant

+ AI Assistant for Textilindo with training and inference capabilities.

  ## Features

+ - 🤖 AI model training with LoRA
+ - 📊 Dataset creation and management
+ - 🧪 Model testing and inference
+ - 🔗 External service integration
+ - 📱 Web interface for all operations

+ ## Usage

+ 1. **Check Training Ready**: Verify all components are ready
+ 2. **Create Dataset**: Generate sample training data
+ 3. **Setup Training**: Download models and setup environment
+ 4. **Train Model**: Start the training process
+ 5. **Test Model**: Interact with the trained model

+ ## Hardware Requirements

+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher

+ ## Environment Variables

+ Set these in your Space settings:

+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (optional)
+ - `NOVITA_API_KEY`: Your Novita AI API key (optional)

+ ## Support

+ For issues and questions, check the logs and health endpoint.
README_HF_SPACES.md ADDED
@@ -0,0 +1,166 @@
+ # Textilindo AI Assistant - Hugging Face Spaces
+
+ This is the Hugging Face Spaces deployment version of the Textilindo AI Assistant.
+
+ ## 🚀 Quick Start
+
+ 1. **Fork this repository**
+ 2. **Create a new Hugging Face Space**
+ 3. **Use the following settings:**
+    - **SDK**: Docker
+    - **Hardware**: CPU Basic (or GPU if available)
+    - **Visibility**: Public or Private
+
+ ## 📁 File Structure
+
+ ```
+ textilindo-ai-inference/
+ ├── app.py                  # Main entry point
+ ├── app_hf_spaces.py        # HF Spaces specific app
+ ├── health_check.py         # Health check endpoint
+ ├── Dockerfile_hf_spaces    # Optimized Dockerfile for HF Spaces
+ ├── requirements_fixed.txt  # Fixed dependencies
+ ├── scripts/                # All training and utility scripts
+ │   ├── check_training_ready.py
+ │   ├── create_sample_dataset.py
+ │   ├── setup_textilindo_training.py
+ │   ├── train_textilindo_ai.py
+ │   ├── test_textilindo_ai.py
+ │   └── ... (all other scripts)
+ ├── configs/                # Configuration files
+ ├── data/                   # Training data
+ └── models/                 # Model storage
+ ```
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ Set these in your Hugging Face Space settings:
+
+ ```bash
+ # Optional: Hugging Face Hub token for model downloads
+ HUGGINGFACE_TOKEN=your_token_here
+
+ # Optional: Novita AI API key for external training
+ NOVITA_API_KEY=your_novita_key_here
+
+ # Python path
+ PYTHONPATH=/app
+ ```
+
+ ### Hardware Requirements
+
+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher
+
+ ## 🎯 Features
+
+ ### Available Scripts
+
+ The interface provides access to all training and utility scripts:
+
+ 1. **Setup & Training**
+    - `check_training_ready.py` - Verify all components are ready
+    - `setup_textilindo_training.py` - Download models and set up the environment
+    - `train_textilindo_ai.py` - Train the AI model with LoRA
+    - `create_sample_dataset.py` - Create sample training data
+
+ 2. **Testing & Inference**
+    - `test_textilindo_ai.py` - Test the trained model
+    - `inference_textilindo_ai.py` - Run inference with the model
+
+ 3. **External Services**
+    - `test_novita_connection.py` - Test Novita AI connection
+    - `novita_ai_setup.py` - Set up Novita AI integration
+
+ ## 🚀 Usage
+
+ 1. **Access the Space**: Visit your deployed Hugging Face Space
+ 2. **Check Status**: Use the "Check Training Ready" button to verify setup
+ 3. **Create Dataset**: Use "Create Sample Dataset" to generate training data
+ 4. **Setup Training**: Use "Setup Training" to download models
+ 5. **Train Model**: Use "Train Model" to start the training process
+ 6. **Test Model**: Use "Test Model" to interact with the trained model
+
+ ## 📊 Training Process
+
+ ### Step 1: Check Readiness
+ ```bash
+ python scripts/check_training_ready.py
+ ```
+
+ ### Step 2: Create Dataset
+ ```bash
+ python scripts/create_sample_dataset.py
+ ```
+
+ ### Step 3: Setup Training
+ ```bash
+ python scripts/setup_textilindo_training.py
+ ```
+
+ ### Step 4: Train Model
+ ```bash
+ python scripts/train_textilindo_ai.py
+ ```
+
+ ### Step 5: Test Model
+ ```bash
+ python scripts/test_textilindo_ai.py
+ ```
+
+ ## 🔍 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Dependency Conflicts**
+    - Use `requirements_fixed.txt` instead of `requirements.txt`
+    - The fixed version resolves huggingface-hub conflicts
+
+ 2. **Memory Issues**
+    - Use CPU Basic for inference only
+    - Use GPU Basic or higher for training
+    - Consider using smaller models for limited resources
+
+ 3. **Script Execution**
+    - All scripts are made executable in the Dockerfile
+    - Check the output logs for detailed error messages
+
+ 4. **Model Download**
+    - Ensure you have a valid HUGGINGFACE_TOKEN
+    - Some models may require authentication
+
+ ### Health Check
+
+ Visit the `/health` endpoint to check the application status:
+
+ ```bash
+ curl https://your-space-name.hf.space/health
+ ```
+
+ ## 📝 Notes
+
+ - **Training Time**: Training can take 1-3 hours depending on hardware
+ - **Storage**: Models and data are stored in the Space's persistent storage
+ - **Logs**: Check the Space logs for detailed execution information
+ - **Restart**: You may need to restart the Space after training
+
+ ## 🆘 Support
+
+ If you encounter issues:
+
+ 1. Check the Space logs
+ 2. Verify all environment variables are set
+ 3. Ensure you have sufficient hardware resources
+ 4. Check the health endpoint for system status
+
+ ## 🔄 Updates
+
+ To update the Space:
+
+ 1. Push changes to your repository
+ 2. The Space will automatically rebuild
+ 3. Check the build logs for any issues
+ 4. Restart if necessary
Textilindo-2 CHANGED
@@ -1 +1 @@
- Subproject commit 741eedf932ff96044ac7a399537c7a077316669a
+ Subproject commit 60664bc394261bea116661f76431aaafd5f5eaab
app.py CHANGED
@@ -1,285 +1,33 @@
  #!/usr/bin/env python3
  """
- Textilindo AI Assistant - Hugging Face Spaces
  """

- import gradio as gr
  import os
- import json
- import requests
- from difflib import SequenceMatcher
- import logging

- # Setup logging
- logging.basicConfig(level=logging.INFO)
- logger = logging.getLogger(__name__)
-
- def load_system_prompt(default_text):
-     """Load system prompt from configs/system_prompt.md if available"""
-     try:
-         base_dir = os.path.dirname(__file__)
-         md_path = os.path.join(base_dir, 'configs', 'system_prompt.md')
-         if not os.path.exists(md_path):
-             return default_text
-         with open(md_path, 'r', encoding='utf-8') as f:
-             content = f.read()
-         start = content.find('"""')
-         end = content.rfind('"""')
-         if start != -1 and end != -1 and end > start:
-             return content[start+3:end].strip()
-         lines = []
-         for line in content.splitlines():
-             if line.strip().startswith('#'):
-                 continue
-             lines.append(line)
-         cleaned = '\n'.join(lines).strip()
-         return cleaned or default_text
-     except Exception:
-         return default_text
-
- class TextilindoAI:
-     def __init__(self):
-         self.system_prompt = os.getenv(
-             'SYSTEM_PROMPT',
-             load_system_prompt("You are Textilindo AI Assistant. Be concise, helpful, and use Indonesian.")
-         )
-         self.dataset = self.load_all_datasets()
-
-     def load_all_datasets(self):
-         """Load all available datasets"""
-         dataset = []
-
-         # Try multiple possible data directory paths
-         possible_data_dirs = [
-             "data",
-             "./data",
-             "/app/data",
-             os.path.join(os.path.dirname(__file__), "data")
-         ]
-
-         data_dir = None
-         for dir_path in possible_data_dirs:
-             if os.path.exists(dir_path):
-                 data_dir = dir_path
-                 logger.info(f"Found data directory: {data_dir}")
-                 break
-
-         if not data_dir:
-             logger.warning("No data directory found in any of the expected locations")
-             return dataset
-
-         # Load all JSONL files
-         try:
-             for filename in os.listdir(data_dir):
-                 if filename.endswith('.jsonl'):
-                     filepath = os.path.join(data_dir, filename)
-                     try:
-                         with open(filepath, 'r', encoding='utf-8') as f:
-                             for line_num, line in enumerate(f, 1):
-                                 line = line.strip()
-                                 if line:
-                                     try:
-                                         data = json.loads(line)
-                                         dataset.append(data)
-                                     except json.JSONDecodeError as e:
-                                         logger.warning(f"Invalid JSON in (unknown) line {line_num}: {e}")
-                                         continue
-                         logger.info(f"Loaded (unknown): {len([d for d in dataset if d.get('instruction')])} examples")
-                     except Exception as e:
-                         logger.error(f"Error loading (unknown): {e}")
-         except Exception as e:
-             logger.error(f"Error reading data directory {data_dir}: {e}")
-
-         logger.info(f"Total examples loaded: {len(dataset)}")
-         return dataset
-
-     def find_relevant_context(self, user_query, top_k=3):
-         """Find most relevant examples from dataset"""
-         if not self.dataset:
-             return []
-
-         scores = []
-         for i, example in enumerate(self.dataset):
-             instruction = example.get('instruction', '').lower()
-             output = example.get('output', '').lower()
-             query = user_query.lower()
-
-             instruction_score = SequenceMatcher(None, query, instruction).ratio()
-             output_score = SequenceMatcher(None, query, output).ratio()
-             combined_score = (instruction_score * 0.7) + (output_score * 0.3)
-             scores.append((combined_score, i))
-
-         scores.sort(reverse=True)
-         relevant_examples = []
-
-         for score, idx in scores[:top_k]:
-             if score > 0.1:
-                 relevant_examples.append(self.dataset[idx])
-
-         return relevant_examples
-
-     def create_context_prompt(self, user_query, relevant_examples):
-         """Create a prompt with relevant context"""
-         if not relevant_examples:
-             return user_query
-
-         context_parts = []
-         context_parts.append("Berikut adalah beberapa contoh pertanyaan dan jawaban tentang Textilindo:")
-         context_parts.append("")
-
-         for i, example in enumerate(relevant_examples, 1):
-             instruction = example.get('instruction', '')
-             output = example.get('output', '')
-             context_parts.append(f"Contoh {i}:")
-             context_parts.append(f"Pertanyaan: {instruction}")
-             context_parts.append(f"Jawaban: {output}")
-             context_parts.append("")
-
-         context_parts.append("Berdasarkan contoh di atas, jawab pertanyaan berikut:")
-         context_parts.append(f"Pertanyaan: {user_query}")
-         context_parts.append("Jawaban:")
-
-         return "\n".join(context_parts)
-
-     def chat(self, message, max_tokens=300, temperature=0.7):
-         """Generate response using Hugging Face Spaces"""
-         relevant_examples = self.find_relevant_context(message, 3)
-
-         if relevant_examples:
-             enhanced_prompt = self.create_context_prompt(message, relevant_examples)
-             context_used = True
-         else:
-             enhanced_prompt = message
-             context_used = False
-
-         # For now, return a simple response
-         # In production, this would call your HF Space inference endpoint
-         response = f"Terima kasih atas pertanyaan Anda: {message}. Saya akan membantu Anda dengan informasi tentang Textilindo."
-
-         return {
-             "success": True,
-             "response": response,
-             "context_used": context_used,
-             "relevant_examples_count": len(relevant_examples)
-         }
-
- # Initialize AI
- ai = TextilindoAI()
-
- @app.route('/health', methods=['GET'])
- def health_check():
-     """Health check endpoint"""
-     return jsonify({
-         "status": "healthy",
-         "service": "Textilindo AI Assistant",
-         "dataset_loaded": len(ai.dataset) > 0,
-         "dataset_size": len(ai.dataset)
-     })
-
- @app.route('/chat', methods=['POST'])
- def chat():
-     """Main chat endpoint"""
-     try:
-         data = request.get_json()
-
-         if not data:
-             return jsonify({
-                 "success": False,
-                 "error": "No JSON data provided"
-             }), 400
-
-         message = data.get('message', '').strip()
-         if not message:
-             return jsonify({
-                 "success": False,
-                 "error": "Message is required"
-             }), 400
-
-         # Optional parameters
-         max_tokens = data.get('max_tokens', 300)
-         temperature = data.get('temperature', 0.7)
-
-         # Process chat
-         result = ai.chat(message, max_tokens, temperature)
-
-         if result["success"]:
-             return jsonify(result)
-         else:
-             return jsonify(result), 500
-
-     except Exception as e:
-         logger.error(f"Error in chat endpoint: {e}")
-         return jsonify({
-             "success": False,
-             "error": f"Internal server error: {str(e)}"
-         }), 500
-
- @app.route('/stats', methods=['GET'])
- def get_stats():
-     """Get dataset and system statistics"""
-     try:
-         topics = {}
-         for example in ai.dataset:
-             metadata = example.get('metadata', {})
- metadata = example.get('metadata', {})
224
- topic = metadata.get('topic', 'unknown')
225
- topics[topic] = topics.get(topic, 0) + 1
226
-
227
- return jsonify({
228
- "success": True,
229
- "dataset": {
230
- "total_examples": len(ai.dataset),
231
- "topics": topics,
232
- "topics_count": len(topics)
233
- },
234
- "system": {
235
- "api_version": "1.0.0",
236
- "status": "operational"
237
- }
238
- })
239
-
240
- except Exception as e:
241
- logger.error(f"Error in stats endpoint: {e}")
242
- return jsonify({
243
- "success": False,
244
- "error": f"Internal server error: {str(e)}"
245
- }), 500
246
-
247
- @app.route('/', methods=['GET'])
248
- def root():
249
- """API root endpoint with documentation"""
250
- return jsonify({
251
- "service": "Textilindo AI Assistant",
252
- "version": "1.0.0",
253
- "description": "AI-powered customer service for Textilindo",
254
- "endpoints": {
255
- "GET /": "API documentation (this endpoint)",
256
- "GET /health": "Health check",
257
- "POST /chat": "Chat with AI",
258
- "GET /stats": "Dataset and system statistics"
259
- },
260
- "usage": {
261
- "chat": {
262
- "method": "POST",
263
- "url": "/chat",
264
- "body": {
265
- "message": "string (required)",
266
- "max_tokens": "integer (optional, default: 300)",
267
- "temperature": "float (optional, default: 0.7)"
268
- }
269
- }
270
- },
271
- "dataset_size": len(ai.dataset)
272
- })
273
-
274
- if __name__ == '__main__':
275
- logger.info("Starting Textilindo AI Assistant...")
276
- logger.info(f"Dataset loaded: {len(ai.dataset)} examples")
277
-
278
- # For Hugging Face Spaces, use the PORT environment variable
279
- port = int(os.environ.get('PORT', 7860))
280
 
281
- app.run(
282
- debug=False,
283
- host='0.0.0.0',
284
- port=port
285
- )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  #!/usr/bin/env python3
  """
+ Main application entry point for Hugging Face Spaces
  """
  
  import os
+ import sys
+ from pathlib import Path
  
+ def main():
+     """Main entry point"""
+     print("🚀 Textilindo AI Assistant - Starting...")
+ 
+     # Check if we're in a Hugging Face Space
+     if os.getenv("SPACE_ID"):
+         print("🌐 Running in Hugging Face Space")
+         # Import and run the HF Spaces app
+         try:
+             from app_hf_spaces import main as run_hf_app
+             run_hf_app()
+         except ImportError as e:
+             print(f"❌ Error importing HF app: {e}")
+             # Fallback to health check
+             from health_check import app
+             app.run(host="0.0.0.0", port=7860, debug=False)
+     else:
+         print("💻 Running locally")
+         # Run the health check server
+         from health_check import app
+         app.run(host="0.0.0.0", port=7860, debug=True)
+ 
+ if __name__ == "__main__":
+     main()
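Reviewer note: the `SPACE_ID` check above is the only thing that distinguishes the two run modes, since Hugging Face Spaces injects that variable into the environment. A minimal sketch of the detection logic, factored into a testable helper (the name `detect_runtime` is illustrative, not part of this repo):

```python
import os

def detect_runtime(env):
    """Classify the runtime the way app.py does: SPACE_ID present
    means we are inside a Hugging Face Space, otherwise local."""
    return "space" if env.get("SPACE_ID") else "local"

# The SPACE_ID value below is hypothetical; only its presence matters.
print(detect_runtime({"SPACE_ID": "user/textilindo-ai-assistant"}))  # space
print(detect_runtime({}))                                            # local
print(detect_runtime(os.environ))
```

Factoring the check out this way lets the branch be unit-tested without patching `os.environ`.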
app_hf_spaces.py ADDED
@@ -0,0 +1,182 @@
+ #!/usr/bin/env python3
+ """
+ Hugging Face Spaces App for Textilindo AI Assistant
+ Main entry point that can run scripts and serve the application
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ import gradio as gr
+ from pathlib import Path
+ import logging
+ 
+ # Setup logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+ 
+ def run_script(script_name, *args):
+     """Run a script from the scripts directory"""
+     script_path = Path("scripts") / f"{script_name}.py"
+ 
+     if not script_path.exists():
+         return f"❌ Script not found: {script_path}"
+ 
+     try:
+         # Run the script
+         cmd = [sys.executable, str(script_path)] + list(args)
+         result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
+ 
+         if result.returncode == 0:
+             return f"✅ Script executed successfully:\n{result.stdout}"
+         else:
+             return f"❌ Script failed:\n{result.stderr}"
+ 
+     except subprocess.TimeoutExpired:
+         return "❌ Script timed out after 5 minutes"
+     except Exception as e:
+         return f"❌ Error running script: {e}"
+ 
+ def check_training_ready():
+     """Check if everything is ready for training"""
+     return run_script("check_training_ready")
+ 
+ def create_sample_dataset():
+     """Create a sample dataset"""
+     return run_script("create_sample_dataset")
+ 
+ def test_novita_connection():
+     """Test Novita AI connection"""
+     return run_script("test_novita_connection")
+ 
+ def setup_textilindo_training():
+     """Setup Textilindo training environment"""
+     return run_script("setup_textilindo_training")
+ 
+ def train_textilindo_ai():
+     """Train Textilindo AI model"""
+     return run_script("train_textilindo_ai")
+ 
+ def test_textilindo_ai():
+     """Test the trained Textilindo AI model"""
+     return run_script("test_textilindo_ai")
+ 
+ def list_available_scripts():
+     """List all available scripts"""
+     scripts_dir = Path("scripts")
+     if not scripts_dir.exists():
+         return "❌ Scripts directory not found"
+ 
+     scripts = []
+     for script_file in scripts_dir.glob("*.py"):
+         if script_file.name != "__init__.py":
+             scripts.append(f"📄 {script_file.name}")
+ 
+     if scripts:
+         return "📋 Available Scripts:\n" + "\n".join(scripts)
+     else:
+         return "❌ No scripts found"
+ 
+ def create_interface():
+     """Create the Gradio interface"""
+ 
+     with gr.Blocks(title="Textilindo AI Assistant - Script Runner") as interface:
+         gr.Markdown("""
+         # 🤖 Textilindo AI Assistant - Script Runner
+ 
+         This interface allows you to run various scripts for the Textilindo AI Assistant.
+         """)
+ 
+         with gr.Tab("Setup & Training"):
+             gr.Markdown("### Setup and Training Scripts")
+ 
+             with gr.Row():
+                 check_btn = gr.Button("🔍 Check Training Ready", variant="secondary")
+                 setup_btn = gr.Button("⚙️ Setup Training", variant="primary")
+                 train_btn = gr.Button("🚀 Train Model", variant="primary")
+ 
+             with gr.Row():
+                 dataset_btn = gr.Button("📊 Create Sample Dataset", variant="secondary")
+                 test_btn = gr.Button("🧪 Test Model", variant="secondary")
+ 
+         with gr.Tab("External Services"):
+             gr.Markdown("### External Service Integration")
+ 
+             with gr.Row():
+                 novita_btn = gr.Button("🔗 Test Novita AI Connection", variant="secondary")
+ 
+         with gr.Tab("Scripts Info"):
+             gr.Markdown("### Available Scripts")
+ 
+             with gr.Row():
+                 list_btn = gr.Button("📋 List All Scripts", variant="secondary")
+ 
+         # Output area
+         output = gr.Textbox(
+             label="Output",
+             lines=20,
+             max_lines=30,
+             show_copy_button=True
+         )
+ 
+         # Event handlers
+         check_btn.click(
+             check_training_ready,
+             outputs=output
+         )
+ 
+         setup_btn.click(
+             setup_textilindo_training,
+             outputs=output
+         )
+ 
+         train_btn.click(
+             train_textilindo_ai,
+             outputs=output
+         )
+ 
+         dataset_btn.click(
+             create_sample_dataset,
+             outputs=output
+         )
+ 
+         test_btn.click(
+             test_textilindo_ai,
+             outputs=output
+         )
+ 
+         novita_btn.click(
+             test_novita_connection,
+             outputs=output
+         )
+ 
+         list_btn.click(
+             list_available_scripts,
+             outputs=output
+         )
+ 
+     return interface
+ 
+ def main():
+     """Main function"""
+     print("🚀 Starting Textilindo AI Assistant - Hugging Face Spaces")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not Path("scripts").exists():
+         print("❌ Scripts directory not found. Please ensure you're in the correct directory.")
+         sys.exit(1)
+ 
+     # Create and launch the interface
+     interface = create_interface()
+ 
+     # Launch the interface
+     interface.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,
+         debug=False
+     )
+ 
+ if __name__ == "__main__":
+     main()
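The `run_script` helper above is the core of this interface: every button shells out to `scripts/<name>.py` with the current interpreter and folds the exit status, stdout, and stderr into a single string for the Gradio textbox. A stripped-down sketch of that pattern (`run_script_sketch` is an illustrative name; it takes interpreter arguments directly instead of resolving `scripts/<name>.py`):

```python
import subprocess
import sys

def run_script_sketch(*cmd_args, timeout=300):
    """Run a Python target with the current interpreter and fold the
    result into one status string, as app_hf_spaces.run_script does."""
    cmd = [sys.executable, *cmd_args]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "❌ timed out"
    if result.returncode == 0:
        return f"✅ ok:\n{result.stdout}"
    return f"❌ failed:\n{result.stderr}"

# Exercise with inline programs instead of real scripts/*.py files
print(run_script_sketch("-c", "print('hello')"))       # starts with "✅ ok:"
print(run_script_sketch("-c", "raise SystemExit(1)"))  # starts with "❌ failed:"
```

Returning a string for both success and failure (rather than raising) is what lets each Gradio handler wire straight into a single output textbox.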
deploy_final.py ADDED
@@ -0,0 +1,190 @@
+ #!/usr/bin/env python3
+ """
+ Final deployment script for Textilindo AI Assistant to Hugging Face Spaces
+ """
+ 
+ import os
+ import shutil
+ import subprocess
+ import sys
+ from pathlib import Path
+ 
+ def check_huggingface_cli():
+     """Check if huggingface-cli is available"""
+     try:
+         result = subprocess.run(["huggingface-cli", "--version"], capture_output=True, text=True)
+         if result.returncode == 0:
+             print("✅ Hugging Face CLI is available")
+             return True
+         else:
+             print("❌ Hugging Face CLI not found")
+             return False
+     except FileNotFoundError:
+         print("❌ Hugging Face CLI not found")
+         return False
+ 
+ def login_to_huggingface():
+     """Login to Hugging Face"""
+     print("🔐 Logging in to Hugging Face...")
+     try:
+         subprocess.run(["huggingface-cli", "login"], check=True)
+         print("✅ Successfully logged in to Hugging Face")
+         return True
+     except subprocess.CalledProcessError:
+         print("❌ Failed to login to Hugging Face")
+         return False
+ 
+ def create_space():
+     """Create a new Hugging Face Space"""
+     username = input("Enter your Hugging Face username: ").strip()
+     if not username:
+         print("❌ Username is required")
+         return None
+ 
+     space_repo = f"{username}/textilindo-ai-assistant"
+ 
+     print(f"🚀 Creating space: {space_repo}")
+     try:
+         subprocess.run([
+             "huggingface-cli", "repo", "create",
+             "textilindo-ai-assistant",
+             "--type", "space",
+             "--space_sdk", "gradio"
+         ], check=True)
+         print(f"✅ Space created successfully: https://huggingface.co/spaces/{space_repo}")
+         return space_repo
+     except subprocess.CalledProcessError:
+         print("❌ Failed to create space")
+         return None
+ 
+ def prepare_files():
+     """Prepare files for deployment"""
+     print("📁 Preparing files for deployment...")
+ 
+     # Check if all required files exist
+     required_files = [
+         "app.py",
+         "requirements.txt",
+         "README.md",
+         "configs/system_prompt.md",
+         "data/textilindo_training_data.jsonl"
+     ]
+ 
+     missing_files = [f for f in required_files if not os.path.exists(f)]
+     if missing_files:
+         print(f"❌ Missing required files: {missing_files}")
+         return False
+ 
+     print("✅ All required files are present")
+     return True
+ 
+ def deploy_files(space_repo):
+     """Deploy files to the space"""
+     print(f"📤 Deploying files to {space_repo}...")
+ 
+     # Clone the space repository
+     clone_url = f"https://huggingface.co/spaces/{space_repo}"
+     temp_dir = "temp_space"
+ 
+     try:
+         # Remove temp directory if it exists
+         if os.path.exists(temp_dir):
+             shutil.rmtree(temp_dir)
+ 
+         # Clone the repository
+         subprocess.run(["git", "clone", clone_url, temp_dir], check=True)
+         print("✅ Repository cloned successfully")
+ 
+         # Copy files to the cloned repository
+         files_to_copy = [
+             "app.py",
+             "requirements.txt",
+             "README.md",
+             "configs/",
+             "data/"
+         ]
+ 
+         for file in files_to_copy:
+             if os.path.exists(file):
+                 if os.path.isdir(file):
+                     # Copy directory
+                     subprocess.run(["cp", "-r", file, temp_dir], check=True)
+                 else:
+                     # Copy file
+                     subprocess.run(["cp", file, temp_dir], check=True)
+                 print(f"✅ Copied {file}")
+ 
+         # Change to the cloned directory
+         os.chdir(temp_dir)
+ 
+         # Add all files to git
+         subprocess.run(["git", "add", "."], check=True)
+ 
+         # Commit changes
+         subprocess.run(["git", "commit", "-m", "Initial deployment of Textilindo AI Assistant"], check=True)
+ 
+         # Push to the space
+         subprocess.run(["git", "push"], check=True)
+ 
+         print("✅ Files deployed successfully!")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Deployment failed: {e}")
+         return False
+     finally:
+         # Clean up: leave the clone before deleting it (rmtree on the cwd would fail)
+         if os.path.basename(os.getcwd()) == temp_dir:
+             os.chdir("..")
+         if os.path.exists(temp_dir):
+             shutil.rmtree(temp_dir)
+ 
+ def main():
+     print("🚀 Textilindo AI Assistant - Hugging Face Spaces Deployment")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not os.path.exists("app.py"):
+         print("❌ app.py not found. Please run this script from the project root directory.")
+         return
+ 
+     # Prepare files
+     if not prepare_files():
+         return
+ 
+     # Check Hugging Face CLI
+     if not check_huggingface_cli():
+         print("📦 Installing Hugging Face CLI...")
+         try:
+             subprocess.run([sys.executable, "-m", "pip", "install", "huggingface_hub"], check=True)
+             print("✅ Hugging Face CLI installed")
+         except subprocess.CalledProcessError:
+             print("❌ Failed to install Hugging Face CLI")
+             return
+ 
+     # Login to Hugging Face
+     if not login_to_huggingface():
+         return
+ 
+     # Create space
+     space_repo = create_space()
+     if not space_repo:
+         return
+ 
+     # Deploy files
+     if deploy_files(space_repo):
+         print("\n🎉 Deployment completed successfully!")
+         print(f"🌐 Your app is available at: https://huggingface.co/spaces/{space_repo}")
+         print("\n📋 Next steps:")
+         print("1. Wait for the space to build (usually takes 2-5 minutes)")
+         print("2. Test your application")
+         print("3. Share the link with others!")
+     else:
+         print("\n❌ Deployment failed. Please check the error messages above.")
+ 
+ if __name__ == "__main__":
+     main()
deploy_to_hf_space.py ADDED
@@ -0,0 +1,207 @@
+ #!/usr/bin/env python3
+ """
+ Deployment script for Hugging Face Spaces
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ from pathlib import Path
+ 
+ def check_requirements():
+     """Check if required tools are available"""
+     print("🔍 Checking requirements...")
+ 
+     # Check if git is available
+     try:
+         subprocess.run(["git", "--version"], capture_output=True, check=True)
+         print("✅ Git available")
+     except (subprocess.CalledProcessError, FileNotFoundError):
+         print("❌ Git not found. Please install git.")
+         return False
+ 
+     # Check if huggingface_hub is available
+     try:
+         import huggingface_hub
+         print("✅ Hugging Face Hub available")
+     except ImportError:
+         print("❌ Hugging Face Hub not found. Install with: pip install huggingface_hub")
+         return False
+ 
+     return True
+ 
+ def setup_git_lfs():
+     """Setup Git LFS for large files"""
+     print("📁 Setting up Git LFS...")
+     try:
+         subprocess.run(["git", "lfs", "install"], check=True)
+         print("✅ Git LFS installed")
+         return True
+     except subprocess.CalledProcessError:
+         print("❌ Failed to install Git LFS")
+         return False
+ 
+ def create_gitignore():
+     """Create .gitignore for the project"""
+     print("📝 Creating .gitignore...")
+ 
+     gitignore_content = """
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ 
+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ 
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ 
+ # OS
+ .DS_Store
+ Thumbs.db
+ 
+ # Model files (too large for git)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ 
+ # Cache
+ .cache/
+ __pycache__/
+ 
+ # Logs
+ *.log
+ logs/
+ 
+ # Temporary files
+ *.tmp
+ *.temp
+ 
+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ 
+ # Hugging Face
+ .huggingface/
+ transformers_cache/
+ """
+ 
+     with open(".gitignore", "w") as f:
+         f.write(gitignore_content.strip())
+ 
+     print("✅ .gitignore created")
+ 
+ def create_readme():
+     """Create README.md for the Space"""
+     print("📖 Creating README.md...")
+ 
+     readme_content = """---
+ title: Textilindo AI Assistant
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ app_port: 7860
+ ---
+ 
+ # Textilindo AI Assistant
+ 
+ AI Assistant for Textilindo with training and inference capabilities.
+ 
+ ## Features
+ 
+ - 🤖 AI model training with LoRA
+ - 📊 Dataset creation and management
+ - 🧪 Model testing and inference
+ - 🔗 External service integration
+ - 📱 Web interface for all operations
+ 
+ ## Usage
+ 
+ 1. **Check Training Ready**: Verify all components are ready
+ 2. **Create Dataset**: Generate sample training data
+ 3. **Setup Training**: Download models and setup environment
+ 4. **Train Model**: Start the training process
+ 5. **Test Model**: Interact with the trained model
+ 
+ ## Hardware Requirements
+ 
+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher
+ 
+ ## Environment Variables
+ 
+ Set these in your Space settings:
+ 
+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (optional)
+ - `NOVITA_API_KEY`: Your Novita AI API key (optional)
+ 
+ ## Support
+ 
+ For issues and questions, check the logs and health endpoint.
+ """
+ 
+     with open("README.md", "w") as f:
+         f.write(readme_content)
+ 
+     print("✅ README.md created")
+ 
+ def main():
+     """Main deployment function"""
+     print("🚀 Textilindo AI Assistant - Hugging Face Spaces Deployment")
+     print("=" * 60)
+ 
+     # Check requirements
+     if not check_requirements():
+         print("❌ Requirements not met. Please install missing tools.")
+         sys.exit(1)
+ 
+     # Setup Git LFS
+     if not setup_git_lfs():
+         print("❌ Failed to setup Git LFS")
+         sys.exit(1)
+ 
+     # Create necessary files
+     create_gitignore()
+     create_readme()
+ 
+     print("\n✅ Deployment preparation complete!")
+     print("\n📋 Next steps:")
+     print("1. Create a new Hugging Face Space")
+     print("2. Use Docker SDK")
+     print("3. Set hardware to GPU Basic or higher")
+     print("4. Push your code to the Space repository")
+     print("5. Set environment variables if needed")
+     print("\n🔗 Your Space will be available at: https://huggingface.co/spaces/your-username/your-space-name")
+ 
+ if __name__ == "__main__":
+     main()
health_check.py ADDED
@@ -0,0 +1,57 @@
+ #!/usr/bin/env python3
+ """
+ Health check endpoint for Hugging Face Spaces
+ """
+ 
+ from flask import Flask, jsonify
+ import sys
+ from pathlib import Path
+ 
+ app = Flask(__name__)
+ 
+ @app.route('/health')
+ def health_check():
+     """Health check endpoint"""
+     try:
+         # Check if required directories exist
+         required_dirs = ['scripts', 'configs', 'data']
+         missing_dirs = []
+ 
+         for dir_name in required_dirs:
+             if not Path(dir_name).exists():
+                 missing_dirs.append(dir_name)
+ 
+         # Check if scripts directory has files
+         scripts_dir = Path("scripts")
+         script_count = len(list(scripts_dir.glob("*.py"))) if scripts_dir.exists() else 0
+ 
+         status = {
+             "status": "healthy" if not missing_dirs else "degraded",
+             "missing_directories": missing_dirs,
+             "scripts_available": script_count,
+             "python_version": sys.version,
+             "working_directory": str(Path.cwd())
+         }
+ 
+         return jsonify(status)
+ 
+     except Exception as e:
+         return jsonify({
+             "status": "unhealthy",
+             "error": str(e)
+         }), 500
+ 
+ @app.route('/')
+ def root():
+     """Root endpoint"""
+     return jsonify({
+         "message": "Textilindo AI Assistant - Hugging Face Spaces",
+         "status": "running",
+         "endpoints": {
+             "health": "/health",
+             "app": "/app"
+         }
+     })
+ 
+ if __name__ == "__main__":
+     app.run(host="0.0.0.0", port=7860, debug=False)
push_to_hf_space.py ADDED
@@ -0,0 +1,269 @@
+ #!/usr/bin/env python3
+ """
+ Script to push Textilindo AI Assistant to Hugging Face Spaces
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ import shutil
+ from pathlib import Path
+ 
+ def check_git_status():
+     """Check git status and setup"""
+     print("🔍 Checking Git status...")
+ 
+     try:
+         # Check if we're in a git repository
+         result = subprocess.run(["git", "status"], capture_output=True, text=True)
+         if result.returncode != 0:
+             print("❌ Not in a git repository. Initializing...")
+             subprocess.run(["git", "init"], check=True)
+             print("✅ Git repository initialized")
+         else:
+             print("✅ Git repository found")
+ 
+         return True
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git error: {e}")
+         return False
+     except FileNotFoundError:
+         print("❌ Git not found. Please install git.")
+         return False
+ 
+ def setup_git_lfs():
+     """Setup Git LFS for large files"""
+     print("📁 Setting up Git LFS...")
+ 
+     try:
+         # Install Git LFS
+         subprocess.run(["git", "lfs", "install"], check=True)
+         print("✅ Git LFS installed")
+ 
+         # Create .gitattributes for LFS
+         gitattributes = """
+ # Large model files
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.hdf5 filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ """
+ 
+         with open(".gitattributes", "w") as f:
+             f.write(gitattributes.strip())
+ 
+         print("✅ .gitattributes created")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git LFS setup failed: {e}")
+         return False
+ 
+ def create_gitignore():
+     """Create comprehensive .gitignore"""
+     print("📝 Creating .gitignore...")
+ 
+     gitignore_content = """
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ 
+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ .venv/
+ 
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *.sublime-*
+ 
+ # OS
+ .DS_Store
+ Thumbs.db
+ *.tmp
+ *.temp
+ 
+ # Model files (use LFS for these)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ *.ckpt
+ 
+ # Cache
+ .cache/
+ __pycache__/
+ transformers_cache/
+ .huggingface/
+ 
+ # Logs
+ *.log
+ logs/
+ wandb/
+ 
+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ .env.staging
+ 
+ # Jupyter
+ .ipynb_checkpoints/
+ 
+ # PyTorch
+ *.pth
+ *.pt
+ 
+ # Data files (use LFS for large ones)
+ data/*.jsonl
+ data/*.json
+ data/*.csv
+ data/*.parquet
+ """
+ 
+     with open(".gitignore", "w") as f:
+         f.write(gitignore_content.strip())
+ 
+     print("✅ .gitignore created")
+ 
+ def prepare_dockerfile():
+     """Ensure we're using the correct Dockerfile"""
+     print("🐳 Preparing Dockerfile...")
+ 
+     # Copy the HF Spaces Dockerfile to the root
+     if Path("Dockerfile_hf_spaces").exists():
+         shutil.copy("Dockerfile_hf_spaces", "Dockerfile")
+         print("✅ Dockerfile prepared for HF Spaces")
+         return True
+     else:
+         print("❌ Dockerfile_hf_spaces not found")
+         return False
+ 
+ def create_space_config():
+     """Create Hugging Face Space configuration"""
+     print("⚙️ Creating Space configuration...")
+ 
+     # Create .huggingface directory
+     hf_dir = Path(".huggingface")
+     hf_dir.mkdir(exist_ok=True)
+ 
+     # Space configuration (Spaces read this metadata from the README.md frontmatter)
+     space_config = {
+         "title": "Textilindo AI Assistant",
+         "emoji": "🤖",
+         "colorFrom": "blue",
+         "colorTo": "purple",
+         "sdk": "docker",
+         "pinned": False,
+         "license": "mit",
+         "app_port": 7860,
+         "hardware": "gpu-basic"
+     }
+ 
+     # Nothing to write here: the README.md frontmatter carries this configuration
+     print("✅ Space configuration ready")
+ 
+ def commit_and_push():
+     """Commit and push to repository"""
+     print("📤 Committing and pushing...")
+ 
+     try:
+         # Add all files
+         subprocess.run(["git", "add", "."], check=True)
+         print("✅ Files staged")
+ 
+         # Commit
+         commit_message = "Initial commit: Textilindo AI Assistant for HF Spaces"
+         subprocess.run(["git", "commit", "-m", commit_message], check=True)
+         print("✅ Changes committed")
+ 
+         # Check if remote exists
+         result = subprocess.run(["git", "remote", "-v"], capture_output=True, text=True)
+         if not result.stdout.strip():
+             print("⚠️ No remote repository found.")
+             print("Please add your Hugging Face Space repository as remote:")
+             print("git remote add origin https://huggingface.co/spaces/your-username/your-space-name")
+             return False
+ 
+         # Push to remote
+         subprocess.run(["git", "push", "origin", "main"], check=True)
+         print("✅ Pushed to remote repository")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git operation failed: {e}")
+         return False
+ 
+ def main():
+     """Main deployment function"""
+     print("🚀 Textilindo AI Assistant - Push to Hugging Face Spaces")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not Path("scripts").exists():
+         print("❌ Scripts directory not found. Please run from the project root.")
+         sys.exit(1)
+ 
+     # Step 1: Check git status
+     if not check_git_status():
+         sys.exit(1)
+ 
+     # Step 2: Setup Git LFS
+     if not setup_git_lfs():
+         sys.exit(1)
+ 
+     # Step 3: Create .gitignore
+     create_gitignore()
+ 
+     # Step 4: Prepare Dockerfile
+     if not prepare_dockerfile():
+         sys.exit(1)
+ 
+     # Step 5: Create space configuration
+     create_space_config()
+ 
+     # Step 6: Commit and push
+     if commit_and_push():
+         print("\n🎉 Successfully pushed to Hugging Face Spaces!")
+         print("\n📋 Next steps:")
+         print("1. Go to your Hugging Face Space")
+         print("2. Check the build logs")
+         print("3. Set environment variables if needed")
+         print("4. Test the application")
+     else:
+         print("\n❌ Failed to push. Please check the errors above.")
+         print("\n💡 Manual steps:")
+         print("1. Add remote: git remote add origin <your-hf-space-url>")
+         print("2. Push: git push origin main")
+ 
+ if __name__ == "__main__":
+     main()
requirements.txt CHANGED
@@ -1,5 +1,42 @@
- gradio>=4.0.0
- requests>=2.31.0
- numpy>=1.24.0
- pandas>=2.0.0
- python-dotenv>=1.0.0
+ # Core ML/AI packages with compatible versions
+ torch==2.1.0
+ transformers==4.35.0
+ accelerate==0.24.0
+ peft==0.6.0
+ datasets==2.14.0
+ scikit-learn==1.3.0
+ 
+ # Hugging Face Hub - use compatible version
+ huggingface-hub>=0.16.4,<0.19.0
+ 
+ # Web framework
+ gradio==4.44.0
+ flask==3.0.0
+ flask-cors==4.0.0
+ 
+ # Utilities
+ requests==2.31.0
+ numpy==1.24.3
+ 
+ # Additional dependencies for Hugging Face Spaces
+ aiofiles>=22.0
+ fastapi>=0.100.0
+ uvicorn>=0.20.0
+ python-multipart>=0.0.9
+ 
+ # For tokenizers compatibility
+ tokenizers>=0.14.0,<0.15.0
+ 
+ # For datasets compatibility
+ pyarrow>=8.0.0
+ dill>=0.3.0
+ xxhash
+ multiprocess
+ 
+ # For accelerate
+ psutil
+ 
+ # For scikit-learn
+ scipy>=1.5.0
+ joblib>=1.1.1
+ threadpoolctl>=2.0.0
requirements_fixed.txt CHANGED
@@ -1,7 +1,42 @@
-flask>=2.3.0
-gunicorn>=21.0.0
-requests>=2.31.0
-openai>=1.0.0
-numpy>=1.24.0
-pandas>=2.0.0
-python-dotenv>=1.0.0
+# Core ML/AI packages with compatible versions
+torch==2.1.0
+transformers==4.35.0
+accelerate==0.24.0
+peft==0.6.0
+datasets==2.14.0
+scikit-learn==1.3.0
+
+# Hugging Face Hub - use compatible version
+huggingface-hub>=0.16.4,<0.19.0
+
+# Web framework
+gradio==4.44.0
+flask==3.0.0
+flask-cors==4.0.0
+
+# Utilities
+requests==2.31.0
+numpy==1.24.3
+
+# Additional dependencies for Hugging Face Spaces
+aiofiles>=22.0
+fastapi>=0.100.0
+uvicorn>=0.20.0
+python-multipart>=0.0.9
+
+# For tokenizers compatibility
+tokenizers>=0.14.0,<0.15.0
+
+# For datasets compatibility
+pyarrow>=8.0.0
+dill>=0.3.0
+xxhash
+multiprocess
+
+# For accelerate
+psutil
+
+# For scikit-learn
+scipy>=1.5.0
+joblib>=1.1.1
+threadpoolctl>=2.0.0
setup_hf_space.py ADDED
@@ -0,0 +1,92 @@
+#!/usr/bin/env python3
+"""
+Setup script for Hugging Face Spaces deployment
+"""
+
+import os
+import sys
+import subprocess
+from pathlib import Path
+
+def check_requirements():
+    """Check if all requirements are met"""
+    print("🔍 Checking requirements...")
+
+    # Check git
+    try:
+        subprocess.run(["git", "--version"], capture_output=True, check=True)
+        print("✅ Git available")
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        print("❌ Git not found. Please install git.")
+        return False
+
+    # Check if we're in the right directory
+    if not Path("scripts").exists():
+        print("❌ Scripts directory not found. Please run from project root.")
+        return False
+
+    print("✅ All requirements met")
+    return True
+
+def create_space_repository():
+    """Guide user to create HF Space repository"""
+    print("\n🌐 Creating Hugging Face Space Repository")
+    print("=" * 50)
+    print("1. Go to https://huggingface.co/spaces")
+    print("2. Click 'Create new Space'")
+    print("3. Fill in the details:")
+    print("   - Name: textilindo-ai-assistant")
+    print("   - SDK: Docker")
+    print("   - Hardware: GPU Basic")
+    print("   - Visibility: Public or Private")
+    print("4. Click 'Create Space'")
+    print("5. Copy the repository URL (e.g., https://huggingface.co/spaces/your-username/textilindo-ai-assistant)")
+
+    return input("\n📋 Enter your Hugging Face Space URL: ").strip()
+
+def setup_git_remote(space_url):
+    """Setup git remote for the space"""
+    print("\n🔗 Setting up git remote...")
+
+    try:
+        # Remove existing remote if any
+        subprocess.run(["git", "remote", "remove", "origin"], capture_output=True)
+
+        # Add new remote
+        subprocess.run(["git", "remote", "add", "origin", space_url], check=True)
+        print(f"✅ Remote added: {space_url}")
+        return True
+
+    except subprocess.CalledProcessError as e:
+        print(f"❌ Failed to add remote: {e}")
+        return False
+
+def main():
+    """Main setup function"""
+    print("🚀 Textilindo AI Assistant - Hugging Face Spaces Setup")
+    print("=" * 60)
+
+    # Check requirements
+    if not check_requirements():
+        sys.exit(1)
+
+    # Guide user to create space
+    space_url = create_space_repository()
+
+    if not space_url:
+        print("❌ No space URL provided. Exiting.")
+        sys.exit(1)
+
+    # Setup git remote
+    if not setup_git_remote(space_url):
+        sys.exit(1)
+
+    print("\n✅ Setup complete!")
+    print("\n📋 Next steps:")
+    print("1. Run: python push_to_hf_space.py")
+    print("2. Check your Hugging Face Space")
+    print("3. Monitor the build logs")
+    print("4. Test the application when ready")
+
+if __name__ == "__main__":
+    main()
test_build.py ADDED
@@ -0,0 +1,155 @@
+#!/usr/bin/env python3
+"""
+Test script to verify the build configuration
+"""
+
+import sys
+import subprocess
+from pathlib import Path
+
+def test_imports():
+    """Test if all required packages can be imported"""
+    print("🔍 Testing package imports...")
+
+    required_packages = [
+        "torch",
+        "transformers",
+        "accelerate",
+        "peft",
+        "datasets",
+        "gradio",
+        "flask",
+        "requests",
+        "numpy",
+        "sklearn"
+    ]
+
+    failed_imports = []
+
+    for package in required_packages:
+        try:
+            __import__(package)
+            print(f"✅ {package}")
+        except ImportError as e:
+            print(f"❌ {package}: {e}")
+            failed_imports.append(package)
+
+    return len(failed_imports) == 0
+
+def test_scripts():
+    """Test if scripts can be found and are executable"""
+    print("\n🔍 Testing scripts...")
+
+    scripts_dir = Path("scripts")
+    if not scripts_dir.exists():
+        print("❌ Scripts directory not found")
+        return False
+
+    required_scripts = [
+        "check_training_ready.py",
+        "create_sample_dataset.py",
+        "setup_textilindo_training.py",
+        "train_textilindo_ai.py",
+        "test_textilindo_ai.py"
+    ]
+
+    missing_scripts = []
+
+    for script in required_scripts:
+        script_path = scripts_dir / script
+        if script_path.exists():
+            print(f"✅ {script}")
+        else:
+            print(f"❌ {script} not found")
+            missing_scripts.append(script)
+
+    return len(missing_scripts) == 0
+
+def test_config_files():
+    """Test if configuration files exist"""
+    print("\n🔍 Testing configuration files...")
+
+    config_files = [
+        "requirements.txt",
+        "app.py",
+        "app_hf_spaces.py",
+        "health_check.py"
+    ]
+
+    missing_files = []
+
+    for config_file in config_files:
+        if Path(config_file).exists():
+            print(f"✅ {config_file}")
+        else:
+            print(f"❌ {config_file} not found")
+            missing_files.append(config_file)
+
+    return len(missing_files) == 0
+
+def test_script_execution():
+    """Test if a simple script can be executed"""
+    print("\n🔍 Testing script execution...")
+
+    try:
+        # Test the check_training_ready script
+        result = subprocess.run([
+            sys.executable, "scripts/check_training_ready.py"
+        ], capture_output=True, text=True, timeout=30)
+
+        if result.returncode == 0:
+            print("✅ Script execution successful")
+            return True
+        else:
+            print(f"❌ Script execution failed: {result.stderr}")
+            return False
+
+    except subprocess.TimeoutExpired:
+        print("❌ Script execution timed out")
+        return False
+    except Exception as e:
+        print(f"❌ Script execution error: {e}")
+        return False
+
+def main():
+    """Main test function"""
+    print("🧪 Textilindo AI Assistant - Build Test")
+    print("=" * 50)
+
+    tests = [
+        ("Package Imports", test_imports),
+        ("Scripts Availability", test_scripts),
+        ("Configuration Files", test_config_files),
+        ("Script Execution", test_script_execution)
+    ]
+
+    results = []
+
+    for test_name, test_func in tests:
+        print(f"\n📋 {test_name}")
+        print("-" * 30)
+        result = test_func()
+        results.append((test_name, result))
+
+    print("\n" + "=" * 50)
+    print("📊 Test Results:")
+
+    all_passed = True
+    for test_name, result in results:
+        status = "✅ PASS" if result else "❌ FAIL"
+        print(f"{status} {test_name}")
+        if not result:
+            all_passed = False
+
+    if all_passed:
+        print("\n🎉 All tests passed! Build configuration is ready.")
+        print("\n📋 Next steps:")
+        print("1. Push to Hugging Face Space")
+        print("2. Set environment variables")
+        print("3. Deploy and test")
+    else:
+        print("\n❌ Some tests failed. Please fix the issues above.")
+        sys.exit(1)
+
+if __name__ == "__main__":
+    main()
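One detail in the `test_imports` function above is worth noting: pip package names and Python import names can differ (`scikit-learn` from requirements.txt is imported as `sklearn`, which is why the list checks `sklearn`). A lighter-weight variant of the same check can use `importlib.util.find_spec`, which locates a module without executing it and so avoids paying the import cost of heavy packages like `torch`. This is a sketch, not part of the commit; the mapping dict and function name are illustrative:

```python
import importlib.util

# Map pip package names to import names where the two differ
# (e.g. "scikit-learn" installs a module named "sklearn").
PIP_TO_IMPORT = {"scikit-learn": "sklearn"}

def package_available(pip_name: str) -> bool:
    """Return True if the package's module can be located,
    without actually importing it."""
    module = PIP_TO_IMPORT.get(pip_name, pip_name.replace("-", "_"))
    return importlib.util.find_spec(module) is not None

print(package_available("os"))
print(package_available("no-such-package-xyz"))
```

The trade-off: `find_spec` confirms the package is installed but will not catch packages that are present yet fail at import time (e.g. a torch build with a missing CUDA library), which the `__import__`-based test does catch.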