harismlnaslm committed on
Commit 298d6c1 · 1 Parent(s): 4669d04

Initial commit: Textilindo AI Assistant for HF Spaces

.gitattributes ADDED
@@ -0,0 +1,11 @@
+ # Large model files
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.hdf5 filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -1,8 +1,3 @@
- # Virtual Environment
- venv/
- env/
- ENV/
-
  # Python
  __pycache__/
  *.py[cod]
@@ -24,63 +19,60 @@ wheels/
  *.egg-info/
  .installed.cfg
  *.egg
- MANIFEST
-
- # Model files (too large for git)
- models/
- *.bin
- *.safetensors
- *.ckpt
- *.pt
- *.pth

- # Data files
- data/*.jsonl
- data/*.json
- data/*.csv
- data/*.txt
-
- # Logs
- logs/
- *.log
- *.out
-
- # Environment variables
- .env
- .env.local
- .env.production

  # IDE
  .vscode/
  .idea/
  *.swp
  *.swo
- *~

  # OS
  .DS_Store
  Thumbs.db

- # Jupyter Notebook
- .ipynb_checkpoints
-
- # PyTorch
- *.pkl
- *.pickle

- # HuggingFace
  .cache/
- huggingface/
-
- # Docker
- .dockerignore

- # Temporary files
- tmp/
- temp/
- *.tmp
- *.temp

  # Python
  __pycache__/
  *.py[cod]
  *.egg-info/
  .installed.cfg
  *.egg

+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ .venv/

  # IDE
  .vscode/
  .idea/
  *.swp
  *.swo
+ *.sublime-*

  # OS
  .DS_Store
  Thumbs.db
+ *.tmp
+ *.temp

+ # Model files (use LFS for these)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ *.ckpt

+ # Cache
  .cache/
+ __pycache__/
+ transformers_cache/
+ .huggingface/

+ # Logs
+ *.log
+ logs/
+ wandb/

+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ .env.staging

+ # Jupyter
+ .ipynb_checkpoints/

+ # PyTorch
+ *.pth
+ *.pt

+ # Data files (use LFS for large ones)
+ data/*.jsonl
+ data/*.json
+ data/*.csv
+ data/*.parquet
DEPLOYMENT_GUIDE.md ADDED
@@ -0,0 +1,153 @@
+ # 🚀 Hugging Face Spaces Deployment Guide
+
+ ## Quick Start
+
+ ### Option 1: Automated Setup (Recommended)
+ ```bash
+ # 1. Set up the repository
+ python setup_hf_space.py
+
+ # 2. Push to Hugging Face Spaces
+ python push_to_hf_space.py
+ ```
+
+ ### Option 2: Manual Setup
+ Follow the steps below for manual deployment.
+
+ ## 📋 Prerequisites
+
+ 1. **Git installed** on your system
+ 2. **Hugging Face account** (free)
+ 3. **Python 3.8+** installed
+ 4. **All project files** in the current directory
+
+ ## 🌐 Step 1: Create Hugging Face Space
+
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click **"Create new Space"**
+ 3. Fill in the details:
+    - **Name**: `textilindo-ai-assistant`
+    - **SDK**: **Docker**
+    - **Hardware**: **GPU Basic** (recommended) or **CPU Basic**
+    - **Visibility**: Public or Private
+ 4. Click **"Create Space"**
+ 5. **Copy the repository URL** (e.g., `https://huggingface.co/spaces/your-username/textilindo-ai-assistant`)
+
+ ## 🔧 Step 2: Set Up the Git Repository
+
+ ```bash
+ # Initialize git repository (if not already done)
+ git init
+
+ # Add your Hugging Face Space as remote
+ git remote add origin https://huggingface.co/spaces/your-username/textilindo-ai-assistant
+
+ # Verify remote
+ git remote -v
+ ```
+
+ ## 📁 Step 3: Prepare Files
+
+ The following files are already prepared for you:
+
+ - ✅ `README.md` - Space configuration
+ - ✅ `Dockerfile` - Optimized for HF Spaces
+ - ✅ `requirements.txt` - Fixed dependencies
+ - ✅ `app.py` - Main entry point
+ - ✅ `app_hf_spaces.py` - Web interface
+ - ✅ `health_check.py` - Health monitoring
+ - ✅ All scripts in `scripts/` directory
+
+ ## 🚀 Step 4: Deploy to Hugging Face Spaces
+
+ ```bash
+ # Add all files to git
+ git add .
+
+ # Commit changes
+ git commit -m "Initial commit: Textilindo AI Assistant"
+
+ # Push to Hugging Face Spaces
+ git push origin main
+ ```
+
+ ## ⏳ Step 5: Monitor Build
+
+ 1. Go to your Hugging Face Space
+ 2. Check the **"Logs"** tab
+ 3. Wait for the build to complete (5-10 minutes)
+ 4. The Space will automatically start when ready
+
+ ## 🎯 Step 6: Test Your Space
+
+ 1. **Visit your Space URL**: `https://huggingface.co/spaces/your-username/textilindo-ai-assistant`
+ 2. **Check Health**: Visit the `/health` endpoint
+ 3. **Test Interface**: Use the web interface to run scripts
+ 4. **Monitor Logs**: Check for any errors
+
+ ## ⚙️ Step 7: Configure Environment Variables (Optional)
+
+ In your Space settings, add:
+
+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (for model downloads)
+ - `NOVITA_API_KEY`: Your Novita AI API key (for external training)
+
+ ## 🔍 Troubleshooting
+
+ ### Build Failures
+ - Check the build logs in your Space
+ - Verify all files are present
+ - Ensure the Dockerfile is in the root directory
+
+ ### Runtime Errors
+ - Check the Space logs
+ - Verify environment variables
+ - Test individual scripts
+
+ ### Memory Issues
+ - Use GPU Basic or higher hardware
+ - Consider using smaller models
+ - Check resource usage in logs
+
+ ## 📊 Expected Results
+
+ After successful deployment:
+
+ ✅ **Space builds** without errors
+ ✅ **Web interface** accessible
+ ✅ **Health endpoint** returns healthy status
+ ✅ **All scripts** executable via interface
+ ✅ **Training process** can be initiated
+
+ ## 🎉 Success!
+
+ Your Textilindo AI Assistant is now deployed on Hugging Face Spaces!
+
+ ### Features Available:
+ - 🤖 **AI Model Training** with LoRA
+ - 📊 **Dataset Creation** and management
+ - 🧪 **Model Testing** and inference
+ - 🔗 **External Service** integration
+ - 📱 **Web Interface** for all operations
+
+ ### Next Steps:
+ 1. **Test the interface** with sample data
+ 2. **Train your first model** using the web interface
+ 3. **Share your Space** with others
+ 4. **Monitor performance** and logs
+
+ ## 📞 Support
+
+ If you encounter issues:
+ 1. Check the Space logs
+ 2. Verify all files are present
+ 3. Test locally with `python test_build.py`
+ 4. Review the troubleshooting section above
+
+ ## 🔄 Updates
+
+ To update your Space:
+ 1. Make changes to your local files
+ 2. Commit and push: `git push origin main`
+ 3. The Space will automatically rebuild
+ 4. Check build logs for any issues
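Step 6 above checks the `/health` endpoint. A minimal sketch of interpreting its JSON payload follows; the `status` field matches what the Flask health endpoint in this commit's removed `app.py` returns, but treat the exact field set as an assumption:

```python
import json

def is_healthy(payload: str) -> bool:
    """Return True when a /health JSON payload reports a healthy status."""
    data = json.loads(payload)
    return data.get("status") == "healthy"

# Payload shaped like the removed Flask app's response (assumed fields):
sample = '{"status": "healthy", "dataset_loaded": true, "dataset_size": 182}'
```

Calling `is_healthy(sample)` on the payload above returns `True`; a payload whose `status` is anything else yields `False`.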
Dockerfile CHANGED
@@ -1,31 +1,55 @@
- FROM python:3.9
-
- # Create user
- RUN useradd -m -u 1000 user

  # Set working directory
  WORKDIR /app

- # Set Gradio server name to bind to 0.0.0.0 for external access
- ENV GRADIO_SERVER_NAME="0.0.0.0"

  # Copy requirements first for better caching
- COPY --chown=user ./requirements.txt requirements.txt

- # Install dependencies
- RUN pip install --no-cache-dir --upgrade -r requirements.txt

  # Create necessary directories
- RUN mkdir -p /app/data /app/templates /app/configs

- # Copy application files
- COPY --chown=user . /app

- # Switch to user
- USER user

  # Expose port
  EXPOSE 7860

  # Run the application
- CMD ["python", "app_gradio.py"]

+ FROM python:3.10-slim

  # Set working directory
  WORKDIR /app

+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     git-lfs \
+     curl \
+     build-essential \
+     cmake \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     libsm6 \
+     libxext6 \
+     libxrender-dev \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*

+ # Initialize git lfs
+ RUN git lfs install

  # Copy requirements first for better caching
+ COPY requirements_fixed.txt requirements.txt

+ # Install Python dependencies with specific versions to avoid conflicts
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt

+ # Copy application files
+ COPY . .

  # Create necessary directories
+ RUN mkdir -p data configs models scripts logs

+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV TRANSFORMERS_CACHE=/app/.cache/transformers
+ ENV HF_HOME=/app/.cache/huggingface
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860

+ # Make scripts executable
+ RUN chmod +x scripts/*.py

  # Expose port
  EXPOSE 7860

+ # Health check
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/health || exit 1

  # Run the application
+ CMD ["python", "app.py"]
Dockerfile_hf_spaces ADDED
@@ -0,0 +1,55 @@
+ FROM python:3.10-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     git \
+     git-lfs \
+     curl \
+     build-essential \
+     cmake \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     libsm6 \
+     libxext6 \
+     libxrender-dev \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Initialize git lfs
+ RUN git lfs install
+
+ # Copy requirements first for better caching
+ COPY requirements_fixed.txt requirements.txt
+
+ # Install Python dependencies with specific versions to avoid conflicts
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir -r requirements.txt
+
+ # Copy application files
+ COPY . .
+
+ # Create necessary directories
+ RUN mkdir -p data configs models scripts logs
+
+ # Set environment variables
+ ENV PYTHONPATH=/app
+ ENV TRANSFORMERS_CACHE=/app/.cache/transformers
+ ENV HF_HOME=/app/.cache/huggingface
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860
+
+ # Make scripts executable
+ RUN chmod +x scripts/*.py
+
+ # Expose port
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/health || exit 1
+
+ # Run the application
+ CMD ["python", "app.py"]
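The `HEALTHCHECK` above shells out to `curl`. The same probe can be sketched in Python with only the standard library (a hypothetical helper, not part of this commit):

```python
import sys
from urllib.error import URLError
from urllib.request import urlopen

def probe(url: str = "http://localhost:7860/health") -> int:
    """Exit-code-style status: 0 when /health answers 200, 1 otherwise,
    mirroring `curl -f ... || exit 1` in the HEALTHCHECK instruction."""
    try:
        with urlopen(url, timeout=5) as resp:
            return 0 if resp.status == 200 else 1
    except (URLError, OSError):
        return 1

if __name__ == "__main__":
    sys.exit(probe())
```

Swapping this in would also drop the image's runtime dependency on `curl`, at the cost of an extra script to copy in.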
HF_SPACES_FIX_SUMMARY.md ADDED
@@ -0,0 +1,133 @@
+ # Hugging Face Spaces Fix Summary
+
+ ## 🚨 Problem
+ The Hugging Face Space build was failing due to dependency conflicts, specifically conflicting `huggingface-hub` version requirements between packages.
+
+ ## ✅ Solution
+ Created a comprehensive fix that resolves all dependency conflicts and provides a complete Hugging Face Spaces deployment.
+
+ ## 📁 Files Created/Modified
+
+ ### New Files:
+ 1. **`requirements_fixed.txt`** - Fixed dependency versions
+ 2. **`Dockerfile_hf_spaces`** - Optimized Dockerfile for HF Spaces
+ 3. **`app_hf_spaces.py`** - Main Gradio interface for HF Spaces
+ 4. **`app.py`** - Main entry point
+ 5. **`health_check.py`** - Health check endpoint
+ 6. **`deploy_to_hf_space.py`** - Deployment helper script
+ 7. **`test_build.py`** - Build verification script
+ 8. **`README_HF_SPACES.md`** - HF Spaces specific documentation
+
+ ### Modified Files:
+ 1. **`requirements.txt`** - Updated with compatible versions
+
+ ## 🔧 Key Changes
+
+ ### 1. Dependency Resolution
+ - **Fixed `huggingface-hub` version**: `>=0.16.4,<0.19.0`
+ - **Compatible tokenizers**: `>=0.14.0,<0.15.0`
+ - **Added missing dependencies**: `aiofiles`, `fastapi`, `uvicorn`, etc.
+
+ ### 2. Dockerfile Optimization
+ - **Base image**: `python:3.10-slim`
+ - **System dependencies**: Added all required packages
+ - **Script permissions**: Made all scripts executable
+ - **Health check**: Added health check endpoint
+ - **Environment variables**: Set proper paths and ports
+
+ ### 3. Application Structure
+ - **Main entry point**: `app.py` detects HF Space vs local
+ - **Gradio interface**: `app_hf_spaces.py` provides web UI
+ - **Health endpoint**: `/health` for monitoring
+ - **Script runner**: All scripts accessible via web interface
+
+ ## 🚀 Deployment Steps
+
+ ### 1. Prepare Repository
+ ```bash
+ # Run deployment preparation
+ python deploy_to_hf_space.py
+
+ # Test the build
+ python test_build.py
+ ```
+
+ ### 2. Create Hugging Face Space
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click "Create new Space"
+ 3. Choose **Docker** SDK
+ 4. Set hardware to **GPU Basic** or higher
+ 5. Connect your repository
+
+ ### 3. Configure Space
+ - **Dockerfile**: Use `Dockerfile_hf_spaces`
+ - **Requirements**: Use `requirements.txt` (already fixed)
+ - **Environment Variables** (optional):
+   - `HUGGINGFACE_TOKEN`: Your HF token
+   - `NOVITA_API_KEY`: Your Novita AI key
+
+ ### 4. Deploy
+ - Push your code to the repository
+ - The Space will automatically build
+ - Monitor build logs for any issues
+
+ ## 🎯 Features Available
+
+ ### Web Interface
+ - **Setup & Training Tab**: All training scripts
+ - **External Services Tab**: Novita AI integration
+ - **Scripts Info Tab**: List all available scripts
+
+ ### Available Scripts
+ 1. **`check_training_ready.py`** - Verify setup
+ 2. **`create_sample_dataset.py`** - Generate training data
+ 3. **`setup_textilindo_training.py`** - Download models
+ 4. **`train_textilindo_ai.py`** - Train the model
+ 5. **`test_textilindo_ai.py`** - Test the model
+ 6. **`test_novita_connection.py`** - Test external services
+
+ ### Health Monitoring
+ - **Endpoint**: `/health`
+ - **Status**: System health, script count, directories
+ - **Logs**: Available in Space logs
+
+ ## 🔍 Troubleshooting
+
+ ### Common Issues:
+ 1. **Build Failures**: Check dependency versions
+ 2. **Memory Issues**: Use GPU Basic or higher
+ 3. **Script Errors**: Check Space logs
+ 4. **Model Download**: Ensure HF token is set
+
+ ### Debug Steps:
+ 1. Check `/health` endpoint
+ 2. Review Space logs
+ 3. Test individual scripts
+ 4. Verify environment variables
+
+ ## 📊 Expected Results
+
+ After successful deployment:
+ - ✅ All dependencies installed without conflicts
+ - ✅ Web interface accessible at Space URL
+ - ✅ All scripts executable via interface
+ - ✅ Health check endpoint working
+ - ✅ Ready for AI model training
+
+ ## 🎉 Success Criteria
+
+ The fix is successful when:
+ 1. **Build completes** without dependency errors
+ 2. **Space starts** and shows web interface
+ 3. **Health check** returns healthy status
+ 4. **Scripts can be executed** via web interface
+ 5. **Training process** can be initiated
+
+ ## 📞 Support
+
+ If issues persist:
+ 1. Check Space build logs
+ 2. Verify all files are present
+ 3. Test locally with `test_build.py`
+ 4. Review dependency versions
+ 5. Check Hugging Face documentation
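The summary above says `app.py` "detects HF Space vs local". One common way to do this (a sketch under that assumption, not necessarily what this repository's `app.py` does) is to check for the `SPACE_ID` variable that Hugging Face Spaces injects into the container environment:

```python
import os

def running_in_hf_space() -> bool:
    """Hugging Face Spaces sets SPACE_ID (e.g. "user/space-name") in the
    container environment; on a local machine it is normally unset."""
    return os.getenv("SPACE_ID") is not None
```

An entry point could then branch on this flag, e.g. launching the Gradio UI from `app_hf_spaces.py` inside a Space and a plain local server otherwise.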
QUICK_DEPLOY.md ADDED
@@ -0,0 +1,60 @@
+ # 🚀 Quick Deploy to Hugging Face Spaces
+
+ ## Prerequisites
+ - Python 3.8+
+ - Git installed
+ - Hugging Face account
+
+ ## Step 1: Setup
+ ```bash
+ # Install requirements
+ pip install -r requirements.txt
+
+ # Run setup script
+ python setup_hf_space.py
+ ```
+
+ ## Step 2: Deploy
+ ```bash
+ # Option A: Automated deployment
+ python deploy_final.py
+
+ # Option B: Manual deployment
+ huggingface-cli login
+ huggingface-cli repo create textilindo-ai-assistant --type space --sdk gradio
+ ```
+
+ ## Step 3: Manual Upload (if automated fails)
+ 1. Go to https://huggingface.co/spaces/[your-username]/textilindo-ai-assistant
+ 2. Upload these files:
+    - `app.py`
+    - `requirements.txt`
+    - `README.md`
+    - `configs/system_prompt.md`
+    - `data/textilindo_training_data.jsonl`
+
+ ## Step 4: Test
+ - Wait for the build to complete (2-5 minutes)
+ - Test your application
+ - Share the link!
+
+ ## File Structure
+ ```
+ textilindo-ai-assistant/
+ ├── app.py                 # Main Gradio application
+ ├── requirements.txt       # Dependencies
+ ├── README.md              # Space configuration
+ ├── configs/
+ │   └── system_prompt.md   # System prompt
+ └── data/
+     └── textilindo_training_data.jsonl
+ ```
+
+ ## Troubleshooting
+ - **Build fails**: Check requirements.txt versions
+ - **App doesn't start**: Check app.py for errors
+ - **Data not loading**: Verify data files are uploaded
+ - **Memory issues**: Use a smaller dataset or optimize code
+
+ ## Support
+ See DEPLOYMENT_GUIDE.md for detailed instructions.
README.md CHANGED
@@ -6,50 +6,43 @@ colorTo: purple
  sdk: docker
  pinned: false
  license: mit
- app_port: 8080
  ---

  # Textilindo AI Assistant

- AI-powered customer service assistant for Textilindo textile company.

  ## Features

- - 🤖 **Smart AI Assistant**: Answers customer questions about products, shipping, and company policies
- - 📚 **Knowledge Base**: Uses 182+ training examples for context-aware responses
- - 🇮🇩 **Indonesian Language**: Responds in friendly Indonesian language
- - 🛍️ **Sales Focus**: Helps customers with product recommendations and ordering

- ## API Endpoints

- - `GET /` - API documentation
- - `GET /health` - Health check
- - `POST /chat` - Chat with AI
- - `GET /stats` - Dataset statistics

- ## Usage

- Send a POST request to `/chat` with your message:

- ```json
- {
-   "message": "dimana lokasi textilindo?",
-   "max_tokens": 300,
-   "temperature": 0.7
- }
- ```

- ## Dataset

- The assistant is trained on 182+ examples covering:
- - Company location and hours
- - Product information
- - Shipping and payment policies
- - Customer service scenarios

- ## Technology

- - **Backend**: Flask (Python)
- - **AI**: Hugging Face Transformers
- - **Data**: JSONL format with RAG (Retrieval-Augmented Generation)
- - **Deployment**: Hugging Face Spaces

  sdk: docker
  pinned: false
  license: mit
+ app_port: 7860
+ hardware: gpu-basic
  ---

  # Textilindo AI Assistant

+ AI Assistant for Textilindo with training and inference capabilities.

  ## Features

+ - 🤖 AI model training with LoRA
+ - 📊 Dataset creation and management
+ - 🧪 Model testing and inference
+ - 🔗 External service integration
+ - 📱 Web interface for all operations

+ ## Usage

+ 1. **Check Training Ready**: Verify all components are ready
+ 2. **Create Dataset**: Generate sample training data
+ 3. **Setup Training**: Download models and setup environment
+ 4. **Train Model**: Start the training process
+ 5. **Test Model**: Interact with the trained model

+ ## Hardware Requirements

+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher

+ ## Environment Variables

+ Set these in your Space settings:

+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (optional)
+ - `NOVITA_API_KEY`: Your Novita AI API key (optional)

+ ## Support

+ For issues and questions, check the logs and health endpoint.
README_HF_SPACES.md ADDED
@@ -0,0 +1,166 @@
+ # Textilindo AI Assistant - Hugging Face Spaces
+
+ This is the Hugging Face Spaces deployment version of the Textilindo AI Assistant.
+
+ ## 🚀 Quick Start
+
+ 1. **Fork this repository**
+ 2. **Create a new Hugging Face Space**
+ 3. **Use the following settings:**
+    - **SDK**: Docker
+    - **Hardware**: CPU Basic (or GPU if available)
+    - **Visibility**: Public or Private
+
+ ## 📁 File Structure
+
+ ```
+ textilindo-ai-inference/
+ ├── app.py                  # Main entry point
+ ├── app_hf_spaces.py        # HF Spaces specific app
+ ├── health_check.py         # Health check endpoint
+ ├── Dockerfile_hf_spaces    # Optimized Dockerfile for HF Spaces
+ ├── requirements_fixed.txt  # Fixed dependencies
+ ├── scripts/                # All training and utility scripts
+ │   ├── check_training_ready.py
+ │   ├── create_sample_dataset.py
+ │   ├── setup_textilindo_training.py
+ │   ├── train_textilindo_ai.py
+ │   ├── test_textilindo_ai.py
+ │   └── ... (all other scripts)
+ ├── configs/                # Configuration files
+ ├── data/                   # Training data
+ └── models/                 # Model storage
+ ```
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ Set these in your Hugging Face Space settings:
+
+ ```bash
+ # Optional: Hugging Face Hub token for model downloads
+ HUGGINGFACE_TOKEN=your_token_here
+
+ # Optional: Novita AI API key for external training
+ NOVITA_API_KEY=your_novita_key_here
+
+ # Python path
+ PYTHONPATH=/app
+ ```
+
+ ### Hardware Requirements
+
+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher
+
+ ## 🎯 Features
+
+ ### Available Scripts
+
+ The interface provides access to all training and utility scripts:
+
+ 1. **Setup & Training**
+    - `check_training_ready.py` - Verify all components are ready
+    - `setup_textilindo_training.py` - Download models and set up the environment
+    - `train_textilindo_ai.py` - Train the AI model with LoRA
+    - `create_sample_dataset.py` - Create sample training data
+
+ 2. **Testing & Inference**
+    - `test_textilindo_ai.py` - Test the trained model
+    - `inference_textilindo_ai.py` - Run inference with the model
+
+ 3. **External Services**
+    - `test_novita_connection.py` - Test Novita AI connection
+    - `novita_ai_setup.py` - Set up Novita AI integration
+
+ ## 🚀 Usage
+
+ 1. **Access the Space**: Visit your deployed Hugging Face Space
+ 2. **Check Status**: Use the "Check Training Ready" button to verify setup
+ 3. **Create Dataset**: Use "Create Sample Dataset" to generate training data
+ 4. **Setup Training**: Use "Setup Training" to download models
+ 5. **Train Model**: Use "Train Model" to start the training process
+ 6. **Test Model**: Use "Test Model" to interact with the trained model
+
+ ## 📊 Training Process
+
+ ### Step 1: Check Readiness
+ ```bash
+ python scripts/check_training_ready.py
+ ```
+
+ ### Step 2: Create Dataset
+ ```bash
+ python scripts/create_sample_dataset.py
+ ```
+
+ ### Step 3: Setup Training
+ ```bash
+ python scripts/setup_textilindo_training.py
+ ```
+
+ ### Step 4: Train Model
+ ```bash
+ python scripts/train_textilindo_ai.py
+ ```
+
+ ### Step 5: Test Model
+ ```bash
+ python scripts/test_textilindo_ai.py
+ ```
+
+ ## 🔍 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Dependency Conflicts**
+    - Use `requirements_fixed.txt` instead of `requirements.txt`
+    - The fixed version resolves huggingface-hub conflicts
+
+ 2. **Memory Issues**
+    - Use CPU Basic for inference only
+    - Use GPU Basic or higher for training
+    - Consider using smaller models for limited resources
+
+ 3. **Script Execution**
+    - All scripts are made executable in the Dockerfile
+    - Check the output logs for detailed error messages
+
+ 4. **Model Download**
+    - Ensure you have a valid HUGGINGFACE_TOKEN
+    - Some models may require authentication
+
+ ### Health Check
+
+ Visit the `/health` endpoint to check the application status:
+
+ ```bash
+ curl https://your-space-name.hf.space/health
+ ```
+
+ ## 📝 Notes
+
+ - **Training Time**: Training can take 1-3 hours depending on hardware
+ - **Storage**: Models and data are stored in the Space's persistent storage
+ - **Logs**: Check the Space logs for detailed execution information
+ - **Restart**: You may need to restart the Space after training
+
+ ## 🆘 Support
+
+ If you encounter issues:
+
+ 1. Check the Space logs
+ 2. Verify all environment variables are set
+ 3. Ensure you have sufficient hardware resources
+ 4. Check the health endpoint for system status
+
+ ## 🔄 Updates
+
+ To update the Space:
+
+ 1. Push changes to your repository
+ 2. The Space will automatically rebuild
+ 3. Check the build logs for any issues
+ 4. Restart if necessary
Textilindo-2 CHANGED
@@ -1 +1 @@
- Subproject commit 741eedf932ff96044ac7a399537c7a077316669a
+ Subproject commit 60664bc394261bea116661f76431aaafd5f5eaab
app.py CHANGED
@@ -1,285 +1,33 @@
  #!/usr/bin/env python3
  """
- Textilindo AI Assistant - Hugging Face Spaces
  """

- import gradio as gr
  import os
- import json
- import requests
- from difflib import SequenceMatcher
- import logging

- # Setup logging
- logging.basicConfig(level=logging.INFO)
- logger = logging.getLogger(__name__)
-
- def load_system_prompt(default_text):
-     """Load system prompt from configs/system_prompt.md if available"""
-     try:
-         base_dir = os.path.dirname(__file__)
-         md_path = os.path.join(base_dir, 'configs', 'system_prompt.md')
-         if not os.path.exists(md_path):
-             return default_text
-         with open(md_path, 'r', encoding='utf-8') as f:
-             content = f.read()
-         start = content.find('"""')
-         end = content.rfind('"""')
-         if start != -1 and end != -1 and end > start:
-             return content[start+3:end].strip()
-         lines = []
-         for line in content.splitlines():
-             if line.strip().startswith('#'):
-                 continue
-             lines.append(line)
-         cleaned = '\n'.join(lines).strip()
-         return cleaned or default_text
-     except Exception:
-         return default_text
-
- class TextilindoAI:
-     def __init__(self):
-         self.system_prompt = os.getenv(
-             'SYSTEM_PROMPT',
-             load_system_prompt("You are Textilindo AI Assistant. Be concise, helpful, and use Indonesian.")
-         )
-         self.dataset = self.load_all_datasets()
-
-     def load_all_datasets(self):
-         """Load all available datasets"""
-         dataset = []
-
-         # Try multiple possible data directory paths
-         possible_data_dirs = [
-             "data",
-             "./data",
-             "/app/data",
-             os.path.join(os.path.dirname(__file__), "data")
-         ]
-
-         data_dir = None
-         for dir_path in possible_data_dirs:
-             if os.path.exists(dir_path):
-                 data_dir = dir_path
-                 logger.info(f"Found data directory: {data_dir}")
-                 break
-
-         if not data_dir:
-             logger.warning("No data directory found in any of the expected locations")
-             return dataset
-
-         # Load all JSONL files
-         try:
-             for filename in os.listdir(data_dir):
-                 if filename.endswith('.jsonl'):
-                     filepath = os.path.join(data_dir, filename)
-                     try:
-                         with open(filepath, 'r', encoding='utf-8') as f:
-                             for line_num, line in enumerate(f, 1):
-                                 line = line.strip()
-                                 if line:
-                                     try:
-                                         data = json.loads(line)
-                                         dataset.append(data)
-                                     except json.JSONDecodeError as e:
-                                         logger.warning(f"Invalid JSON in (unknown) line {line_num}: {e}")
-                                         continue
-                         logger.info(f"Loaded (unknown): {len([d for d in dataset if d.get('instruction')])} examples")
-                     except Exception as e:
-                         logger.error(f"Error loading (unknown): {e}")
-         except Exception as e:
-             logger.error(f"Error reading data directory {data_dir}: {e}")
-
-         logger.info(f"Total examples loaded: {len(dataset)}")
-         return dataset
-
-     def find_relevant_context(self, user_query, top_k=3):
-         """Find most relevant examples from dataset"""
-         if not self.dataset:
-             return []
-
-         scores = []
-         for i, example in enumerate(self.dataset):
-             instruction = example.get('instruction', '').lower()
-             output = example.get('output', '').lower()
-             query = user_query.lower()
-
-             instruction_score = SequenceMatcher(None, query, instruction).ratio()
-             output_score = SequenceMatcher(None, query, output).ratio()
-             combined_score = (instruction_score * 0.7) + (output_score * 0.3)
-             scores.append((combined_score, i))
-
-         scores.sort(reverse=True)
-         relevant_examples = []
-
-         for score, idx in scores[:top_k]:
-             if score > 0.1:
-                 relevant_examples.append(self.dataset[idx])
-
-         return relevant_examples
-
-     def create_context_prompt(self, user_query, relevant_examples):
-         """Create a prompt with relevant context"""
-         if not relevant_examples:
-             return user_query
-
-         context_parts = []
-         context_parts.append("Berikut adalah beberapa contoh pertanyaan dan jawaban tentang Textilindo:")
-         context_parts.append("")
-
-         for i, example in enumerate(relevant_examples, 1):
-             instruction = example.get('instruction', '')
-             output = example.get('output', '')
-             context_parts.append(f"Contoh {i}:")
-             context_parts.append(f"Pertanyaan: {instruction}")
-             context_parts.append(f"Jawaban: {output}")
-             context_parts.append("")
-
-         context_parts.append("Berdasarkan contoh di atas, jawab pertanyaan berikut:")
-         context_parts.append(f"Pertanyaan: {user_query}")
-         context_parts.append("Jawaban:")
-
-         return "\n".join(context_parts)
-
-     def chat(self, message, max_tokens=300, temperature=0.7):
-         """Generate response using Hugging Face Spaces"""
-         relevant_examples = self.find_relevant_context(message, 3)
-
-         if relevant_examples:
-             enhanced_prompt = self.create_context_prompt(message, relevant_examples)
-             context_used = True
-         else:
-             enhanced_prompt = message
-             context_used = False
-
-         # For now, return a simple response
-         # In production, this would call your HF Space inference endpoint
-         response = f"Terima kasih atas pertanyaan Anda: {message}. Saya akan membantu Anda dengan informasi tentang Textilindo."
-
-         return {
-             "success": True,
-             "response": response,
-             "context_used": context_used,
-             "relevant_examples_count": len(relevant_examples)
-         }
-
- # Initialize AI
- ai = TextilindoAI()
-
- @app.route('/health', methods=['GET'])
- def health_check():
-     """Health check endpoint"""
-     return jsonify({
-         "status": "healthy",
-         "service": "Textilindo AI Assistant",
-         "dataset_loaded": len(ai.dataset) > 0,
-         "dataset_size": len(ai.dataset)
-     })
-
- @app.route('/chat', methods=['POST'])
- def chat():
-     """Main chat endpoint"""
-     try:
-         data = request.get_json()
-
-         if not data:
-             return jsonify({
-                 "success": False,
-                 "error": "No JSON data provided"
-             }), 400
-
-         message = data.get('message', '').strip()
-         if not message:
-             return jsonify({
-                 "success": False,
-                 "error": "Message is required"
-             }), 400
-
-         # Optional parameters
-         max_tokens = data.get('max_tokens', 300)
-         temperature = data.get('temperature', 0.7)
-
-         # Process chat
-         result = ai.chat(message, max_tokens, temperature)
-
-         if result["success"]:
-             return jsonify(result)
-         else:
-             return jsonify(result), 500
-
-     except Exception as e:
-         logger.error(f"Error in chat endpoint: {e}")
-         return jsonify({
-             "success": False,
-             "error": f"Internal server error: {str(e)}"
-         }), 500
-
- @app.route('/stats', methods=['GET'])
- def get_stats():
-     """Get dataset and system statistics"""
-     try:
-         topics = {}
-         for example in ai.dataset:
-             metadata = example.get('metadata', {})
- metadata = example.get('metadata', {})
224
- topic = metadata.get('topic', 'unknown')
225
- topics[topic] = topics.get(topic, 0) + 1
226
-
227
- return jsonify({
228
- "success": True,
229
- "dataset": {
230
- "total_examples": len(ai.dataset),
231
- "topics": topics,
232
- "topics_count": len(topics)
233
- },
234
- "system": {
235
- "api_version": "1.0.0",
236
- "status": "operational"
237
- }
238
- })
239
-
240
- except Exception as e:
241
- logger.error(f"Error in stats endpoint: {e}")
242
- return jsonify({
243
- "success": False,
244
- "error": f"Internal server error: {str(e)}"
245
- }), 500
246
-
247
- @app.route('/', methods=['GET'])
248
- def root():
249
- """API root endpoint with documentation"""
250
- return jsonify({
251
- "service": "Textilindo AI Assistant",
252
- "version": "1.0.0",
253
- "description": "AI-powered customer service for Textilindo",
254
- "endpoints": {
255
- "GET /": "API documentation (this endpoint)",
256
- "GET /health": "Health check",
257
- "POST /chat": "Chat with AI",
258
- "GET /stats": "Dataset and system statistics"
259
- },
260
- "usage": {
261
- "chat": {
262
- "method": "POST",
263
- "url": "/chat",
264
- "body": {
265
- "message": "string (required)",
266
- "max_tokens": "integer (optional, default: 300)",
267
- "temperature": "float (optional, default: 0.7)"
268
- }
269
- }
270
- },
271
- "dataset_size": len(ai.dataset)
272
- })
273
-
274
- if __name__ == '__main__':
275
- logger.info("Starting Textilindo AI Assistant...")
276
- logger.info(f"Dataset loaded: {len(ai.dataset)} examples")
277
-
278
- # For Hugging Face Spaces, use the PORT environment variable
279
- port = int(os.environ.get('PORT', 7860))
280
 
281
- app.run(
282
- debug=False,
283
- host='0.0.0.0',
284
- port=port
285
- )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  #!/usr/bin/env python3
  """
+ Main application entry point for Hugging Face Spaces
  """
  
  import os
+ import sys
+ from pathlib import Path
  
+ def main():
+     """Main entry point"""
+     print("🚀 Textilindo AI Assistant - Starting...")
+ 
+     # Check if we're in a Hugging Face Space
+     if os.getenv("SPACE_ID"):
+         print("🌐 Running in Hugging Face Space")
+         # Import and run the HF Spaces app
+         try:
+             from app_hf_spaces import main as run_hf_app
+             run_hf_app()
+         except ImportError as e:
+             print(f"❌ Error importing HF app: {e}")
+             # Fallback to health check
+             from health_check import app
+             app.run(host="0.0.0.0", port=7860, debug=False)
+     else:
+         print("💻 Running locally")
+         # Run the health check server
+         from health_check import app
+         app.run(host="0.0.0.0", port=7860, debug=True)
+ 
+ if __name__ == "__main__":
+     main()
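Reviewer note: the `SPACE_ID` check above is the only thing that distinguishes the two run modes, since Hugging Face Spaces injects that variable into the environment. A minimal sketch of the detection logic, factored into a testable helper (the name `detect_runtime` is illustrative, not part of this repo):

```python
import os

def detect_runtime(env):
    """Classify the runtime the way app.py does: SPACE_ID present
    means we are inside a Hugging Face Space, otherwise local."""
    return "space" if env.get("SPACE_ID") else "local"

# The SPACE_ID value below is hypothetical; only its presence matters.
print(detect_runtime({"SPACE_ID": "user/textilindo-ai-assistant"}))  # space
print(detect_runtime({}))                                            # local
print(detect_runtime(os.environ))
```

Factoring the check out this way lets the branch be unit-tested without patching `os.environ`.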
app_hf_spaces.py ADDED
@@ -0,0 +1,182 @@
+ #!/usr/bin/env python3
+ """
+ Hugging Face Spaces App for Textilindo AI Assistant
+ Main entry point that can run scripts and serve the application
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ import gradio as gr
+ from pathlib import Path
+ import logging
+ 
+ # Setup logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+ 
+ def run_script(script_name, *args):
+     """Run a script from the scripts directory"""
+     script_path = Path("scripts") / f"{script_name}.py"
+ 
+     if not script_path.exists():
+         return f"❌ Script not found: {script_path}"
+ 
+     try:
+         # Run the script
+         cmd = [sys.executable, str(script_path)] + list(args)
+         result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
+ 
+         if result.returncode == 0:
+             return f"✅ Script executed successfully:\n{result.stdout}"
+         else:
+             return f"❌ Script failed:\n{result.stderr}"
+ 
+     except subprocess.TimeoutExpired:
+         return "❌ Script timed out after 5 minutes"
+     except Exception as e:
+         return f"❌ Error running script: {e}"
+ 
+ def check_training_ready():
+     """Check if everything is ready for training"""
+     return run_script("check_training_ready")
+ 
+ def create_sample_dataset():
+     """Create a sample dataset"""
+     return run_script("create_sample_dataset")
+ 
+ def test_novita_connection():
+     """Test Novita AI connection"""
+     return run_script("test_novita_connection")
+ 
+ def setup_textilindo_training():
+     """Setup Textilindo training environment"""
+     return run_script("setup_textilindo_training")
+ 
+ def train_textilindo_ai():
+     """Train Textilindo AI model"""
+     return run_script("train_textilindo_ai")
+ 
+ def test_textilindo_ai():
+     """Test the trained Textilindo AI model"""
+     return run_script("test_textilindo_ai")
+ 
+ def list_available_scripts():
+     """List all available scripts"""
+     scripts_dir = Path("scripts")
+     if not scripts_dir.exists():
+         return "❌ Scripts directory not found"
+ 
+     scripts = []
+     for script_file in scripts_dir.glob("*.py"):
+         if script_file.name != "__init__.py":
+             scripts.append(f"📄 {script_file.name}")
+ 
+     if scripts:
+         return "📋 Available Scripts:\n" + "\n".join(scripts)
+     else:
+         return "❌ No scripts found"
+ 
+ def create_interface():
+     """Create the Gradio interface"""
+ 
+     with gr.Blocks(title="Textilindo AI Assistant - Script Runner") as interface:
+         gr.Markdown("""
+         # 🤖 Textilindo AI Assistant - Script Runner
+ 
+         This interface allows you to run various scripts for the Textilindo AI Assistant.
+         """)
+ 
+         with gr.Tab("Setup & Training"):
+             gr.Markdown("### Setup and Training Scripts")
+ 
+             with gr.Row():
+                 check_btn = gr.Button("🔍 Check Training Ready", variant="secondary")
+                 setup_btn = gr.Button("⚙️ Setup Training", variant="primary")
+                 train_btn = gr.Button("🚀 Train Model", variant="primary")
+ 
+             with gr.Row():
+                 dataset_btn = gr.Button("📊 Create Sample Dataset", variant="secondary")
+                 test_btn = gr.Button("🧪 Test Model", variant="secondary")
+ 
+         with gr.Tab("External Services"):
+             gr.Markdown("### External Service Integration")
+ 
+             with gr.Row():
+                 novita_btn = gr.Button("🔗 Test Novita AI Connection", variant="secondary")
+ 
+         with gr.Tab("Scripts Info"):
+             gr.Markdown("### Available Scripts")
+ 
+             with gr.Row():
+                 list_btn = gr.Button("📋 List All Scripts", variant="secondary")
+ 
+         # Output area
+         output = gr.Textbox(
+             label="Output",
+             lines=20,
+             max_lines=30,
+             show_copy_button=True
+         )
+ 
+         # Event handlers
+         check_btn.click(
+             check_training_ready,
+             outputs=output
+         )
+ 
+         setup_btn.click(
+             setup_textilindo_training,
+             outputs=output
+         )
+ 
+         train_btn.click(
+             train_textilindo_ai,
+             outputs=output
+         )
+ 
+         dataset_btn.click(
+             create_sample_dataset,
+             outputs=output
+         )
+ 
+         test_btn.click(
+             test_textilindo_ai,
+             outputs=output
+         )
+ 
+         novita_btn.click(
+             test_novita_connection,
+             outputs=output
+         )
+ 
+         list_btn.click(
+             list_available_scripts,
+             outputs=output
+         )
+ 
+     return interface
+ 
+ def main():
+     """Main function"""
+     print("🚀 Starting Textilindo AI Assistant - Hugging Face Spaces")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not Path("scripts").exists():
+         print("❌ Scripts directory not found. Please ensure you're in the correct directory.")
+         sys.exit(1)
+ 
+     # Create and launch the interface
+     interface = create_interface()
+ 
+     # Launch the interface
+     interface.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,
+         debug=False
+     )
+ 
+ if __name__ == "__main__":
+     main()
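The `run_script` helper above is the core of this interface: every button shells out to `scripts/<name>.py` with the current interpreter and folds the exit status, stdout, and stderr into a single string for the Gradio textbox. A stripped-down sketch of that pattern (`run_script_sketch` is an illustrative name; it takes interpreter arguments directly instead of resolving `scripts/<name>.py`):

```python
import subprocess
import sys

def run_script_sketch(*cmd_args, timeout=300):
    """Run a Python target with the current interpreter and fold the
    result into one status string, as app_hf_spaces.run_script does."""
    cmd = [sys.executable, *cmd_args]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "❌ timed out"
    if result.returncode == 0:
        return f"✅ ok:\n{result.stdout}"
    return f"❌ failed:\n{result.stderr}"

# Exercise with inline programs instead of real scripts/*.py files
print(run_script_sketch("-c", "print('hello')"))       # starts with "✅ ok:"
print(run_script_sketch("-c", "raise SystemExit(1)"))  # starts with "❌ failed:"
```

Returning a string for both success and failure (rather than raising) is what lets each Gradio handler wire straight into a single output textbox.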
deploy_final.py ADDED
@@ -0,0 +1,190 @@
+ #!/usr/bin/env python3
+ """
+ Final deployment script for Textilindo AI Assistant to Hugging Face Spaces
+ """
+ 
+ import os
+ import shutil
+ import subprocess
+ import sys
+ from pathlib import Path
+ 
+ def check_huggingface_cli():
+     """Check if huggingface-cli is available"""
+     try:
+         result = subprocess.run(["huggingface-cli", "--version"], capture_output=True, text=True)
+         if result.returncode == 0:
+             print("✅ Hugging Face CLI is available")
+             return True
+         else:
+             print("❌ Hugging Face CLI not found")
+             return False
+     except FileNotFoundError:
+         print("❌ Hugging Face CLI not found")
+         return False
+ 
+ def login_to_huggingface():
+     """Login to Hugging Face"""
+     print("🔐 Logging in to Hugging Face...")
+     try:
+         subprocess.run(["huggingface-cli", "login"], check=True)
+         print("✅ Successfully logged in to Hugging Face")
+         return True
+     except subprocess.CalledProcessError:
+         print("❌ Failed to login to Hugging Face")
+         return False
+ 
+ def create_space():
+     """Create a new Hugging Face Space"""
+     username = input("Enter your Hugging Face username: ").strip()
+     if not username:
+         print("❌ Username is required")
+         return None
+ 
+     space_repo = f"{username}/textilindo-ai-assistant"
+ 
+     print(f"🚀 Creating space: {space_repo}")
+     try:
+         subprocess.run([
+             "huggingface-cli", "repo", "create",
+             "textilindo-ai-assistant",
+             "--type", "space",
+             "--space_sdk", "gradio"
+         ], check=True)
+         print(f"✅ Space created successfully: https://huggingface.co/spaces/{space_repo}")
+         return space_repo
+     except subprocess.CalledProcessError:
+         print("❌ Failed to create space")
+         return None
+ 
+ def prepare_files():
+     """Prepare files for deployment"""
+     print("📁 Preparing files for deployment...")
+ 
+     # Check if all required files exist
+     required_files = [
+         "app.py",
+         "requirements.txt",
+         "README.md",
+         "configs/system_prompt.md",
+         "data/textilindo_training_data.jsonl"
+     ]
+ 
+     missing_files = [f for f in required_files if not os.path.exists(f)]
+     if missing_files:
+         print(f"❌ Missing required files: {missing_files}")
+         return False
+ 
+     print("✅ All required files are present")
+     return True
+ 
+ def deploy_files(space_repo):
+     """Deploy files to the space"""
+     print(f"📤 Deploying files to {space_repo}...")
+ 
+     # Clone the space repository
+     clone_url = f"https://huggingface.co/spaces/{space_repo}"
+     temp_dir = "temp_space"
+ 
+     try:
+         # Remove temp directory if it exists
+         if os.path.exists(temp_dir):
+             shutil.rmtree(temp_dir)
+ 
+         # Clone the repository
+         subprocess.run(["git", "clone", clone_url, temp_dir], check=True)
+         print("✅ Repository cloned successfully")
+ 
+         # Copy files to the cloned repository
+         files_to_copy = [
+             "app.py",
+             "requirements.txt",
+             "README.md",
+             "configs/",
+             "data/"
+         ]
+ 
+         for file in files_to_copy:
+             if os.path.exists(file):
+                 if os.path.isdir(file):
+                     # Copy directory
+                     subprocess.run(["cp", "-r", file, temp_dir], check=True)
+                 else:
+                     # Copy file
+                     subprocess.run(["cp", file, temp_dir], check=True)
+                 print(f"✅ Copied {file}")
+ 
+         # Change to the cloned directory
+         os.chdir(temp_dir)
+ 
+         # Add all files to git
+         subprocess.run(["git", "add", "."], check=True)
+ 
+         # Commit changes
+         subprocess.run(["git", "commit", "-m", "Initial deployment of Textilindo AI Assistant"], check=True)
+ 
+         # Push to the space
+         subprocess.run(["git", "push"], check=True)
+ 
+         print("✅ Files deployed successfully!")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Deployment failed: {e}")
+         return False
+     finally:
+         # Clean up: leave the clone before deleting it (rmtree on the cwd would fail)
+         if os.path.basename(os.getcwd()) == temp_dir:
+             os.chdir("..")
+         if os.path.exists(temp_dir):
+             shutil.rmtree(temp_dir)
+ 
+ def main():
+     print("🚀 Textilindo AI Assistant - Hugging Face Spaces Deployment")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not os.path.exists("app.py"):
+         print("❌ app.py not found. Please run this script from the project root directory.")
+         return
+ 
+     # Prepare files
+     if not prepare_files():
+         return
+ 
+     # Check Hugging Face CLI
+     if not check_huggingface_cli():
+         print("📦 Installing Hugging Face CLI...")
+         try:
+             subprocess.run([sys.executable, "-m", "pip", "install", "huggingface_hub"], check=True)
+             print("✅ Hugging Face CLI installed")
+         except subprocess.CalledProcessError:
+             print("❌ Failed to install Hugging Face CLI")
+             return
+ 
+     # Login to Hugging Face
+     if not login_to_huggingface():
+         return
+ 
+     # Create space
+     space_repo = create_space()
+     if not space_repo:
+         return
+ 
+     # Deploy files
+     if deploy_files(space_repo):
+         print("\n🎉 Deployment completed successfully!")
+         print(f"🌐 Your app is available at: https://huggingface.co/spaces/{space_repo}")
+         print("\n📋 Next steps:")
+         print("1. Wait for the space to build (usually takes 2-5 minutes)")
+         print("2. Test your application")
+         print("3. Share the link with others!")
+     else:
+         print("\n❌ Deployment failed. Please check the error messages above.")
+ 
+ if __name__ == "__main__":
+     main()
deploy_to_hf_space.py ADDED
@@ -0,0 +1,207 @@
+ #!/usr/bin/env python3
+ """
+ Deployment script for Hugging Face Spaces
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ from pathlib import Path
+ 
+ def check_requirements():
+     """Check if required tools are available"""
+     print("🔍 Checking requirements...")
+ 
+     # Check if git is available
+     try:
+         subprocess.run(["git", "--version"], capture_output=True, check=True)
+         print("✅ Git available")
+     except (subprocess.CalledProcessError, FileNotFoundError):
+         print("❌ Git not found. Please install git.")
+         return False
+ 
+     # Check if huggingface_hub is available
+     try:
+         import huggingface_hub
+         print("✅ Hugging Face Hub available")
+     except ImportError:
+         print("❌ Hugging Face Hub not found. Install with: pip install huggingface_hub")
+         return False
+ 
+     return True
+ 
+ def setup_git_lfs():
+     """Setup Git LFS for large files"""
+     print("📁 Setting up Git LFS...")
+     try:
+         subprocess.run(["git", "lfs", "install"], check=True)
+         print("✅ Git LFS installed")
+         return True
+     except subprocess.CalledProcessError:
+         print("❌ Failed to install Git LFS")
+         return False
+ 
+ def create_gitignore():
+     """Create .gitignore for the project"""
+     print("📝 Creating .gitignore...")
+ 
+     gitignore_content = """
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ 
+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ 
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ 
+ # OS
+ .DS_Store
+ Thumbs.db
+ 
+ # Model files (too large for git)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ 
+ # Cache
+ .cache/
+ __pycache__/
+ 
+ # Logs
+ *.log
+ logs/
+ 
+ # Temporary files
+ *.tmp
+ *.temp
+ 
+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ 
+ # Hugging Face
+ .huggingface/
+ transformers_cache/
+ """
+ 
+     with open(".gitignore", "w") as f:
+         f.write(gitignore_content.strip())
+ 
+     print("✅ .gitignore created")
+ 
+ def create_readme():
+     """Create README.md for the Space"""
+     print("📖 Creating README.md...")
+ 
+     readme_content = """---
+ title: Textilindo AI Assistant
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ app_port: 7860
+ ---
+ 
+ # Textilindo AI Assistant
+ 
+ AI Assistant for Textilindo with training and inference capabilities.
+ 
+ ## Features
+ 
+ - 🤖 AI model training with LoRA
+ - 📊 Dataset creation and management
+ - 🧪 Model testing and inference
+ - 🔗 External service integration
+ - 📱 Web interface for all operations
+ 
+ ## Usage
+ 
+ 1. **Check Training Ready**: Verify all components are ready
+ 2. **Create Dataset**: Generate sample training data
+ 3. **Setup Training**: Download models and setup environment
+ 4. **Train Model**: Start the training process
+ 5. **Test Model**: Interact with the trained model
+ 
+ ## Hardware Requirements
+ 
+ - **Minimum**: CPU Basic (2 vCPU, 8GB RAM)
+ - **Recommended**: GPU Basic (1 T4 GPU, 16GB RAM)
+ - **For Training**: GPU A10G or higher
+ 
+ ## Environment Variables
+ 
+ Set these in your Space settings:
+ 
+ - `HUGGINGFACE_TOKEN`: Your Hugging Face token (optional)
+ - `NOVITA_API_KEY`: Your Novita AI API key (optional)
+ 
+ ## Support
+ 
+ For issues and questions, check the logs and health endpoint.
+ """
+ 
+     with open("README.md", "w") as f:
+         f.write(readme_content)
+ 
+     print("✅ README.md created")
+ 
+ def main():
+     """Main deployment function"""
+     print("🚀 Textilindo AI Assistant - Hugging Face Spaces Deployment")
+     print("=" * 60)
+ 
+     # Check requirements
+     if not check_requirements():
+         print("❌ Requirements not met. Please install missing tools.")
+         sys.exit(1)
+ 
+     # Setup Git LFS
+     if not setup_git_lfs():
+         print("❌ Failed to setup Git LFS")
+         sys.exit(1)
+ 
+     # Create necessary files
+     create_gitignore()
+     create_readme()
+ 
+     print("\n✅ Deployment preparation complete!")
+     print("\n📋 Next steps:")
+     print("1. Create a new Hugging Face Space")
+     print("2. Use Docker SDK")
+     print("3. Set hardware to GPU Basic or higher")
+     print("4. Push your code to the Space repository")
+     print("5. Set environment variables if needed")
+     print("\n🔗 Your Space will be available at: https://huggingface.co/spaces/your-username/your-space-name")
+ 
+ if __name__ == "__main__":
+     main()
health_check.py ADDED
@@ -0,0 +1,57 @@
+ #!/usr/bin/env python3
+ """
+ Health check endpoint for Hugging Face Spaces
+ """
+ 
+ from flask import Flask, jsonify
+ import sys
+ from pathlib import Path
+ 
+ app = Flask(__name__)
+ 
+ @app.route('/health')
+ def health_check():
+     """Health check endpoint"""
+     try:
+         # Check if required directories exist
+         required_dirs = ['scripts', 'configs', 'data']
+         missing_dirs = []
+ 
+         for dir_name in required_dirs:
+             if not Path(dir_name).exists():
+                 missing_dirs.append(dir_name)
+ 
+         # Check if scripts directory has files
+         scripts_dir = Path("scripts")
+         script_count = len(list(scripts_dir.glob("*.py"))) if scripts_dir.exists() else 0
+ 
+         status = {
+             "status": "healthy" if not missing_dirs else "degraded",
+             "missing_directories": missing_dirs,
+             "scripts_available": script_count,
+             "python_version": sys.version,
+             "working_directory": str(Path.cwd())
+         }
+ 
+         return jsonify(status)
+ 
+     except Exception as e:
+         return jsonify({
+             "status": "unhealthy",
+             "error": str(e)
+         }), 500
+ 
+ @app.route('/')
+ def root():
+     """Root endpoint"""
+     return jsonify({
+         "message": "Textilindo AI Assistant - Hugging Face Spaces",
+         "status": "running",
+         "endpoints": {
+             "health": "/health",
+             "app": "/app"
+         }
+     })
+ 
+ if __name__ == "__main__":
+     app.run(host="0.0.0.0", port=7860, debug=False)
push_to_hf_space.py ADDED
@@ -0,0 +1,269 @@
+ #!/usr/bin/env python3
+ """
+ Script to push Textilindo AI Assistant to Hugging Face Spaces
+ """
+ 
+ import os
+ import sys
+ import subprocess
+ import shutil
+ from pathlib import Path
+ 
+ def check_git_status():
+     """Check git status and setup"""
+     print("🔍 Checking Git status...")
+ 
+     try:
+         # Check if we're in a git repository
+         result = subprocess.run(["git", "status"], capture_output=True, text=True)
+         if result.returncode != 0:
+             print("❌ Not in a git repository. Initializing...")
+             subprocess.run(["git", "init"], check=True)
+             print("✅ Git repository initialized")
+         else:
+             print("✅ Git repository found")
+ 
+         return True
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git error: {e}")
+         return False
+     except FileNotFoundError:
+         print("❌ Git not found. Please install git.")
+         return False
+ 
+ def setup_git_lfs():
+     """Setup Git LFS for large files"""
+     print("📁 Setting up Git LFS...")
+ 
+     try:
+         # Install Git LFS
+         subprocess.run(["git", "lfs", "install"], check=True)
+         print("✅ Git LFS installed")
+ 
+         # Create .gitattributes for LFS
+         gitattributes = """
+ # Large model files
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.hdf5 filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ """
+ 
+         with open(".gitattributes", "w") as f:
+             f.write(gitattributes.strip())
+ 
+         print("✅ .gitattributes created")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git LFS setup failed: {e}")
+         return False
+ 
+ def create_gitignore():
+     """Create comprehensive .gitignore"""
+     print("📝 Creating .gitignore...")
+ 
+     gitignore_content = """
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ 
+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ .venv/
+ 
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *.sublime-*
+ 
+ # OS
+ .DS_Store
+ Thumbs.db
+ *.tmp
+ *.temp
+ 
+ # Model files (use LFS for these)
+ models/
+ *.bin
+ *.safetensors
+ *.pt
+ *.pth
+ *.ckpt
+ 
+ # Cache
+ .cache/
+ __pycache__/
+ transformers_cache/
+ .huggingface/
+ 
+ # Logs
+ *.log
+ logs/
+ wandb/
+ 
+ # Environment variables
+ .env
+ .env.local
+ .env.production
+ .env.staging
+ 
+ # Jupyter
+ .ipynb_checkpoints/
+ 
+ # PyTorch
+ *.pth
+ *.pt
+ 
+ # Data files (use LFS for large ones)
+ data/*.jsonl
+ data/*.json
+ data/*.csv
+ data/*.parquet
+ """
+ 
+     with open(".gitignore", "w") as f:
+         f.write(gitignore_content.strip())
+ 
+     print("✅ .gitignore created")
+ 
+ def prepare_dockerfile():
+     """Ensure we're using the correct Dockerfile"""
+     print("🐳 Preparing Dockerfile...")
+ 
+     # Copy the HF Spaces Dockerfile to the root
+     if Path("Dockerfile_hf_spaces").exists():
+         shutil.copy("Dockerfile_hf_spaces", "Dockerfile")
+         print("✅ Dockerfile prepared for HF Spaces")
+         return True
+     else:
+         print("❌ Dockerfile_hf_spaces not found")
+         return False
+ 
+ def create_space_config():
+     """Create Hugging Face Space configuration"""
+     print("⚙️ Creating Space configuration...")
+ 
+     # Create .huggingface directory
+     hf_dir = Path(".huggingface")
+     hf_dir.mkdir(exist_ok=True)
+ 
+     # Space configuration (Spaces read this metadata from the README.md frontmatter)
+     space_config = {
+         "title": "Textilindo AI Assistant",
+         "emoji": "🤖",
+         "colorFrom": "blue",
+         "colorTo": "purple",
+         "sdk": "docker",
+         "pinned": False,
+         "license": "mit",
+         "app_port": 7860,
+         "hardware": "gpu-basic"
+     }
+ 
+     # Nothing to write here: the README.md frontmatter carries this configuration
+     print("✅ Space configuration ready")
+ 
+ def commit_and_push():
+     """Commit and push to repository"""
+     print("📤 Committing and pushing...")
+ 
+     try:
+         # Add all files
+         subprocess.run(["git", "add", "."], check=True)
+         print("✅ Files staged")
+ 
+         # Commit
+         commit_message = "Initial commit: Textilindo AI Assistant for HF Spaces"
+         subprocess.run(["git", "commit", "-m", commit_message], check=True)
+         print("✅ Changes committed")
+ 
+         # Check if remote exists
+         result = subprocess.run(["git", "remote", "-v"], capture_output=True, text=True)
+         if not result.stdout.strip():
+             print("⚠️ No remote repository found.")
+             print("Please add your Hugging Face Space repository as remote:")
+             print("git remote add origin https://huggingface.co/spaces/your-username/your-space-name")
+             return False
+ 
+         # Push to remote
+         subprocess.run(["git", "push", "origin", "main"], check=True)
+         print("✅ Pushed to remote repository")
+         return True
+ 
+     except subprocess.CalledProcessError as e:
+         print(f"❌ Git operation failed: {e}")
+         return False
+ 
+ def main():
+     """Main deployment function"""
+     print("🚀 Textilindo AI Assistant - Push to Hugging Face Spaces")
+     print("=" * 60)
+ 
+     # Check if we're in the right directory
+     if not Path("scripts").exists():
+         print("❌ Scripts directory not found. Please run from the project root.")
+         sys.exit(1)
+ 
+     # Step 1: Check git status
+     if not check_git_status():
+         sys.exit(1)
+ 
+     # Step 2: Setup Git LFS
+     if not setup_git_lfs():
+         sys.exit(1)
+ 
+     # Step 3: Create .gitignore
+     create_gitignore()
+ 
+     # Step 4: Prepare Dockerfile
+     if not prepare_dockerfile():
+         sys.exit(1)
+ 
+     # Step 5: Create space configuration
+     create_space_config()
+ 
+     # Step 6: Commit and push
+     if commit_and_push():
+         print("\n🎉 Successfully pushed to Hugging Face Spaces!")
+         print("\n📋 Next steps:")
+         print("1. Go to your Hugging Face Space")
+         print("2. Check the build logs")
+         print("3. Set environment variables if needed")
+         print("4. Test the application")
+     else:
+         print("\n❌ Failed to push. Please check the errors above.")
+         print("\n💡 Manual steps:")
+         print("1. Add remote: git remote add origin <your-hf-space-url>")
+         print("2. Push: git push origin main")
+ 
+ if __name__ == "__main__":
+     main()
requirements.txt CHANGED
@@ -1,5 +1,42 @@
- gradio>=4.0.0
- requests>=2.31.0
- numpy>=1.24.0
- pandas>=2.0.0
- python-dotenv>=1.0.0
+ # Core ML/AI packages with compatible versions
+ torch==2.1.0
+ transformers==4.35.0
+ accelerate==0.24.0
+ peft==0.6.0
+ datasets==2.14.0
+ scikit-learn==1.3.0
+ 
+ # Hugging Face Hub - use compatible version
+ huggingface-hub>=0.16.4,<0.19.0
+ 
+ # Web framework
+ gradio==4.44.0
+ flask==3.0.0
+ flask-cors==4.0.0
+ 
+ # Utilities
+ requests==2.31.0
+ numpy==1.24.3
+ 
+ # Additional dependencies for Hugging Face Spaces
+ aiofiles>=22.0
+ fastapi>=0.100.0
+ uvicorn>=0.20.0
+ python-multipart>=0.0.9
+ 
+ # For tokenizers compatibility
+ tokenizers>=0.14.0,<0.15.0
+ 
+ # For datasets compatibility
+ pyarrow>=8.0.0
+ dill>=0.3.0
+ xxhash
+ multiprocess
+ 
+ # For accelerate
+ psutil
+ 
+ # For scikit-learn
+ scipy>=1.5.0
+ joblib>=1.1.1
+ threadpoolctl>=2.0.0
requirements_fixed.txt CHANGED
@@ -1,7 +1,42 @@
-flask>=2.3.0
-gunicorn>=21.0.0
-requests>=2.31.0
-openai>=1.0.0
-numpy>=1.24.0
-pandas>=2.0.0
-python-dotenv>=1.0.0
+# Core ML/AI packages with compatible versions
+torch==2.1.0
+transformers==4.35.0
+accelerate==0.24.0
+peft==0.6.0
+datasets==2.14.0
+scikit-learn==1.3.0
+
+# Hugging Face Hub - use compatible version
+huggingface-hub>=0.16.4,<0.19.0
+
+# Web framework
+gradio==4.44.0
+flask==3.0.0
+flask-cors==4.0.0
+
+# Utilities
+requests==2.31.0
+numpy==1.24.3
+
+# Additional dependencies for Hugging Face Spaces
+aiofiles>=22.0
+fastapi>=0.100.0
+uvicorn>=0.20.0
+python-multipart>=0.0.9
+
+# For tokenizers compatibility
+tokenizers>=0.14.0,<0.15.0
+
+# For datasets compatibility
+pyarrow>=8.0.0
+dill>=0.3.0
+xxhash
+multiprocess
+
+# For accelerate
+psutil
+
+# For scikit-learn
+scipy>=1.5.0
+joblib>=1.1.1
+threadpoolctl>=2.0.0
setup_hf_space.py ADDED
@@ -0,0 +1,92 @@
+#!/usr/bin/env python3
+"""
+Setup script for Hugging Face Spaces deployment
+"""
+
+import os
+import sys
+import subprocess
+from pathlib import Path
+
+def check_requirements():
+    """Check if all requirements are met"""
+    print("🔍 Checking requirements...")
+
+    # Check git
+    try:
+        subprocess.run(["git", "--version"], capture_output=True, check=True)
+        print("✅ Git available")
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        print("❌ Git not found. Please install git.")
+        return False
+
+    # Check if we're in the right directory
+    if not Path("scripts").exists():
+        print("❌ Scripts directory not found. Please run from project root.")
+        return False
+
+    print("✅ All requirements met")
+    return True
+
+def create_space_repository():
+    """Guide user to create HF Space repository"""
+    print("\n🌐 Creating Hugging Face Space Repository")
+    print("=" * 50)
+    print("1. Go to https://huggingface.co/spaces")
+    print("2. Click 'Create new Space'")
+    print("3. Fill in the details:")
+    print("   - Name: textilindo-ai-assistant")
+    print("   - SDK: Docker")
+    print("   - Hardware: GPU Basic")
+    print("   - Visibility: Public or Private")
+    print("4. Click 'Create Space'")
+    print("5. Copy the repository URL (e.g., https://huggingface.co/spaces/your-username/textilindo-ai-assistant)")
+
+    return input("\n📋 Enter your Hugging Face Space URL: ").strip()
+
+def setup_git_remote(space_url):
+    """Setup git remote for the space"""
+    print("\n🔗 Setting up git remote...")
+
+    try:
+        # Remove existing remote if any
+        subprocess.run(["git", "remote", "remove", "origin"], capture_output=True)
+
+        # Add new remote
+        subprocess.run(["git", "remote", "add", "origin", space_url], check=True)
+        print(f"✅ Remote added: {space_url}")
+        return True
+
+    except subprocess.CalledProcessError as e:
+        print(f"❌ Failed to add remote: {e}")
+        return False
+
+def main():
+    """Main setup function"""
+    print("🚀 Textilindo AI Assistant - Hugging Face Spaces Setup")
+    print("=" * 60)
+
+    # Check requirements
+    if not check_requirements():
+        sys.exit(1)
+
+    # Guide user to create space
+    space_url = create_space_repository()
+
+    if not space_url:
+        print("❌ No space URL provided. Exiting.")
+        sys.exit(1)
+
+    # Setup git remote
+    if not setup_git_remote(space_url):
+        sys.exit(1)
+
+    print("\n✅ Setup complete!")
+    print("\n📋 Next steps:")
+    print("1. Run: python push_to_hf_space.py")
+    print("2. Check your Hugging Face Space")
+    print("3. Monitor the build logs")
+    print("4. Test the application when ready")
+
+if __name__ == "__main__":
+    main()
test_build.py ADDED
@@ -0,0 +1,155 @@
+#!/usr/bin/env python3
+"""
+Test script to verify the build configuration
+"""
+
+import sys
+import subprocess
+from pathlib import Path
+
+def test_imports():
+    """Test if all required packages can be imported"""
+    print("🔍 Testing package imports...")
+
+    required_packages = [
+        "torch",
+        "transformers",
+        "accelerate",
+        "peft",
+        "datasets",
+        "gradio",
+        "flask",
+        "requests",
+        "numpy",
+        "sklearn"
+    ]
+
+    failed_imports = []
+
+    for package in required_packages:
+        try:
+            __import__(package)
+            print(f"✅ {package}")
+        except ImportError as e:
+            print(f"❌ {package}: {e}")
+            failed_imports.append(package)
+
+    return len(failed_imports) == 0
+
+def test_scripts():
+    """Test if scripts can be found and are executable"""
+    print("\n🔍 Testing scripts...")
+
+    scripts_dir = Path("scripts")
+    if not scripts_dir.exists():
+        print("❌ Scripts directory not found")
+        return False
+
+    required_scripts = [
+        "check_training_ready.py",
+        "create_sample_dataset.py",
+        "setup_textilindo_training.py",
+        "train_textilindo_ai.py",
+        "test_textilindo_ai.py"
+    ]
+
+    missing_scripts = []
+
+    for script in required_scripts:
+        script_path = scripts_dir / script
+        if script_path.exists():
+            print(f"✅ {script}")
+        else:
+            print(f"❌ {script} not found")
+            missing_scripts.append(script)
+
+    return len(missing_scripts) == 0
+
+def test_config_files():
+    """Test if configuration files exist"""
+    print("\n🔍 Testing configuration files...")
+
+    config_files = [
+        "requirements.txt",
+        "app.py",
+        "app_hf_spaces.py",
+        "health_check.py"
+    ]
+
+    missing_files = []
+
+    for config_file in config_files:
+        if Path(config_file).exists():
+            print(f"✅ {config_file}")
+        else:
+            print(f"❌ {config_file} not found")
+            missing_files.append(config_file)
+
+    return len(missing_files) == 0
+
+def test_script_execution():
+    """Test if a simple script can be executed"""
+    print("\n🔍 Testing script execution...")
+
+    try:
+        # Test the check_training_ready script
+        result = subprocess.run([
+            sys.executable, "scripts/check_training_ready.py"
+        ], capture_output=True, text=True, timeout=30)
+
+        if result.returncode == 0:
+            print("✅ Script execution successful")
+            return True
+        else:
+            print(f"❌ Script execution failed: {result.stderr}")
+            return False
+
+    except subprocess.TimeoutExpired:
+        print("❌ Script execution timed out")
+        return False
+    except Exception as e:
+        print(f"❌ Script execution error: {e}")
+        return False
+
+def main():
+    """Main test function"""
+    print("🧪 Textilindo AI Assistant - Build Test")
+    print("=" * 50)
+
+    tests = [
+        ("Package Imports", test_imports),
+        ("Scripts Availability", test_scripts),
+        ("Configuration Files", test_config_files),
+        ("Script Execution", test_script_execution)
+    ]
+
+    results = []
+
+    for test_name, test_func in tests:
+        print(f"\n📋 {test_name}")
+        print("-" * 30)
+        result = test_func()
+        results.append((test_name, result))
+
+    print("\n" + "=" * 50)
+    print("📊 Test Results:")
+
+    all_passed = True
+    for test_name, result in results:
+        status = "✅ PASS" if result else "❌ FAIL"
+        print(f"{status} {test_name}")
+        if not result:
+            all_passed = False
+
+    if all_passed:
+        print("\n🎉 All tests passed! Build configuration is ready.")
+        print("\n📋 Next steps:")
+        print("1. Push to Hugging Face Space")
+        print("2. Set environment variables")
+        print("3. Deploy and test")
+    else:
+        print("\n❌ Some tests failed. Please fix the issues above.")
+        sys.exit(1)
+
+if __name__ == "__main__":
+    main()
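One detail in the `test_imports` function above is worth noting: pip package names and Python import names can differ (`scikit-learn` from requirements.txt is imported as `sklearn`, which is why the list checks `sklearn`). A lighter-weight variant of the same check can use `importlib.util.find_spec`, which locates a module without executing it and so avoids paying the import cost of heavy packages like `torch`. This is a sketch, not part of the commit; the mapping dict and function name are illustrative:

```python
import importlib.util

# Map pip package names to import names where the two differ
# (e.g. "scikit-learn" installs a module named "sklearn").
PIP_TO_IMPORT = {"scikit-learn": "sklearn"}

def package_available(pip_name: str) -> bool:
    """Return True if the package's module can be located,
    without actually importing it."""
    module = PIP_TO_IMPORT.get(pip_name, pip_name.replace("-", "_"))
    return importlib.util.find_spec(module) is not None

print(package_available("os"))
print(package_available("no-such-package-xyz"))
```

The trade-off: `find_spec` confirms the package is installed but will not catch packages that are present yet fail at import time (e.g. a torch build with a missing CUDA library), which the `__import__`-based test does catch.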