---
title: Phi-3.5-MoE Expert Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
entrypoint: start.sh
startup_duration_timeout: 600
pinned: false
license: mit
short_description: AI assistant with expert routing and CPU/GPU support
models:
- microsoft/Phi-3.5-MoE-instruct
---

# 🤖 Phi-3.5-MoE Expert Assistant

A robust, production-ready AI assistant powered by Microsoft's Phi-3.5-MoE model with intelligent expert routing and comprehensive CPU/GPU environment support.

## 🚀 Key Features

- **🧠 Expert Routing**: Automatically routes queries to specialized experts (Code, Math, Reasoning, Multilingual, General)
- **🔧 Environment Adaptive**: Works seamlessly on both CPU and GPU environments
- **🛡️ Robust Dependency Management**: Conditional installation of dependencies based on environment
- **📦 Fault Tolerance**: Handles missing dependencies with fallback mechanisms
- **⚡ Performance Optimized**: Environment-specific optimizations for best performance

## 🔧 Recent Fixes

- ✅ **Missing Dependencies**: Added `einops` to requirements, conditional `flash_attn` installation
- ✅ **Deprecated Parameters**: Fixed all `torch_dtype` → `dtype` usage
- ✅ **CPU Compatibility**: Automatic CPU-safe model revision selection
- ✅ **Error Handling**: Comprehensive fallback mechanisms
- ✅ **Security**: Updated to Gradio 4.44.0+ for security fixes

## 🏗️ Architecture

```
app.py              # Main application entry point
preinstall.py       # Pre-installation script for dependencies
model_patch.py      # Patch for handling missing dependencies
start.sh            # Startup script
requirements.txt    # Core dependencies
```

## 🎯 How It Works

1. **Environment Detection**: Automatically detects CPU vs GPU environment
2. **Dependency Management**: Installs required dependencies based on environment
3. **Model Configuration**: Uses optimal settings for each environment
4. **Expert Routing**: Classifies queries and routes to appropriate expert
5. **Graceful Fallbacks**: Works even when dependencies are missing

## 📊 Performance

| Environment | Startup | Memory | Tokens/sec |
|-------------|---------|--------|------------|
| **CPU**     | 3-5 min | 8-12 GB | 2-5 |
| **GPU**     | 2-3 min | 16-20 GB | 15-30 |

## 🔍 Troubleshooting

If you encounter issues:
1. Check the logs for dependency installation
2. Verify the pre-installation script executed successfully
3. Ensure all required packages are installed
4. Try the fallback mode if model loading fails

---

**Built with ❤️ for reliable, production-ready AI applications**