--- title: Phi-3.5-MoE Expert Assistant emoji: 🤖 colorFrom: blue colorTo: purple sdk: gradio sdk_version: 4.44.0 app_file: app.py entrypoint: start.sh startup_duration_timeout: 600 pinned: false license: mit short_description: AI assistant with expert routing and CPU/GPU support models: - microsoft/Phi-3.5-MoE-instruct --- # 🤖 Phi-3.5-MoE Expert Assistant A robust, production-ready AI assistant powered by Microsoft's Phi-3.5-MoE model with intelligent expert routing and comprehensive CPU/GPU environment support. ## 🚀 Key Features - **🧠 Expert Routing**: Automatically routes queries to specialized experts (Code, Math, Reasoning, Multilingual, General) - **🔧 Environment Adaptive**: Works seamlessly on both CPU and GPU environments - **🛡️ Robust Dependency Management**: Conditional installation of dependencies based on environment - **📦 Fault Tolerance**: Handles missing dependencies with fallback mechanisms - **⚡ Performance Optimized**: Environment-specific optimizations for best performance ## 🔧 Recent Fixes - ✅ **Missing Dependencies**: Added `einops` to requirements, conditional `flash_attn` installation - ✅ **Deprecated Parameters**: Fixed all `torch_dtype` → `dtype` usage - ✅ **CPU Compatibility**: Automatic CPU-safe model revision selection - ✅ **Error Handling**: Comprehensive fallback mechanisms - ✅ **Security**: Updated to Gradio 4.44.0+ for security fixes ## 🏗️ Architecture ``` app.py # Main application entry point preinstall.py # Pre-installation script for dependencies model_patch.py # Patch for handling missing dependencies start.sh # Startup script requirements.txt # Core dependencies ``` ## 🎯 How It Works 1. **Environment Detection**: Automatically detects CPU vs GPU environment 2. **Dependency Management**: Installs required dependencies based on environment 3. **Model Configuration**: Uses optimal settings for each environment 4. **Expert Routing**: Classifies queries and routes to appropriate expert 5. **Graceful Fallbacks**: Works even when dependencies are missing ## 📊 Performance | Environment | Startup | Memory | Tokens/sec | |-------------|---------|--------|------------| | **CPU** | 3-5 min | 8-12 GB | 2-5 | | **GPU** | 2-3 min | 16-20 GB | 15-30 | ## 🔍 Troubleshooting If you encounter issues: 1. Check the logs for dependency installation 2. Verify the pre-installation script executed successfully 3. Ensure all required packages are installed 4. Try the fallback mode if model loading fails --- **Built with ❤️ for reliable, production-ready AI applications**