---
title: DS-STAR
emoji: ✨
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: mit
short_description: Multi-Agent AI System for Automated Data Science Tasks
tags:
- mcp-in-action-track-consumer
- langgraph
- multi-agent
- data-science
- automation
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/658be22d0ccb77b89a142f5a/t4RLYTRMF0_VHuQm5ZF91.png
social_media_post: https://x.com/AnuragDeo6/status/1995172899016380619
---
# ✨ DS-STAR
### **D**ata **S**cience - **S**tructured **T**ask **A**nalysis and **R**esolution
[Hugging Face Space](https://huggingface.co/spaces/Anurag-Deo/DS-STAR) • [MIT License](https://opensource.org/licenses/MIT) • [Python](https://www.python.org/downloads/) • [LangGraph](https://langchain-ai.github.io/langgraph/)

**A powerful multi-agent AI system that automates data science tasks through intelligent collaboration.**

[🚀 Try Demo](https://huggingface.co/spaces/Anurag-Deo/DS-STAR) • [📖 Documentation](#-usage) • [🐛 Report Bug](https://github.com/Anurag-Deo/DS-STAR/issues)

---
## 🎯 What is DS-STAR?
DS-STAR is a **multi-agent AI system** built with LangGraph that takes your natural language questions about data and automatically:
1. 🔍 **Analyzes** your data files to understand their structure
2. 📝 **Plans** a step-by-step approach to answer your question
3. 💻 **Generates** Python code to perform the analysis
4. ✅ **Verifies** the solution meets your requirements
5. 🔄 **Iterates** with smart backtracking if needed
6. 🎯 **Delivers** polished, accurate results

> **Built for the 🤗 Hugging Face MCP 1st Birthday Hackathon**
---
## ✨ Key Features
| Feature | Description |
|---------|-------------|
| 🤖 **Multi-Agent Architecture** | Six specialized agents working in harmony |
| 🔄 **Iterative Refinement** | Automatically improves solutions through multiple cycles |
| 🔙 **Smart Backtracking** | Intelligently reverts failed approaches |
| 📊 **Auto Data Analysis** | Understands your data structure automatically |
| 💻 **Code Generation** | Produces clean, executable Python code |
| 🔌 **Multi-Provider Support** | Works with Google, OpenAI, Anthropic, or custom APIs |
| 🎨 **Modern UI** | Beautiful dark-themed Gradio interface |
---
## 🏗️ Architecture
DS-STAR uses a sophisticated multi-agent workflow powered by LangGraph:
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Analyzer   │────▶│   Planner   │────▶│    Coder    │
│  🔍 Analyze │     │  📝 Plan    │     │  💻 Code    │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                               ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Finalyzer  │◀────│   Router    │◀────│  Verifier   │
│  🎯 Polish  │     │  🔀 Route   │     │  ✅ Verify  │
└─────────────┘     └─────────────┘     └─────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │  Backtrack  │
                    │  ↩️ Retry   │
                    └─────────────┘
```
### Agent Roles
| Agent | Icon | Description |
|-------|------|-------------|
| **Analyzer** | 🔍 | Examines all data files and creates detailed descriptions |
| **Planner** | 📝 | Generates the next logical step in the solution |
| **Coder** | 💻 | Implements the plan as executable Python code |
| **Verifier** | ✅ | Validates whether the solution answers the query |
| **Router** | 🔀 | Decides whether to continue, add steps, or backtrack |
| **Finalyzer** | 🎯 | Polishes and formats the final output |
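
The hand-offs between these agents can be illustrated with a small control loop. This is a minimal sketch in plain Python, not the project's LangGraph graph: the `State` fields, the `run_loop` signature, and the stub agents in `demo` are all hypothetical, and each agent is passed in as an ordinary callable. The router in this sketch returns either `("add_step", None)` or `("backtrack", k)`, where backtracking drops plan steps from index `k` onward so the planner can try again.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class State:
    """Hypothetical shared state passed between agents."""
    query: str
    data_description: str = ""
    plan: List[str] = field(default_factory=list)
    code: str = ""
    verified: bool = False
    iterations: int = 0


def run_loop(
    query: str,
    analyze: Callable[[State], str],
    plan_next: Callable[[State], str],
    write_code: Callable[[State], str],
    verify: Callable[[State], bool],
    route: Callable[[State], Tuple[str, Optional[int]]],
    finalize: Callable[[State], Any],
    max_iterations: int = 20,
) -> Any:
    state = State(query=query)
    state.data_description = analyze(state)      # Analyzer runs once up front
    while state.iterations < max_iterations:
        state.iterations += 1
        state.plan.append(plan_next(state))      # Planner adds one step
        state.code = write_code(state)           # Coder implements the plan
        state.verified = verify(state)           # Verifier checks the result
        if state.verified:
            break
        action, index = route(state)             # Router picks the next move
        if action == "backtrack" and index is not None:
            state.plan = state.plan[:index]      # drop steps from index onward
    return finalize(state)                       # Finalyzer formats the output


def demo() -> dict:
    """Stub agents: the first planned step is wrong, so the router backtracks."""
    calls = {"plan": 0}

    def plan_next(state: State) -> str:
        calls["plan"] += 1
        return "load wrong file" if calls["plan"] == 1 else f"step {calls['plan']}"

    def verify(state: State) -> bool:
        return "load wrong file" not in state.plan and len(state.plan) >= 2

    def route(state: State) -> Tuple[str, Optional[int]]:
        if "load wrong file" in state.plan:
            return "backtrack", state.plan.index("load wrong file")
        return "add_step", None

    return run_loop(
        query="Which category has the highest total sales?",
        analyze=lambda s: "sales.csv: columns category, amount",
        plan_next=plan_next,
        write_code=lambda s: "\n".join(f"# {step}" for step in s.plan),
        verify=verify,
        route=route,
        finalize=lambda s: {
            "plan": s.plan,
            "iterations": s.iterations,
            "verified": s.verified,
        },
    )


print(demo())
```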
---
## 🚀 Quick Start
### Online Demo
Try DS-STAR instantly on Hugging Face Spaces:
👉 **[Launch DS-STAR Demo](https://huggingface.co/spaces/Anurag-Deo/DS-STAR)**
### Local Installation
```bash
# Clone the repository
git clone https://github.com/Anurag-Deo/DS-STAR.git
cd DS-STAR
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
```
Then open http://localhost:7860 in your browser.

---
## 💡 Usage
### Web Interface
1. **Select Provider**: choose Google, OpenAI, Anthropic, or Custom
2. **Enter API Key**: or set it via an environment variable
3. **Upload Data**: drop your CSV, JSON, Excel, or Parquet files
4. **Ask Questions**: type your data science question
5. **Run Analysis**: click "Run Analysis" and watch the magic!
### Example Queries
```
📊 "What percentage of transactions use credit cards?"
📈 "Show me the distribution of transaction amounts"
🏆 "Which category has the highest total sales?"
🔗 "Find correlations between numeric columns"
📋 "Create a summary statistics report"
```
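
To make the first query concrete, here is the kind of short script an analysis like this boils down to. It is a hand-written illustration, not actual DS-STAR output: the `payment_method` column, the sample rows, and the `percentage_matching` helper are invented for the example, and it uses only the standard library so that it runs without extra dependencies.

```python
import csv
import io

# Made-up sample data standing in for an uploaded CSV file.
SAMPLE_CSV = """transaction_id,payment_method,amount
1,credit_card,25.00
2,cash,10.50
3,credit_card,99.99
4,debit_card,42.00
"""


def percentage_matching(csv_text: str, column: str, value: str) -> float:
    """Return the percentage of rows whose `column` equals `value`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0.0  # avoid dividing by zero on an empty file
    matches = sum(1 for row in rows if row[column].strip().lower() == value)
    return 100.0 * matches / len(rows)


share = percentage_matching(SAMPLE_CSV, "payment_method", "credit_card")
print(f"{share:.1f}% of transactions use credit cards")
```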
### Python API
```python
from src.config import get_llm
from src.graph import run_ds_star

# Initialize LLM
llm = get_llm(provider="google", model="gemini-2.0-flash")

# Run DS-STAR
result = run_ds_star(
    query="What is the average transaction amount?",
    llm=llm,
    max_iterations=20,
)
```
---
## 🔌 Supported Providers
| Provider | Models | Environment Variable |
|----------|--------|---------------------|
| **Google** | Gemini 2.0, 1.5 Pro, 1.5 Flash | `GOOGLE_API_KEY` |
| **OpenAI** | GPT-4o, GPT-4, GPT-3.5 | `OPENAI_API_KEY` |
| **Anthropic** | Claude 3.5, Claude 3 | `ANTHROPIC_API_KEY` |
| **Custom** | Any OpenAI-compatible API | Custom Base URL |
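
The mapping in this table can be expressed as a small lookup. The sketch below is an assumption about how a key might be resolved, not the code in `src/config`: `PROVIDER_ENV_VARS` and `resolve_api_key` are hypothetical names. An explicitly supplied key (for example, one typed into the UI) takes precedence, and the custom provider has no environment variable in the table, so it needs an explicit key.

```python
import os
from typing import Mapping, Optional

# Environment variable per provider, taken from the table above.
PROVIDER_ENV_VARS = {
    "google": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_api_key(
    provider: str,
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return an API key, preferring an explicit key over the environment."""
    if explicit_key:
        return explicit_key
    env = os.environ if env is None else env
    name = provider.lower()
    if name not in PROVIDER_ENV_VARS:
        raise ValueError(f"No environment variable known for provider {provider!r}")
    variable = PROVIDER_ENV_VARS[name]
    key = env.get(variable)
    if not key:
        raise RuntimeError(f"Set {variable} or pass an API key explicitly")
    return key
```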
---
## 📁 Project Structure
```
DS-STAR/
├── 📱 app.py                     # Gradio web application
├── 📋 main.py                    # CLI entry point
├── 📄 requirements.txt           # Dependencies
├── 📁 src/
│   ├── 🤖 agents/                # Agent implementations
│   │   ├── analyzer_agent.py
│   │   ├── planner_agent.py
│   │   ├── coder_agent.py
│   │   ├── verifier_agent.py
│   │   ├── router_agent.py
│   │   └── finalyzer_agent.py
│   ├── 🔧 utils/                 # Shared utilities
│   │   ├── state.py              # State schema
│   │   ├── formatters.py         # Text formatting
│   │   └── code_execution.py     # Safe code execution
│   ├── ⚙️ config/                # Configuration
│   │   └── llm_config.py         # LLM setup
│   └── 🔄 graph.py               # LangGraph workflow
├── 🧪 tests/                     # Test suite
└── 📊 data/                      # Sample data files
```
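
The tree lists `code_execution.py` as "Safe code execution", but its contents are not shown here. One common way to run generated code with some isolation is a separate Python process with a timeout and a throwaway working directory. The sketch below illustrates that approach under those assumptions; `run_generated_code` and its return shape are hypothetical, and a subprocess with a timeout limits runaway scripts but is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 30.0) -> dict:
    """Run `code` in a child Python process and capture its output."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,            # keep stray output files out of the repo
                capture_output=True,
                text=True,
                timeout=timeout_s,      # the child is killed if it runs too long
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout_s}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


outcome = run_generated_code("print(sum([1, 2, 3]))")
print(outcome["ok"], outcome["stdout"].strip())
```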
---
## 🧪 Testing
```bash
# Run complete workflow test
python tests/test_complete_workflow.py
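# Test individual agents (builds the LLM first; assumes GOOGLE_API_KEY is set)
python -c "from src.config import get_llm; from src.agents import test_analyzer; test_analyzer(get_llm(provider='google', model='gemini-2.0-flash'))"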
```
---
## 🛠️ Configuration
### Environment Variables
```bash
# Set your API keys
export GOOGLE_API_KEY="your-google-api-key"
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
```
### Advanced Settings
| Setting | Default | Description |
|---------|---------|-------------|
| Max Iterations | 20 | Maximum refinement cycles |
| Temperature | 0.0 | LLM temperature (0 = deterministic) |
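
These two settings and their defaults could be bundled into a small validated object. The following is only an illustrative sketch: `RunSettings` is a hypothetical name and the validation rules are assumptions, not something the project documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """Hypothetical container for the advanced settings in the table above."""
    max_iterations: int = 20   # maximum refinement cycles
    temperature: float = 0.0   # 0 = deterministic

    def __post_init__(self) -> None:
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")
        if self.temperature < 0:
            raise ValueError("temperature must not be negative")


defaults = RunSettings()
quick = RunSettings(max_iterations=5)
print(defaults, quick)
```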
---
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
---
## 📄 License
This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

---
## 🙏 Acknowledgments
- Thanks to the [DS-STAR](https://arxiv.org/abs/2509.21825) authors for inspiration
- Built with [LangGraph](https://langchain-ai.github.io/langgraph/) by LangChain
- UI powered by [Gradio](https://gradio.app/)
- Created for the [🤗 Hugging Face MCP 1st Birthday Hackathon](https://huggingface.co/)
---
**Made with ❤️ by [Anurag Deo](https://github.com/Anurag-Deo)**

⭐ Star this repo if you find it helpful!