title: DS-STAR
emoji: 
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: mit
short_description: Multi-Agent AI System for Automated Data Science Tasks
tags:
  - mcp-in-action-track-consumer
  - langgraph
  - multi-agent
  - data-science
  - automation
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/658be22d0ccb77b89a142f5a/t4RLYTRMF0_VHuQm5ZF91.png
social_media_post: https://x.com/AnuragDeo6/status/1995172899016380619

✨ DS-STAR

Data Science - Structured Task Analysis and Resolution

Hugging Face Spaces · License: MIT · Python 3.10+ · LangGraph

A powerful multi-agent AI system that automates data science tasks through intelligent collaboration.

🚀 Try Demo · 📖 Documentation · 🐛 Report Bug


DS-STAR Architecture

🎯 What is DS-STAR?

DS-STAR is a multi-agent AI system built with LangGraph that takes your natural language questions about data and automatically:

  1. 📊 Analyzes your data files to understand their structure
  2. 📝 Plans a step-by-step approach to answer your question
  3. 💻 Generates Python code to perform the analysis
  4. ✅ Verifies the solution meets your requirements
  5. 🔄 Iterates with smart backtracking if needed
  6. 🎯 Delivers polished, accurate results

Built for the 🤗 Hugging Face MCP 1st Birthday Hackathon


✨ Key Features

| Feature | Description |
|---------|-------------|
| 🤖 Multi-Agent Architecture | Six specialized agents working in harmony |
| 🔄 Iterative Refinement | Automatically improves solutions through multiple cycles |
| 🔙 Smart Backtracking | Intelligently reverts failed approaches |
| 📊 Auto Data Analysis | Understands your data structure automatically |
| 💻 Code Generation | Produces clean, executable Python code |
| 🌐 Multi-Provider Support | Works with Google, OpenAI, Anthropic, or custom APIs |
| 🎨 Modern UI | Beautiful dark-themed Gradio interface |

🏗️ Architecture

DS-STAR uses a sophisticated multi-agent workflow powered by LangGraph:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Analyzer  │────▶│   Planner   │────▶│    Coder    │
│  📊 Analyze │     │  📝 Plan    │     │  💻 Code    │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
                                               ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Finalyzer  │◀────│   Router    │◀────│  Verifier   │
│  🎯 Polish  │     │  🔀 Route   │     │  ✅ Verify  │
└─────────────┘     └─────────────┘     └─────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │  Backtrack  │
                    │  ↩️ Retry   │
                    └─────────────┘

Agent Roles

| Agent | Role | Description |
|-------|------|-------------|
| Analyzer | 📊 | Examines all data files and creates detailed descriptions |
| Planner | 📝 | Generates the next logical step in the solution |
| Coder | 💻 | Implements the plan as executable Python code |
| Verifier | ✅ | Validates if the solution answers the query |
| Router | 🔀 | Decides to continue, add steps, or backtrack |
| Finalyzer | 🎯 | Polishes and formats the final output |
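
The hand-off between these agents can be sketched in plain Python. This is a minimal illustration of the control flow in the diagram, not the project's LangGraph implementation: the `State` fields, the `run_workflow` function, and the router's return values (`"finalize"`, `"add_step"`, `"backtrack"`) are all hypothetical names chosen for this example, and each agent is passed in as a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class State:
    """Shared state handed from agent to agent (hypothetical field names)."""
    query: str
    data_description: str = ""
    plan: List[str] = field(default_factory=list)
    code: str = ""
    verified: bool = False
    iterations: int = 0


def run_workflow(
    state: State,
    analyze: Callable[[State], str],
    plan_next: Callable[[State], str],
    write_code: Callable[[State], str],
    verify: Callable[[State], bool],
    route: Callable[[State], str],
    finalize: Callable[[State], str],
    max_iterations: int = 20,
) -> str:
    # Analyzer runs once, before the plan/code/verify loop starts.
    state.data_description = analyze(state)

    while state.iterations < max_iterations:
        state.iterations += 1
        state.plan.append(plan_next(state))   # Planner adds one step
        state.code = write_code(state)        # Coder implements the plan so far
        state.verified = verify(state)        # Verifier checks the result

        decision = route(state)               # Router picks the next move
        if decision == "finalize":
            break
        if decision == "backtrack" and state.plan:
            state.plan.pop()                  # Drop the most recent step
        # "add_step": keep the plan and let the Planner extend it

    # Finalyzer formats whatever the loop produced.
    return finalize(state)
```

Passing the agents as callables keeps the loop testable with stubs; in the real project each callable would wrap an LLM call.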

🚀 Quick Start

Online Demo

Try DS-STAR instantly on Hugging Face Spaces:

👉 Launch DS-STAR Demo

Local Installation

# Clone the repository
git clone https://github.com/Anurag-Deo/DS-STAR.git
cd DS-STAR

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

Then open http://localhost:7860 in your browser.


💡 Usage

Web Interface

  1. Select Provider — Choose Google, OpenAI, Anthropic, or Custom
  2. Enter API Key — Or set via environment variable
  3. Upload Data — Drop your CSV, JSON, Excel, or Parquet files
  4. Ask Questions — Type your data science question
  5. Run Analysis — Click "Run Analysis" and watch the magic!
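
Before any planning happens, the uploaded files have to be summarized so the agents know what columns and rows they are working with. The sketch below shows one way such a summary could be built using only the standard library. It is an assumption for illustration, not the project's Analyzer: `describe_file` is a hypothetical helper, it handles CSV and JSON only, and Excel or Parquet files would need a third-party reader such as pandas.

```python
import csv
import json
from pathlib import Path


def describe_file(path: str, sample_rows: int = 3) -> dict:
    """Return a small summary of a CSV or JSON file (hypothetical helper)."""
    p = Path(path)
    suffix = p.suffix.lower()

    if suffix == ".csv":
        with p.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            rows = list(reader)
            columns = list(reader.fieldnames or [])
    elif suffix == ".json":
        with p.open(encoding="utf-8") as f:
            data = json.load(f)
        # A top-level list of objects is treated as rows; anything else is one row.
        rows = data if isinstance(data, list) else [data]
        columns = sorted({k for r in rows if isinstance(r, dict) for k in r})
    else:
        raise ValueError(f"Unsupported file type in this sketch: {suffix}")

    return {
        "file": p.name,
        "columns": columns,
        "row_count": len(rows),
        "sample": rows[:sample_rows],
    }
```

A summary like this is short enough to paste into a prompt, which is why it keeps only a few sample rows rather than the full file.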

Example Queries

📊 "What percentage of transactions use credit cards?"
📈 "Show me the distribution of transaction amounts"
🏆 "Which category has the highest total sales?"
🔗 "Find correlations between numeric columns"
📋 "Create a summary statistics report"

Python API

from src.graph import run_ds_star
from src.config import get_llm

# Initialize LLM
llm = get_llm(provider="google", model="gemini-2.0-flash")

# Run DS-STAR
result = run_ds_star(
    query="What is the average transaction amount?",
    llm=llm,
    max_iterations=20
)

🔌 Supported Providers

| Provider | Models | Environment Variable |
|----------|--------|----------------------|
| Google | Gemini 2.0, 1.5 Pro, 1.5 Flash | GOOGLE_API_KEY |
| OpenAI | GPT-4o, GPT-4, GPT-3.5 | OPENAI_API_KEY |
| Anthropic | Claude 3.5, Claude 3 | ANTHROPIC_API_KEY |
| Custom | Any OpenAI-compatible API | Custom Base URL |
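
The table implies a simple lookup: an explicitly entered key wins, otherwise the provider's environment variable is read. The following sketch shows that lookup under stated assumptions. `resolve_api_key` and `ENV_VARS` are hypothetical names for this example and are not taken from the project's `llm_config.py`; the custom provider has no environment variable in the table, so this sketch requires an explicit key for it.

```python
import os
from typing import Mapping, Optional

# Environment variable per provider, as listed in the table above.
ENV_VARS = {
    "google": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_api_key(
    provider: str,
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the API key for a provider (hypothetical helper)."""
    env = os.environ if env is None else env

    # A key typed into the UI or passed by the caller takes priority.
    if explicit_key:
        return explicit_key

    var_name = ENV_VARS.get(provider.lower())
    if var_name is None:
        raise ValueError(f"No environment variable known for provider {provider!r}")

    key = env.get(var_name)
    if not key:
        raise RuntimeError(f"Set {var_name} or pass an API key explicitly")
    return key
```

Accepting `env` as a parameter makes the function easy to test without touching the real process environment.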

📁 Project Structure

DS-STAR/
├── 📱 app.py                 # Gradio web application
├── 📜 main.py                # CLI entry point
├── 📋 requirements.txt       # Dependencies
├── 📂 src/
│   ├── 🤖 agents/            # Agent implementations
│   │   ├── analyzer_agent.py
│   │   ├── planner_agent.py
│   │   ├── coder_agent.py
│   │   ├── verifier_agent.py
│   │   ├── router_agent.py
│   │   └── finalyzer_agent.py
│   ├── 🔧 utils/             # Shared utilities
│   │   ├── state.py          # State schema
│   │   ├── formatters.py     # Text formatting
│   │   └── code_execution.py # Safe code execution
│   ├── ⚙️ config/            # Configuration
│   │   └── llm_config.py     # LLM setup
│   └── 🔄 graph.py           # LangGraph workflow
├── 🧪 tests/                 # Test suite
└── 📊 data/                  # Sample data files
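
The tree lists `code_execution.py` as the place where generated code is run safely. How that module works is not shown here, so the following is only a sketch of one common approach: write the generated script to a temporary directory and run it in a separate Python process with a timeout. `run_generated_code` and its return format are assumptions made for this example. A subprocess contains crashes and hangs, but it is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 30.0) -> dict:
    """Run a code string in a child Python process (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,            # Relative paths resolve inside the temp dir
                capture_output=True,    # Collect stdout and stderr for the Verifier
                text=True,
                timeout=timeout_s,      # The child is killed if it runs too long
            )
        except subprocess.TimeoutExpired:
            return {
                "ok": False,
                "stdout": "",
                "stderr": f"timed out after {timeout_s}s",
            }

    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Returning stderr alongside stdout gives a verification step the traceback it needs to decide whether to retry or backtrack.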

🧪 Testing

# Run complete workflow test
python tests/test_complete_workflow.py

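# Test an individual agent (the test needs an LLM instance, so GOOGLE_API_KEY must be set)
python -c "from src.config import get_llm; from src.agents import test_analyzer; test_analyzer(get_llm(provider='google', model='gemini-2.0-flash'))"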

🛠️ Configuration

Environment Variables

# Set your API keys
export GOOGLE_API_KEY="your-google-api-key"
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Advanced Settings

| Setting | Default | Description |
|---------|---------|-------------|
| Max Iterations | 20 | Maximum refinement cycles |
| Temperature | 0.0 | LLM sampling temperature (0 = least random output) |
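
For scripted runs, the two settings can be carried in one small validated object. This is a sketch only: `RunSettings` is a hypothetical name, and the defaults are copied from the table above rather than from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """The two advanced settings with their documented defaults (hypothetical class)."""
    max_iterations: int = 20
    temperature: float = 0.0

    def __post_init__(self) -> None:
        # Reject values that could never produce a useful run.
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")
        if self.temperature < 0:
            raise ValueError("temperature must not be negative")
```

A frozen dataclass keeps the settings immutable for the duration of a run, so no agent can change them midway.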

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments


Made with ❤️ by Anurag Deo

⭐ Star this repo if you find it helpful!