anurag-deo committed
Commit 8ff817c · verified · 1 Parent(s): af1dd05

Upload folder using huggingface_hub

.env.example ADDED
@@ -0,0 +1,6 @@
+ LLM_PROVIDER=
+ LLM_BASE_URL=
+ LLM_MODEL=
+ LLM_API_KEY=
+ GOOGLE_API_KEY=
+ ANTHROPIC_API_KEY=""
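The diff does not show how these variables are consumed. As a rough sketch only, they could be read along the following lines. `LLMSettings`, `load_llm_settings`, the default provider of `google`, and the rule that `LLM_API_KEY` takes precedence over a provider-specific key are all assumptions made for illustration, not behavior confirmed by this commit.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the values named in .env.example."""

    provider: str
    base_url: Optional[str]
    model: Optional[str]
    api_key: Optional[str]


# Assumed mapping from provider name to its dedicated key variable.
PROVIDER_KEY_VARS = {"google": "GOOGLE_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def load_llm_settings(env: Optional[Mapping[str, str]] = None) -> LLMSettings:
    """Read the LLM_* variables; empty strings are treated as unset."""
    env = os.environ if env is None else env
    # Assumption: fall back to "google" when LLM_PROVIDER is empty or missing.
    provider = (env.get("LLM_PROVIDER") or "google").strip().lower()
    key_var = PROVIDER_KEY_VARS.get(provider)
    # Assumption: the generic key wins over the provider-specific one.
    api_key = env.get("LLM_API_KEY") or (env.get(key_var) if key_var else None)
    return LLMSettings(
        provider=provider,
        base_url=env.get("LLM_BASE_URL") or None,
        model=env.get("LLM_MODEL") or None,
        api_key=api_key or None,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise without touching the real environment.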
.github/workflows/update_space.yml ADDED
@@ -0,0 +1,28 @@
+ name: Run Python script
+
+ on:
+   push:
+     branches:
+       - main
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+
+     steps:
+       - name: Checkout
+         uses: actions/checkout@v2
+
+       - name: Set up Python
+         uses: actions/setup-python@v2
+         with:
+           python-version: '3.9'
+
+       - name: Install Gradio
+         run: python -m pip install gradio
+
+       - name: Log in to Hugging Face
+         run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'
+
+       - name: Deploy to Spaces
+         run: gradio deploy
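The login step above interpolates the secret directly into the command string. One possible alternative, sketched here under assumptions, is to expose the secret to the step as an environment variable (for example through a step-level `env:` mapping) and read it in Python. The variable name `HF_TOKEN` and the helpers `read_hf_token` and `login_from_env` are hypothetical; only `huggingface_hub.login(token=...)` comes from the workflow itself.

```python
import os
from typing import Mapping, Optional


def read_hf_token(env: Optional[Mapping[str, str]] = None, var_name: str = "HF_TOKEN") -> str:
    """Return the token from the environment, failing loudly when it is absent."""
    env = os.environ if env is None else env
    token = (env.get(var_name) or "").strip()
    if not token:
        raise RuntimeError(f"{var_name} is not set; cannot log in to Hugging Face")
    return token


def login_from_env() -> None:
    """Log in with the token found in the environment (hypothetical helper)."""
    # Imported lazily so the module loads even where huggingface_hub is missing.
    import huggingface_hub

    huggingface_hub.login(token=read_hf_token())
```

Failing early with a clear message makes a missing secret easier to diagnose than an authentication error later in the deploy step.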
.gitignore ADDED
@@ -0,0 +1,13 @@
+ # Python-generated files
+ __pycache__/
+ *.py[oc]
+ build/
+ dist/
+ wheels/
+ *.egg-info
+
+ # Virtual environments
+ .venv
+
+ # .env files
+ .env
.python-version ADDED
@@ -0,0 +1 @@
+ 3.13
README.md CHANGED
@@ -1,12 +1,277 @@
- ---
- title: DS STAR
- emoji: 🌍
- colorFrom: indigo
- colorTo: purple
- sdk: gradio
- sdk_version: 6.0.1
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: DS-STAR
+ emoji:
+ colorFrom: purple
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 6.0.1
+ app_file: app.py
+ pinned: false
+ license: mit
+ short_description: Multi-Agent AI System for Automated Data Science Tasks
+ tags:
+   - mcp-in-action-track-consumer
+   - langgraph
+   - multi-agent
+   - data-science
+   - automation
+ ---
+
+ <div align="center">
+
+ # ✨ DS-STAR
+
+ ### **D**ata **S**cience - **S**tructured **T**ask **A**nalysis and **R**esolution
+
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Anurag-Deo/DS-STAR)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
+ [![LangGraph](https://img.shields.io/badge/Built%20with-LangGraph-orange)](https://langchain-ai.github.io/langgraph/)
+
+ **A powerful multi-agent AI system that automates data science tasks through intelligent collaboration.**
+
+ [🚀 Try Demo](https://huggingface.co/spaces/Anurag-Deo/DS-STAR) • [📖 Documentation](#-usage) • [🐛 Report Bug](https://github.com/Anurag-Deo/DS-STAR/issues)
+
+ </div>
+
+ ---
+
+ ## 🎯 What is DS-STAR?
+
+ DS-STAR is a **multi-agent AI system** built with LangGraph that takes your natural language questions about data and automatically:
+
+ 1. 📊 **Analyzes** your data files to understand their structure
+ 2. 📝 **Plans** a step-by-step approach to answer your question
+ 3. 💻 **Generates** Python code to perform the analysis
+ 4. ✅ **Verifies** the solution meets your requirements
+ 5. 🔄 **Iterates** with smart backtracking if needed
+ 6. 🎯 **Delivers** polished, accurate results
+
+ > **Built for the 🤗 Hugging Face MCP 1st Birthday Hackathon**
+
+ ---
+
+ ## ✨ Key Features
+
+ | Feature | Description |
+ |---------|-------------|
+ | 🤖 **Multi-Agent Architecture** | Six specialized agents working in harmony |
+ | 🔄 **Iterative Refinement** | Automatically improves solutions through multiple cycles |
+ | 🔙 **Smart Backtracking** | Intelligently reverts failed approaches |
+ | 📊 **Auto Data Analysis** | Understands your data structure automatically |
+ | 💻 **Code Generation** | Produces clean, executable Python code |
+ | 🌐 **Multi-Provider Support** | Works with Google, OpenAI, Anthropic, or custom APIs |
+ | 🎨 **Modern UI** | Beautiful dark-themed Gradio interface |
+
+ ---
+
+ ## 🏗️ Architecture
+
+ DS-STAR uses a sophisticated multi-agent workflow powered by LangGraph:
+
+ ```
+ ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
+ │  Analyzer   │────▶│   Planner   │────▶│    Coder    │
+ │ 📊 Analyze  │     │ 📝 Plan     │     │ 💻 Code     │
+ └─────────────┘     └─────────────┘     └─────────────┘
+
+
+ ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
+ │  Finalyzer  │◀────│   Router    │◀────│  Verifier   │
+ │ 🎯 Polish   │     │ 🔀 Route    │     │ ✅ Verify   │
+ └─────────────┘     └─────────────┘     └─────────────┘
+
+
+                     ┌─────────────┐
+                     │  Backtrack  │
+                     │ ↩️ Retry    │
+                     └─────────────┘
+ ```
+
+ ### Agent Roles
+
+ | Agent | Role | Description |
+ |-------|------|-------------|
+ | **Analyzer** | 📊 | Examines all data files and creates detailed descriptions |
+ | **Planner** | 📝 | Generates the next logical step in the solution |
+ | **Coder** | 💻 | Implements the plan as executable Python code |
+ | **Verifier** | ✅ | Validates if the solution answers the query |
+ | **Router** | 🔀 | Decides to continue, add steps, or backtrack |
+ | **Finalyzer** | 🎯 | Polishes and formats the final output |
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### Online Demo
+
+ Try DS-STAR instantly on Hugging Face Spaces:
+
+ 👉 **[Launch DS-STAR Demo](https://huggingface.co/spaces/Anurag-Deo/DS-STAR)**
+
+ ### Local Installation
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/Anurag-Deo/DS-STAR.git
+ cd DS-STAR
+
+ # Create virtual environment
+ python -m venv .venv
+ source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the application
+ python app.py
+ ```
+
+ Then open http://localhost:7860 in your browser.
+
+ ---
+
+ ## 💡 Usage
+
+ ### Web Interface
+
+ 1. **Select Provider** — Choose Google, OpenAI, Anthropic, or Custom
+ 2. **Enter API Key** — Or set via environment variable
+ 3. **Upload Data** — Drop your CSV, JSON, Excel, or Parquet files
+ 4. **Ask Questions** — Type your data science question
+ 5. **Run Analysis** — Click "Run Analysis" and watch the magic!
+
+ ### Example Queries
+
+ ```
+ 📊 "What percentage of transactions use credit cards?"
+ 📈 "Show me the distribution of transaction amounts"
+ 🏆 "Which category has the highest total sales?"
+ 🔗 "Find correlations between numeric columns"
+ 📋 "Create a summary statistics report"
+ ```
+
+ ### Python API
+
+ ```python
+ from src.graph import run_ds_star
+ from src.config import get_llm
+
+ # Initialize LLM
+ llm = get_llm(provider="google", model="gemini-2.0-flash")
+
+ # Run DS-STAR
+ result = run_ds_star(
+     query="What is the average transaction amount?",
+     llm=llm,
+     max_iterations=20
+ )
+ ```
+
+ ---
+
+ ## 🔌 Supported Providers
+
+ | Provider | Models | Environment Variable |
+ |----------|--------|---------------------|
+ | **Google** | Gemini 2.0, 1.5 Pro, 1.5 Flash | `GOOGLE_API_KEY` |
+ | **OpenAI** | GPT-4o, GPT-4, GPT-3.5 | `OPENAI_API_KEY` |
+ | **Anthropic** | Claude 3.5, Claude 3 | `ANTHROPIC_API_KEY` |
+ | **Custom** | Any OpenAI-compatible API | Custom Base URL |
+
+ ---
+
+ ## 📁 Project Structure
+
+ ```
+ DS-STAR/
+ ├── 📱 app.py                    # Gradio web application
+ ├── 📜 main.py                   # CLI entry point
+ ├── 📋 requirements.txt          # Dependencies
+ ├── 📂 src/
+ │   ├── 🤖 agents/               # Agent implementations
+ │   │   ├── analyzer_agent.py
+ │   │   ├── planner_agent.py
+ │   │   ├── coder_agent.py
+ │   │   ├── verifier_agent.py
+ │   │   ├── router_agent.py
+ │   │   └── finalyzer_agent.py
+ │   ├── 🔧 utils/                # Shared utilities
+ │   │   ├── state.py             # State schema
+ │   │   ├── formatters.py        # Text formatting
+ │   │   └── code_execution.py    # Safe code execution
+ │   ├── ⚙️ config/               # Configuration
+ │   │   └── llm_config.py        # LLM setup
+ │   └── 🔄 graph.py              # LangGraph workflow
+ ├── 🧪 tests/                    # Test suite
+ └── 📊 data/                     # Sample data files
+ ```
+
+ ---
+
+ ## 🧪 Testing
+
+ ```bash
+ # Run complete workflow test
+ python tests/test_complete_workflow.py
+
+ # Test individual agents
+ python -c "from src.config import get_llm; from src.agents import test_analyzer; test_analyzer(get_llm(provider='google', model='gemini-2.0-flash'))"
+ ```
+
+ ---
+
+ ## 🛠️ Configuration
+
+ ### Environment Variables
+
+ ```bash
+ # Set your API keys
+ export GOOGLE_API_KEY="your-google-api-key"
+ export OPENAI_API_KEY="your-openai-api-key"
+ export ANTHROPIC_API_KEY="your-anthropic-api-key"
+ ```
+
+ ### Advanced Settings
+
+ | Setting | Default | Description |
+ |---------|---------|-------------|
+ | Max Iterations | 20 | Maximum refinement cycles |
+ | Temperature | 0.0 | LLM temperature (0 = deterministic) |
+
+ ---
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request.
+
+ 1. Fork the repository
+ 2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
+ 3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
+ 4. Push to the branch (`git push origin feature/AmazingFeature`)
+ 5. Open a Pull Request
+
+ ---
+
+ ## 📄 License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ---
+
+ ## 🙏 Acknowledgments
+
+ - Built with [LangGraph](https://langchain-ai.github.io/langgraph/) by LangChain
+ - UI powered by [Gradio](https://gradio.app/)
+ - Created for the [🤗 Hugging Face MCP 1st Birthday Hackathon](https://huggingface.co/)
+
+ ---
+
+ <div align="center">
+
+ **Made with ❤️ by [Anurag Deo](https://github.com/Anurag-Deo)**
+
+ ⭐ Star this repo if you find it helpful!
+
+ </div>
+
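The README describes a plan, verify, and route cycle with backtracking but does not show the control flow. The toy loop below is a plain-Python sketch of that idea under stated assumptions; it is not the project's LangGraph graph. `run_loop`, its callback parameters, and the `"backtrack"` / `"add_step"` route labels are hypothetical names chosen for illustration; only the default of 20 iterations is taken from the README.

```python
from typing import Callable, List


def run_loop(
    plan_step: Callable[[List[str]], str],
    verify: Callable[[List[str]], bool],
    route: Callable[[List[str]], str],
    max_iterations: int = 20,
) -> List[str]:
    """Plan, verify, and route until verified or the iteration budget runs out."""
    steps: List[str] = []
    for _ in range(max_iterations):
        steps.append(plan_step(steps))  # Planner proposes the next step
        if verify(steps):  # Verifier accepts: hand off for final polishing
            break
        if route(steps) == "backtrack":
            steps.pop()  # Router rejects the newest step; try again
        # Any other route label keeps the step and plans another one.
    return steps


# Stub agents: the first proposal is bad, the next two solve the task.
proposals = iter(["bad", "load", "aggregate"])
result = run_loop(
    plan_step=lambda steps: next(proposals),
    verify=lambda steps: steps == ["load", "aggregate"],
    route=lambda steps: "backtrack" if steps[-1] == "bad" else "add_step",
)
print(result)  # ['load', 'aggregate']
```

Keeping the agents as plain callables makes the loop easy to test with stubs before any LLM is involved.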
app.py ADDED
@@ -0,0 +1,1406 @@
+ """
+ DS-STAR Gradio Application
+ A modern web interface for the DS-STAR Multi-Agent Data Science System.
+
+ Created for the Hugging Face MCP 1st Birthday Hackathon.
+ """
+
+ import os
+ import shutil
+ from typing import Generator
+
+ import gradio as gr
+ import httpx
+
+ from src.config import get_llm
+ from src.graph import build_ds_star_graph, create_initial_state
+
+ # ==================== MODEL FETCHING ====================
+
+ # Store fetch status for UI feedback
+ _last_fetch_status = {"success": False, "message": "", "from_api": False}
+
+
+ def fetch_google_models(api_key: str | None = None) -> tuple[list[str], str]:
+     """Fetch available models from Google Gemini API. Returns (models, status_message)."""
+     api_key = api_key or os.getenv("GOOGLE_API_KEY", "")
+     fallback = [
+         "gemini-2.0-flash",
+         "gemini-1.5-pro",
+         "gemini-1.5-flash",
+         "gemini-1.0-pro",
+     ]
+
+     if not api_key:
+         return fallback, "⚠️ No API key - showing default models"
+
+     try:
+         url = f"https://generativelanguage.googleapis.com/v1beta/models?key={api_key}"
+         response = httpx.get(url, timeout=15)
+
+         if response.status_code == 200:
+             data = response.json()
+             models = []
+             for model in data.get("models", []):
+                 name = model.get("name", "").replace("models/", "")
+                 # Filter for chat/generate models
+                 if "generateContent" in model.get("supportedGenerationMethods", []):
+                     models.append(name)
+             if models:
+                 return sorted(
+                     models, reverse=True
+                 ), f"✅ Fetched {len(models)} models from API"
+             return fallback, "⚠️ No compatible models found - showing defaults"
+         elif response.status_code == 400:
+             return fallback, "❌ Invalid API key format"
+         elif response.status_code == 403:
+             return fallback, "❌ API key invalid or expired"
+         else:
+             return fallback, f"❌ API error: {response.status_code}"
+     except httpx.TimeoutException:
+         return fallback, "❌ Request timed out"
+     except httpx.ConnectError:
+         return fallback, "❌ Connection failed - check internet"
+     except Exception as e:
+         return fallback, f"❌ Error: {str(e)[:50]}"
+
+
+ def fetch_openai_models(
+     api_key: str | None = None, base_url: str | None = None
+ ) -> tuple[list[str], str]:
+     """Fetch available models from OpenAI API or compatible endpoint. Returns (models, status_message)."""
+     api_key = api_key or os.getenv("OPENAI_API_KEY", "")
+     base_url = base_url or "https://api.openai.com/v1"
+     fallback = ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-4", "gpt-3.5-turbo"]
+
+     if not api_key:
+         return fallback, "⚠️ No API key - showing default models"
+
+     try:
+         headers = {"Authorization": f"Bearer {api_key}"}
+         endpoint = f"{base_url.rstrip('/')}/models"
+         response = httpx.get(endpoint, headers=headers, timeout=15)
+
+         if response.status_code == 200:
+             data = response.json()
+             models = []
+             for model in data.get("data", []):
+                 model_id = model.get("id", "")
+                 # For OpenAI, filter chat models; for custom endpoints, include all
+                 if base_url != "https://api.openai.com/v1":
+                     models.append(model_id)
+                 elif model_id.startswith(("gpt-", "o1", "o3", "o4", "chatgpt")):
+                     models.append(model_id)
+             if models:
+                 return sorted(
+                     models, reverse=True
+                 ), f"✅ Fetched {len(models)} models from API"
+             return fallback, "⚠️ No chat models found - showing defaults"
+         elif response.status_code == 401:
+             return fallback, "❌ Invalid API key"
+         elif response.status_code == 403:
+             return fallback, "❌ Access denied"
+         else:
+             return fallback, f"❌ API error: {response.status_code}"
+     except httpx.TimeoutException:
+         return fallback, "❌ Request timed out"
+     except httpx.ConnectError:
+         return fallback, "❌ Connection failed - check URL/internet"
+     except Exception as e:
+         return fallback, f"❌ Error: {str(e)[:50]}"
+
+
+ def fetch_anthropic_models(api_key: str | None = None) -> tuple[list[str], str]:
+     """Fetch available models from Anthropic API. Returns (models, status_message)."""
+     api_key = api_key or os.getenv("ANTHROPIC_API_KEY", "")
+     fallback = [
+         "claude-sonnet-4-20250514",
+         "claude-3-5-sonnet-20241022",
+         "claude-3-5-haiku-20241022",
+         "claude-3-opus-20240229",
+     ]
+
+     if not api_key:
+         return fallback, "⚠️ No API key - showing default models"
+
+     try:
+         headers = {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
+         response = httpx.get(
+             "https://api.anthropic.com/v1/models", headers=headers, timeout=15
+         )
+
+         if response.status_code == 200:
+             data = response.json()
+             models = [
+                 model.get("id", "") for model in data.get("data", []) if model.get("id")
+             ]
+             if models:
+                 return sorted(
+                     models, reverse=True
+                 ), f"✅ Fetched {len(models)} models from API"
+             return fallback, "⚠️ No models found - showing defaults"
+         elif response.status_code == 401:
+             return fallback, "❌ Invalid API key"
+         elif response.status_code == 403:
+             return fallback, "❌ Access denied"
+         else:
+             # Anthropic may not have a public models endpoint, use fallback
+             return fallback, "ℹ️ Using known Anthropic models"
+     except httpx.TimeoutException:
+         return fallback, "❌ Request timed out"
+     except httpx.ConnectError:
+         return fallback, "❌ Connection failed"
+     except Exception as e:
+         return fallback, f"❌ Error: {str(e)[:50]}"
+
+
+ def fetch_models_for_provider(
+     provider: str, api_key: str | None = None, base_url: str | None = None
+ ) -> tuple[list[str], str]:
+     """Fetch models for the given provider. Returns (models, status_message)."""
+     if provider == "google":
+         return fetch_google_models(api_key)
+     elif provider == "openai":
+         return fetch_openai_models(api_key)
+     elif provider == "anthropic":
+         return fetch_anthropic_models(api_key)
+     elif provider == "custom":
+         if not base_url:
+             return [], "❌ Please enter a Base URL for custom provider"
+         return fetch_openai_models(api_key, base_url)
+     return [], "❌ Unknown provider"
+
+
+ # ==================== CUSTOM THEME ====================
+
+
+ def create_ds_star_theme():
+     """Create a modern dark theme for DS-STAR."""
+     return gr.themes.Base(
+         primary_hue=gr.themes.colors.violet,
+         secondary_hue=gr.themes.colors.purple,
+         neutral_hue=gr.themes.colors.slate,
+         font=[
+             gr.themes.GoogleFont("Inter"),
+             "ui-sans-serif",
+             "system-ui",
+             "sans-serif",
+         ],
+         font_mono=[gr.themes.GoogleFont("JetBrains Mono"), "ui-monospace", "monospace"],
+     ).set(
+         # Body - Dark background
+         body_background_fill="#0a0a0f",
+         body_background_fill_dark="#0a0a0f",
+         body_text_color="#e4e4e7",
+         body_text_color_dark="#e4e4e7",
+         # Buttons
+         button_primary_background_fill="linear-gradient(135deg, #7c3aed 0%, #8b5cf6 100%)",
+         button_primary_background_fill_hover="linear-gradient(135deg, #6d28d9 0%, #7c3aed 100%)",
+         button_primary_text_color="white",
+         button_primary_border_color="transparent",
+         button_secondary_background_fill="transparent",
+         button_secondary_background_fill_hover="rgba(139, 92, 246, 0.15)",
+         button_secondary_border_color="#7c3aed",
+         button_secondary_text_color="#a78bfa",
+         # Blocks
+         block_background_fill="#18181b",
+         block_background_fill_dark="#18181b",
+         block_border_width="1px",
+         block_border_color="#27272a",
+         block_border_color_dark="#27272a",
+         block_shadow="none",
+         block_title_text_weight="600",
+         block_title_text_size="*text_md",
+         block_label_text_weight="500",
+         block_label_text_size="*text_sm",
+         block_radius="12px",
+         block_padding="16px",
+         # Inputs
+         input_background_fill="#27272a",
+         input_background_fill_dark="#27272a",
+         input_border_color="#3f3f46",
+         input_border_color_dark="#3f3f46",
+         input_border_width="1px",
+         input_shadow="none",
+         input_radius="8px",
+         # Panels
+         panel_background_fill="#18181b",
+         panel_background_fill_dark="#18181b",
+         panel_border_width="0px",
+         # Spacing
+         layout_gap="16px",
+         # Shadows
+         shadow_drop="none",
+         shadow_drop_lg="none",
+         # Checkbox
+         checkbox_background_color="#27272a",
+         checkbox_background_color_dark="#27272a",
+         checkbox_border_color="#3f3f46",
+         checkbox_border_color_dark="#3f3f46",
+         checkbox_label_text_color="#a1a1aa",
+         checkbox_label_text_color_dark="#a1a1aa",
+         # Slider
+         slider_color="#7c3aed",
+         slider_color_dark="#8b5cf6",
+     )
+
+
+ # ==================== CSS STYLING ====================
+
+ CUSTOM_CSS = """
+ /* Modern Font Import */
+ @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&display=swap');
+ @import url('https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;600&display=swap');
+
+ /* Root variables */
+ :root {
+     --bg-primary: #0a0a0f;
+     --bg-secondary: #18181b;
+     --bg-tertiary: #27272a;
+     --border-color: #3f3f46;
+     --text-primary: #fafafa;
+     --text-secondary: #a1a1aa;
+     --text-muted: #71717a;
+     --accent-primary: #8b5cf6;
+     --accent-secondary: #7c3aed;
+     --accent-glow: rgba(139, 92, 246, 0.3);
+     --success: #22c55e;
+     --error: #ef4444;
+ }
+
+ /* Main container - Dark background */
+ .gradio-container {
+     max-width: 1400px !important;
+     margin: 0 auto !important;
+     padding: 32px !important;
+     font-family: 'Inter', sans-serif !important;
+     background: var(--bg-primary) !important;
+     min-height: 100vh;
+ }
+
+ /* Remove all default shadows and borders for cleaner look */
+ .gradio-container * {
+     box-shadow: none !important;
+ }
+
+ /* ===== HEADER ===== */
+ .header-section {
+     background: linear-gradient(135deg, #1e1b4b 0%, #312e81 50%, #3730a3 100%);
+     border-radius: 16px;
+     padding: 40px 32px;
+     margin-bottom: 24px;
+     border: 1px solid #4338ca;
+     position: relative;
+     overflow: hidden;
+ }
+
+ .header-section::before {
+     content: '';
+     position: absolute;
+     top: 0;
+     right: 0;
+     width: 40%;
+     height: 100%;
+     background: radial-gradient(ellipse at top right, rgba(139, 92, 246, 0.3), transparent 70%);
+ }
+
+ .header-content {
+     position: relative;
+     z-index: 1;
+     text-align: center;
+ }
+
+ .header-title {
+     font-size: 2.75rem;
+     font-weight: 800;
+     color: #fff;
+     margin: 0 0 8px 0;
+     letter-spacing: -0.02em;
+     display: flex;
+     align-items: center;
+     justify-content: center;
+     gap: 12px;
+ }
+
+ .header-title .star-icon {
+     font-size: 2.2rem;
+ }
+
+ .header-subtitle {
+     font-size: 1.1rem;
+     color: rgba(255, 255, 255, 0.7);
+     margin: 0;
+     font-weight: 400;
+ }
+
+ .header-badges {
+     display: flex;
+     justify-content: center;
+     gap: 10px;
+     margin-top: 20px;
+     flex-wrap: wrap;
+ }
+
+ .header-badge {
+     padding: 6px 14px;
+     background: rgba(255, 255, 255, 0.1);
+     border: 1px solid rgba(255, 255, 255, 0.15);
+     border-radius: 20px;
+     font-size: 0.8rem;
+     font-weight: 500;
+     color: rgba(255, 255, 255, 0.9);
+ }
+
+ /* ===== CARDS & GROUPS ===== */
+ .dark-card {
+     background: var(--bg-secondary) !important;
+     border: 1px solid var(--border-color) !important;
+     border-radius: 12px !important;
+     padding: 20px !important;
+     margin-bottom: 16px !important;
+ }
+
+ /* Group styling */
+ .gr-group {
+     background: var(--bg-secondary) !important;
+     border: 1px solid var(--border-color) !important;
+     border-radius: 12px !important;
+     padding: 20px !important;
+ }
+
+ /* ===== SECTION HEADERS ===== */
+ .section-header {
+     display: flex;
+     align-items: center;
+     gap: 10px;
+     margin-bottom: 16px;
+     padding-bottom: 12px;
+     border-bottom: 1px solid var(--border-color);
+ }
+
+ .section-icon {
+     width: 32px;
+     height: 32px;
+     display: flex;
+     align-items: center;
+     justify-content: center;
+     background: linear-gradient(135deg, var(--accent-secondary), var(--accent-primary));
+     border-radius: 8px;
+     font-size: 1rem;
+ }
+
+ .section-title {
+     font-size: 1.1rem;
+     font-weight: 600;
+     color: var(--text-primary);
+     margin: 0;
+ }
+
+ /* ===== ACCORDIONS ===== */
+ .gr-accordion {
+     border: 1px solid var(--border-color) !important;
+     border-radius: 10px !important;
+     overflow: hidden !important;
+     margin-bottom: 12px !important;
+     background: var(--bg-secondary) !important;
+ }
+
+ .gr-accordion > .label-wrap {
+     background: var(--bg-tertiary) !important;
+     padding: 12px 16px !important;
+     cursor: pointer !important;
+     border: none !important;
+ }
+
+ .gr-accordion > .label-wrap:hover {
+     background: #323238 !important;
+ }
+
+ .gr-accordion > .label-wrap span {
+     font-weight: 600 !important;
+     font-size: 0.95rem !important;
+     color: var(--text-primary) !important;
+ }
+
+ .gr-accordion > div:last-child {
+     padding: 16px !important;
+     background: var(--bg-secondary) !important;
+ }
+
+ /* ===== BUTTONS ===== */
+ button.primary, .gr-button-primary {
+     background: linear-gradient(135deg, var(--accent-secondary), var(--accent-primary)) !important;
+     border: none !important;
+     border-radius: 8px !important;
+     padding: 10px 20px !important;
+     font-weight: 600 !important;
+     font-size: 0.9rem !important;
+     color: white !important;
+     cursor: pointer !important;
+     transition: all 0.2s ease !important;
+ }
+
+ button.primary:hover, .gr-button-primary:hover {
+     opacity: 0.9 !important;
+     transform: translateY(-1px) !important;
+ }
+
+ button.secondary, .gr-button-secondary {
+     background: transparent !important;
+     border: 1px solid var(--accent-primary) !important;
+     border-radius: 8px !important;
+     color: var(--accent-primary) !important;
+     font-weight: 500 !important;
+     padding: 10px 20px !important;
+     transition: all 0.2s ease !important;
+ }
+
+ button.secondary:hover, .gr-button-secondary:hover {
+     background: rgba(139, 92, 246, 0.1) !important;
+ }
+
+ /* Refresh button */
+ .refresh-btn {
+     min-width: 40px !important;
+     width: 40px !important;
+     height: 40px !important;
+     padding: 0 !important;
+     display: flex !important;
+     align-items: center !important;
+     justify-content: center !important;
+     border-radius: 8px !important;
+     background: var(--bg-tertiary) !important;
+     border: 1px solid var(--border-color) !important;
+     font-size: 1rem !important;
+     color: var(--text-secondary) !important;
+     transition: all 0.2s ease !important;
+ }
+
+ .refresh-btn:hover {
+     background: var(--accent-primary) !important;
+     border-color: var(--accent-primary) !important;
+     color: white !important;
+ }
+
+ /* ===== INPUTS ===== */
+ input, textarea, select, .gr-input, .gr-text-input {
+     background: var(--bg-tertiary) !important;
+     border: 1px solid var(--border-color) !important;
+     border-radius: 8px !important;
+     color: var(--text-primary) !important;
+     padding: 10px 14px !important;
+     font-size: 0.9rem !important;
+     transition: border-color 0.2s ease !important;
+ }
+
+ input:focus, textarea:focus, select:focus {
+     border-color: var(--accent-primary) !important;
+     outline: none !important;
+ }
+
+ input::placeholder, textarea::placeholder {
+     color: var(--text-muted) !important;
+ }
+
+ /* Dropdown */
+ .gr-dropdown {
+     background: var(--bg-tertiary) !important;
+ }
+
+ /* ===== TABS ===== */
+ .gr-tabs {
+     background: transparent !important;
+ }
+
+ .gr-tab-nav {
+     background: var(--bg-secondary) !important;
+     border-radius: 10px !important;
+     padding: 4px !important;
+     margin-bottom: 16px !important;
+     border: 1px solid var(--border-color) !important;
+     gap: 4px !important;
+     display: flex !important;
+     overflow-x: auto !important;
+ }
+
+ .gr-tab-nav button {
+     border-radius: 8px !important;
+     padding: 10px 16px !important;
+     font-weight: 500 !important;
+     font-size: 0.85rem !important;
+     background: transparent !important;
+     border: none !important;
+     color: var(--text-secondary) !important;
+     white-space: nowrap !important;
+     transition: all 0.2s ease !important;
+ }
+
+ .gr-tab-nav button:hover {
+     background: var(--bg-tertiary) !important;
+     color: var(--text-primary) !important;
+ }
+
+ .gr-tab-nav button.selected {
+     background: var(--accent-primary) !important;
+     color: white !important;
+ }
+
+ /* ===== CODE OUTPUT ===== */
+ .code-wrap, .gr-code {
+     border-radius: 10px !important;
+     overflow: hidden !important;
+     border: 1px solid var(--border-color) !important;
+ }
+
+ .code-wrap pre, .gr-code pre {
+     background: #0d0d12 !important;
+     padding: 16px !important;
+     margin: 0 !important;
+     font-family: 'JetBrains Mono', monospace !important;
+     font-size: 0.85rem !important;
+     line-height: 1.5 !important;
+     max-height: 400px !important;
+     overflow: auto !important;
+ }
+
+ /* ===== FILE UPLOAD ===== */
+ .file-upload, .gr-file {
+     border: 2px dashed var(--border-color) !important;
+     border-radius: 10px !important;
+     background: var(--bg-tertiary) !important;
+     padding: 24px !important;
+     transition: all 0.2s ease !important;
+ }
+
+ .file-upload:hover, .gr-file:hover {
+     border-color: var(--accent-primary) !important;
+ }
+
+ /* ===== STATUS DISPLAYS ===== */
+ .status-box textarea {
+     font-weight: 600 !important;
+     color: var(--accent-primary) !important;
+     background: rgba(139, 92, 246, 0.1) !important;
+     border: 1px solid rgba(139, 92, 246, 0.3) !important;
+     border-radius: 8px !important;
+ }
+
+ .step-box textarea {
+     font-family: 'JetBrains Mono', monospace !important;
+     font-size: 0.85rem !important;
+     color: var(--text-secondary) !important;
+     background: var(--bg-tertiary) !important;
+     border-radius: 8px !important;
+ }
+
596
+ /* ===== EXAMPLES - Redesigned as chips ===== */
597
+ .gr-examples {
598
+ margin-top: 16px !important;
599
+ padding-top: 16px !important;
600
+ border-top: 1px solid var(--border-color) !important;
601
+ }
602
+
603
+ .gr-examples .label {
604
+ font-size: 0.85rem !important;
605
+ font-weight: 600 !important;
606
+ color: var(--text-secondary) !important;
607
+ margin-bottom: 12px !important;
608
+ }
609
+
610
+ .gr-examples-table {
611
+ display: flex !important;
612
+ flex-wrap: wrap !important;
613
+ gap: 8px !important;
614
+ }
615
+
616
+ .gr-examples-table tbody {
617
+ display: flex !important;
618
+ flex-wrap: wrap !important;
619
+ gap: 8px !important;
620
+ }
621
+
622
+ .gr-examples-table tr {
623
+ display: contents !important;
624
+ }
625
+
626
+ .gr-examples-table td {
627
+ display: block !important;
628
+ padding: 0 !important;
629
+ }
630
+
631
+ .gr-examples-table button, .gr-samples-table button {
632
+ background: var(--bg-tertiary) !important;
633
+ border: 1px solid var(--border-color) !important;
634
+ border-radius: 20px !important;
635
+ padding: 8px 16px !important;
636
+ font-size: 0.8rem !important;
637
+ color: var(--text-secondary) !important;
638
+ cursor: pointer !important;
639
+ transition: all 0.2s ease !important;
640
+ white-space: nowrap !important;
641
+ max-width: none !important;
642
+ }
643
+
644
+ .gr-examples-table button:hover, .gr-samples-table button:hover {
645
+ background: var(--accent-primary) !important;
646
+ border-color: var(--accent-primary) !important;
647
+ color: white !important;
648
+ }
649
+
650
+ /* ===== FILE LIST ===== */
651
+ .file-list {
652
+ background: var(--bg-tertiary) !important;
653
+ border-radius: 8px !important;
654
+ padding: 12px !important;
655
+ font-size: 0.85rem !important;
656
+ color: var(--text-secondary) !important;
657
+ max-height: 120px !important;
658
+ overflow-y: auto !important;
659
+ }
660
+
661
+ /* ===== WORKFLOW STEPS ===== */
662
+ .workflow-container {
663
+ display: grid;
664
+ grid-template-columns: repeat(2, 1fr);
665
+ gap: 12px;
666
+ padding: 16px;
667
+ }
668
+
669
+ .workflow-step {
670
+ display: flex;
671
+ align-items: center;
672
+ gap: 12px;
673
+ padding: 12px 14px;
674
+ background: var(--bg-tertiary);
675
+ border: 1px solid var(--border-color);
676
+ border-radius: 10px;
677
+ transition: all 0.2s ease;
678
+ }
679
+
680
+ .workflow-step:hover {
681
+ border-color: var(--accent-primary);
682
+ }
683
+
684
+ .step-number {
685
+ width: 28px;
686
+ height: 28px;
687
+ display: flex;
688
+ align-items: center;
689
+ justify-content: center;
690
+ background: linear-gradient(135deg, var(--accent-secondary), var(--accent-primary));
691
+ border-radius: 6px;
692
+ color: white;
693
+ font-weight: 700;
694
+ font-size: 0.8rem;
695
+ flex-shrink: 0;
696
+ }
697
+
698
+ .step-content {
699
+ flex: 1;
700
+ min-width: 0;
701
+ }
702
+
703
+ .step-title {
704
+ font-weight: 600;
705
+ color: var(--text-primary);
706
+ font-size: 0.85rem;
707
+ }
708
+
709
+ .step-desc {
710
+ font-size: 0.75rem;
711
+ color: var(--text-muted);
712
+ margin-top: 2px;
713
+ white-space: nowrap;
714
+ overflow: hidden;
715
+ text-overflow: ellipsis;
716
+ }
717
+
718
+ /* ===== SCROLLBAR ===== */
719
+ ::-webkit-scrollbar {
720
+ width: 6px;
721
+ height: 6px;
722
+ }
723
+
724
+ ::-webkit-scrollbar-track {
725
+ background: var(--bg-tertiary);
726
+ border-radius: 3px;
727
+ }
728
+
729
+ ::-webkit-scrollbar-thumb {
730
+ background: var(--border-color);
731
+ border-radius: 3px;
732
+ }
733
+
734
+ ::-webkit-scrollbar-thumb:hover {
735
+ background: #52525b;
736
+ }
737
+
738
+ /* ===== LAYOUT FIXES ===== */
739
+ .gr-row {
740
+ gap: 12px !important;
741
+ }
742
+
743
+ .gr-column {
744
+ gap: 12px !important;
745
+ }
746
+
747
+ /* Remove excess padding/margins */
748
+ .gr-form {
749
+ gap: 12px !important;
750
+ }
751
+
752
+ .gr-block {
753
+ padding: 0 !important;
754
+ }
755
+
756
+ /* Label styling */
757
+ label, .gr-input-label {
758
+ font-size: 0.85rem !important;
759
+ font-weight: 500 !important;
760
+ color: var(--text-secondary) !important;
761
+ margin-bottom: 6px !important;
762
+ }
763
+
764
+ /* Info text */
765
+ .gr-info {
766
+ font-size: 0.75rem !important;
767
+ color: var(--text-muted) !important;
768
+ }
769
+
770
+ /* Fix button alignment in rows */
771
+ .gr-button {
772
+ height: 40px !important;
773
+ }
774
+
775
+ /* Slider styling */
776
+ input[type="range"] {
777
+ accent-color: var(--accent-primary) !important;
778
+ }
779
+
780
+ .gr-slider input {
781
+ background: var(--bg-tertiary) !important;
782
+ }
783
+
784
+ /* ===== RESPONSIVE ===== */
785
+ @media (max-width: 768px) {
786
+ .header-title {
787
+ font-size: 1.75rem;
788
+ }
789
+
790
+ .gradio-container {
791
+ padding: 16px !important;
792
+ }
793
+
794
+ .workflow-container {
795
+ grid-template-columns: 1fr;
796
+ }
797
+ }
798
+ """
799
+
800
+
801
+ # ==================== HELPER FUNCTIONS ====================
802
+
803
+
804
+ def validate_api_key(
805
+ provider: str, api_key: str, base_url: str = ""
806
+ ) -> tuple[bool, str]:
807
+ """Validate that the API key is provided for the selected provider."""
808
+ if provider == "custom":
809
+ if not api_key or api_key.strip() == "":
810
+ return False, "❌ Please provide an API key for custom provider"
811
+ if not base_url or base_url.strip() == "":
812
+ return False, "❌ Please provide a Base URL for custom provider"
813
+ return True, "✅ Custom provider configured"
814
+
815
+ if not api_key or api_key.strip() == "":
816
+ env_var = {
817
+ "google": "GOOGLE_API_KEY",
818
+ "openai": "OPENAI_API_KEY",
819
+ "anthropic": "ANTHROPIC_API_KEY",
820
+ }.get(provider, "")
821
+
822
+ # Check environment variable
823
+ if env_var and os.getenv(env_var):
824
+ return True, f"✅ Using API key from environment variable ({env_var})"
825
+ return (
826
+ False,
827
+ f"❌ Please provide an API key or set the {env_var} environment variable",
828
+ )
829
+
830
+ return True, "✅ API key provided"
831
+
832
+
833
+ def get_model_choices(
834
+ provider: str, api_key: str | None = None, base_url: str | None = None
835
+ ) -> list[str]:
836
+ """Return the static fallback model list for a provider (api_key and base_url are currently unused)."""
837
+ fallback_models = {
838
+ "google": [
839
+ "gemini-2.0-flash",
840
+ "gemini-1.5-pro",
841
+ "gemini-1.5-flash",
842
+ ],
843
+ "openai": [
844
+ "gpt-4o",
845
+ "gpt-4o-mini",
846
+ "gpt-4-turbo",
847
+ "gpt-4",
848
+ "gpt-3.5-turbo",
849
+ ],
850
+ "anthropic": [
851
+ "claude-sonnet-4-20250514",
852
+ "claude-3-5-sonnet-20241022",
853
+ "claude-3-5-haiku-20241022",
854
+ "claude-3-opus-20240229",
855
+ ],
856
+ "custom": [],
857
+ }
858
+ return fallback_models.get(provider, [])
859
+
860
+
861
+ def update_model_dropdown(provider: str, api_key: str = "", base_url: str = ""):
862
+ """Update model dropdown when provider changes."""
863
+ # Fetch models with status
864
+ models, status = fetch_models_for_provider(
865
+ provider,
866
+ api_key.strip() if api_key and api_key.strip() else None,
867
+ base_url.strip() if base_url and base_url.strip() else None,
868
+ )
869
+
870
+ if not models:
871
+ models = get_model_choices(provider)
872
+
873
+ return gr.update(choices=models, value=models[0] if models else None)
874
+
875
+
876
+ def update_base_url_visibility(provider: str):
877
+ """Show/hide base URL field based on provider."""
878
+ return gr.update(visible=(provider == "custom"))
879
+
880
+
881
+ def refresh_models(provider: str, api_key: str, base_url: str):
882
+ """Refresh the model list from the API."""
883
+ models, status = fetch_models_for_provider(
884
+ provider,
885
+ api_key.strip() if api_key and api_key.strip() else None,
886
+ base_url.strip() if base_url and base_url.strip() else None,
887
+ )
888
+
889
+ if models:
890
+ return gr.update(choices=models, value=models[0]), status
891
+
892
+ # Use fallback if no models returned
893
+ fallback = get_model_choices(provider)
894
+ if fallback:
895
+ return gr.update(
896
+ choices=fallback, value=fallback[0]
897
+ ), status or "ℹ️ Using default models"
898
+
899
+ return gr.update(), status or "❌ No models available"
900
+
901
+
902
+ def copy_uploaded_files(files: list) -> str:
903
+ """Copy uploaded files to the data directory."""
904
+ data_dir = os.path.join(os.path.dirname(__file__), "data")
905
+
906
+ # Clear existing files in data directory (except .gitkeep)
907
+ if os.path.exists(data_dir):
908
+ for f in os.listdir(data_dir):
909
+ if f != ".gitkeep":
910
+ file_path = os.path.join(data_dir, f)
911
+ if os.path.isfile(file_path):
912
+ os.remove(file_path)
913
+ else:
914
+ os.makedirs(data_dir)
915
+
916
+ # Copy new files
917
+ copied_files = []
918
+ if files:
919
+ for file_path in files:
920
+ if file_path:
921
+ filename = os.path.basename(file_path)
922
+ dest_path = os.path.join(data_dir, filename)
923
+ shutil.copy2(file_path, dest_path)
924
+ copied_files.append(filename)
925
+
926
+ if copied_files:
927
+ return f"✅ Uploaded {len(copied_files)} file(s): {', '.join(copied_files)}"
928
+ return "ℹ️ No files uploaded. Using existing files in data/ directory."
929
+
930
+
931
+ def list_data_files() -> str:
932
+ """List files currently in the data directory."""
933
+ data_dir = os.path.join(os.path.dirname(__file__), "data")
934
+ if not os.path.exists(data_dir):
935
+ return "No data directory found."
936
+
937
+ files = [
938
+ f
939
+ for f in os.listdir(data_dir)
940
+ if f != ".gitkeep" and os.path.isfile(os.path.join(data_dir, f))
941
+ ]
942
+ if files:
943
+ file_list = "\n".join([f" 📄 {f}" for f in files])
944
+ return f"**Files in data/ directory:**\n{file_list}"
945
+ return "No data files found. Please upload some files."
946
+
947
+
948
+ # ==================== MAIN WORKFLOW ====================
949
+
950
+
951
+ def run_ds_star_workflow(
952
+ query: str,
953
+ provider: str,
954
+ model: str,
955
+ api_key: str,
956
+ base_url: str,
957
+ max_iterations: int,
958
+ temperature: float,
959
+ progress=gr.Progress(),
960
+ ) -> Generator[tuple[str, str, str, str], None, None]:
961
+ """
962
+ Run the DS-STAR workflow with streaming updates.
963
+
964
+ Yields: (status, current_step, code_output, execution_result)
965
+ """
966
+ # Validate inputs
967
+ if not query or query.strip() == "":
968
+ yield "❌ Error", "Please enter a query", "", ""
969
+ return
970
+
971
+ is_valid, message = validate_api_key(provider, api_key, base_url)
972
+ if not is_valid:
973
+ yield "❌ Configuration Error", message, "", ""
974
+ return
975
+
976
+ # Initialize LLM
977
+ yield "🔄 Initializing...", "Setting up LLM connection", "", ""
978
+
979
+ # For custom provider, use openai with custom base_url
980
+ actual_provider = "openai" if provider == "custom" else provider
981
+
982
+ try:
983
+ llm = get_llm(
984
+ provider=actual_provider,
985
+ model=model,
986
+ api_key=api_key if api_key.strip() else None,
987
+ temperature=temperature,
988
+ base_url=base_url if provider == "custom" and base_url.strip() else None,
989
+ )
990
+ except Exception as e:
991
+ yield "❌ LLM Error", f"Failed to initialize LLM: {str(e)}", "", ""
992
+ return
993
+
994
+ # Build graph
995
+ yield "🔄 Building Graph...", "Constructing multi-agent workflow", "", ""
996
+
997
+ try:
998
+ app = build_ds_star_graph(llm, max_iterations)
999
+ except Exception as e:
1000
+ yield "❌ Graph Error", f"Failed to build graph: {str(e)}", "", ""
1001
+ return
1002
+
1003
+ # Create initial state
1004
+ initial_state = create_initial_state(query, llm, max_iterations)
1005
+ config = {"configurable": {"thread_id": f"gradio-session-{os.urandom(4).hex()}"}}
1006
+
1007
+ # Run workflow with progress updates
1008
+ step_descriptions = {
1009
+ "analyzer": "📊 Analyzing data files...",
1010
+ "planner": "📝 Creating execution plan...",
1011
+ "coder": "💻 Generating code...",
1012
+ "verifier": "✅ Verifying solution...",
1013
+ "router": "🔀 Routing to next step...",
1014
+ "backtrack": "↩️ Backtracking...",
1015
+ "finalyzer": "🎯 Finalizing solution...",
1016
+ }
1017
+
1018
+ yield "🚀 Running DS-STAR...", "Starting multi-agent workflow", "", ""
1019
+
1020
+ try:
1021
+ # Stream through the workflow
1022
+ current_code = ""
1023
+ current_result = ""
1024
+ iteration = 0
1025
+
1026
+ for event in app.stream(initial_state, config, stream_mode="values"):
1027
+ # Update progress based on current state
1028
+ next_node = event.get("next", "")
1029
+ iteration = event.get("iteration", 0)
1030
+
1031
+ step_desc = step_descriptions.get(next_node, f"Processing: {next_node}")
1032
+ progress_msg = (
1033
+ f"Iteration {iteration}/{max_iterations}"
1034
+ if iteration > 0
1035
+ else "Starting..."
1036
+ )
1037
+
1038
+ current_code = event.get("current_code", current_code) or ""
1039
+ current_result = event.get("execution_result", current_result) or ""
1040
+
1041
+ yield f"🔄 {progress_msg}", step_desc, current_code, current_result
1042
+
1043
+ # Final state
1044
+ final_code = current_code
1044
+ final_result = current_result
1046
+
1047
+ yield "✅ Complete!", "Workflow finished successfully", final_code, final_result
1048
+
1049
+ except Exception as e:
1050
+ import traceback
1051
+
1052
+ error_trace = traceback.format_exc()
1053
+ yield (
1054
+ "❌ Execution Error",
1055
+ f"Error: {str(e)}\n\n{error_trace}",
1056
+ current_code,
1057
+ current_result,
1058
+ )
1059
+
1060
+
1061
+ # ==================== GRADIO INTERFACE ====================
1062
+
1063
+
1064
+ def create_gradio_app():
1065
+ """Create and configure the Gradio application."""
1066
+
1067
+ with gr.Blocks(title="DS-STAR | Multi-Agent Data Science") as demo:
1068
+ # Header Section
1069
+ gr.HTML("""
1070
+ <div class="header-section">
1071
+ <div class="header-content">
1072
+ <h1 class="header-title">
1073
+ <span class="star-icon">✨</span>
1074
+ DS-STAR
1075
+ </h1>
1076
+ <p class="header-subtitle">Multi-Agent System for Automated Data Science Tasks</p>
1077
+ <div class="header-badges">
1078
+ <span class="header-badge">🔗 LangGraph</span>
1079
+ <span class="header-badge">🤗 HuggingFace MCP</span>
1080
+ <span class="header-badge">🤖 Multi-Agent</span>
1081
+ </div>
1082
+ </div>
1083
+ </div>
1084
+ """)
1085
+
1086
+ with gr.Row():
1087
+ # Left Column - Configuration (narrower)
1088
+ with gr.Column(scale=1, min_width=300):
1089
+ # LLM Configuration - Accordion
1090
+ with gr.Accordion("🔑 LLM Configuration", open=True):
1091
+ provider = gr.Dropdown(
1092
+ choices=["google", "openai", "anthropic", "custom"],
1093
+ value="google",
1094
+ label="Provider",
1095
+ info="Select your LLM provider",
1096
+ )
1097
+
1098
+ base_url = gr.Textbox(
1099
+ label="Base URL",
1100
+ placeholder="https://api.together.xyz/v1",
1101
+ info="OpenAI-compatible API endpoint",
1102
+ visible=False,
1103
+ )
1104
+
1105
+ with gr.Row():
1106
+ model = gr.Dropdown(
1107
+ choices=get_model_choices("google"),
1108
+ value="gemini-2.0-flash",
1109
+ label="Model",
1110
+ scale=5,
1111
+ )
1112
+ refresh_models_btn = gr.Button(
1113
+ "🔄",
1114
+ variant="secondary",
1115
+ size="sm",
1116
+ scale=1,
1117
+ min_width=40,
1118
+ elem_classes="refresh-btn",
1119
+ )
1120
+
1121
+ api_key = gr.Textbox(
1122
+ label="API Key",
1123
+ type="password",
1124
+ placeholder="Enter API key or use env variable",
1125
+ )
1126
+
1127
+ api_status = gr.Markdown(
1128
+ "💡 *Enter API key or set environment variable*"
1129
+ )
1130
+
1131
+ # Advanced Settings - Accordion (closed by default)
1132
+ with gr.Accordion("⚙️ Advanced Settings", open=False):
1133
+ max_iterations = gr.Slider(
1134
+ minimum=1,
1135
+ maximum=50,
1136
+ value=20,
1137
+ step=1,
1138
+ label="Max Iterations",
1139
+ info="Maximum refinement cycles",
1140
+ )
1141
+
1142
+ temperature = gr.Slider(
1143
+ minimum=0.0,
1144
+ maximum=1.0,
1145
+ value=0.0,
1146
+ step=0.1,
1147
+ label="Temperature",
1148
+ info="Controls response creativity (0 = deterministic)",
1149
+ )
1150
+
1151
+ # Data Files - Accordion
1152
+ with gr.Accordion("📁 Data Files", open=True):
1153
+ file_upload = gr.File(
1154
+ label="Upload Files",
1155
+ file_count="multiple",
1156
+ file_types=[".csv", ".json", ".xlsx", ".parquet", ".txt"],
1157
+ type="filepath",
1158
+ )
1159
+
1160
+ upload_status = gr.Markdown(
1161
+ list_data_files(), elem_classes="file-list"
1162
+ )
1163
+
1164
+ refresh_btn = gr.Button(
1165
+ "🔄 Refresh Files", variant="secondary", size="sm"
1166
+ )
1167
+
1168
+ # Right Column - Main Interface (wider)
1169
+ with gr.Column(scale=2, min_width=500):
1170
+ # Query Input Section
1171
+ gr.HTML("""
1172
+ <div class="section-header">
1173
+ <div class="section-icon">💬</div>
1174
+ <h3 class="section-title">Ask Your Question</h3>
1175
+ </div>
1176
+ """)
1177
+
1178
+ query_input = gr.Textbox(
1179
+ label="",
1180
+ placeholder="e.g., What percentage of transactions use credit cards? Show the distribution by category.",
1181
+ lines=2,
1182
+ max_lines=4,
1183
+ show_label=False,
1184
+ )
1185
+
1186
+ # Buttons Row - properly aligned
1187
+ with gr.Row():
1188
+ run_btn = gr.Button(
1189
+ "🚀 Run Analysis",
1190
+ variant="primary",
1191
+ size="lg",
1192
+ scale=3,
1193
+ )
1194
+ clear_btn = gr.Button(
1195
+ "🗑️ Clear",
1196
+ variant="secondary",
1197
+ size="lg",
1198
+ scale=1,
1199
+ )
1200
+
1201
+ # Example Queries - as clickable chips
1202
+ gr.HTML("""
1203
+ <div style="margin-top: 16px; padding-top: 12px; border-top: 1px solid var(--border-color);">
1204
+ <span style="font-size: 0.8rem; font-weight: 600; color: var(--text-muted); margin-bottom: 8px; display: block;">💡 Quick Examples</span>
1205
+ </div>
1206
+ """)
1207
+ gr.Examples(
1208
+ examples=[
1209
+ ["Show distribution of transaction amounts"],
1210
+ ["Which category has highest sales?"],
1211
+ ["Find correlations between columns"],
1212
+ ["Create summary statistics report"],
1213
+ ],
1214
+ inputs=query_input,
1215
+ label="",
1216
+ examples_per_page=4,
1217
+ )
1218
+
1219
+ # Status Section
1220
+ gr.HTML("""
1221
+ <div class="section-header" style="margin-top: 20px;">
1222
+ <div class="section-icon">📊</div>
1223
+ <h3 class="section-title">Status</h3>
1224
+ </div>
1225
+ """)
1226
+
1227
+ with gr.Row():
1228
+ status_display = gr.Textbox(
1229
+ label="Status",
1230
+ value="⏳ Ready",
1231
+ interactive=False,
1232
+ scale=1,
1233
+ elem_classes="status-box",
1234
+ lines=1,
1235
+ max_lines=1,
1236
+ )
1237
+ current_step = gr.Textbox(
1238
+ label="Current Step",
1239
+ value="Waiting for query...",
1240
+ interactive=False,
1241
+ scale=2,
1242
+ elem_classes="step-box",
1243
+ lines=1,
1244
+ max_lines=1,
1245
+ )
1246
+
1247
+ # Results Tabs
1248
+ with gr.Tabs():
1249
+ with gr.TabItem("💻 Code", id=0):
1250
+ code_output = gr.Code(
1251
+ label="",
1252
+ language="python",
1253
+ lines=16,
1254
+ interactive=False,
1255
+ show_label=False,
1256
+ )
1257
+
1258
+ with gr.TabItem("📊 Output", id=1):
1259
+ result_output = gr.Textbox(
1260
+ label="",
1261
+ lines=14,
1262
+ interactive=False,
1263
+ show_label=False,
1264
+ )
1265
+
1266
+ with gr.TabItem("🔄 Workflow", id=2):
1267
+ gr.HTML("""
1268
+ <div class="workflow-container">
1269
+ <div class="workflow-step">
1270
+ <div class="step-number">1</div>
1271
+ <div class="step-content">
1272
+ <div class="step-title">Analyzer</div>
1273
+ <div class="step-desc">Examines data structure</div>
1274
+ </div>
1275
+ </div>
1276
+ <div class="workflow-step">
1277
+ <div class="step-number">2</div>
1278
+ <div class="step-content">
1279
+ <div class="step-title">Planner</div>
1280
+ <div class="step-desc">Creates execution plan</div>
1281
+ </div>
1282
+ </div>
1283
+ <div class="workflow-step">
1284
+ <div class="step-number">3</div>
1285
+ <div class="step-content">
1286
+ <div class="step-title">Coder</div>
1287
+ <div class="step-desc">Generates Python code</div>
1288
+ </div>
1289
+ </div>
1290
+ <div class="workflow-step">
1291
+ <div class="step-number">4</div>
1292
+ <div class="step-content">
1293
+ <div class="step-title">Verifier</div>
1294
+ <div class="step-desc">Validates solution</div>
1295
+ </div>
1296
+ </div>
1297
+ <div class="workflow-step">
1298
+ <div class="step-number">5</div>
1299
+ <div class="step-content">
1300
+ <div class="step-title">Router</div>
1301
+ <div class="step-desc">Decides next step</div>
1302
+ </div>
1303
+ </div>
1304
+ <div class="workflow-step">
1305
+ <div class="step-number">6</div>
1306
+ <div class="step-content">
1307
+ <div class="step-title">Finalyzer</div>
1308
+ <div class="step-desc">Delivers final result</div>
1309
+ </div>
1310
+ </div>
1311
+ </div>
1312
+ """)
1313
+
1314
+ with gr.TabItem("ℹ️ About", id=3):
1315
+ gr.Markdown("""
1316
+ ## About DS-STAR
1317
+
1318
+ **DS-STAR** (Data Science - Structured Task Analysis and Resolution) is a multi-agent system for automating data science tasks.
1319
+
1320
+ ### ✨ Features
1321
+
1322
+ - 🤖 **Multi-Agent** — Specialized agents for analysis, planning, coding & verification
1323
+ - 🔄 **Iterative** — Automatically refines solutions
1324
+ - 🔙 **Backtracking** — Smart rollback when needed
1325
+ - 💻 **Code Gen** — Produces clean Python code
1326
+
1327
+ ### 🔌 Providers
1328
+
1329
+ **Google Gemini** • **OpenAI GPT** • **Anthropic Claude** • **Custom API**
1330
+
1331
+ ---
1332
+
1333
+ Built for the **HuggingFace MCP Hackathon** • [GitHub](https://github.com/Anurag-Deo/DS-STAR)
1334
+ """)
1335
+
1336
+ # Event Handlers
1337
+ provider.change(
1338
+ fn=update_model_dropdown,
1339
+ inputs=[provider, api_key, base_url],
1340
+ outputs=[model],
1341
+ )
1342
+
1343
+ provider.change(
1344
+ fn=update_base_url_visibility,
1345
+ inputs=[provider],
1346
+ outputs=[base_url],
1347
+ )
1348
+
1349
+ provider.change(
1350
+ fn=lambda p, k, b: validate_api_key(p, k, b)[1],
1351
+ inputs=[provider, api_key, base_url],
1352
+ outputs=[api_status],
1353
+ )
1354
+
1355
+ api_key.change(
1356
+ fn=lambda p, k, b: validate_api_key(p, k, b)[1],
1357
+ inputs=[provider, api_key, base_url],
1358
+ outputs=[api_status],
1359
+ )
1360
+
1361
+ base_url.change(
1362
+ fn=lambda p, k, b: validate_api_key(p, k, b)[1],
1363
+ inputs=[provider, api_key, base_url],
1364
+ outputs=[api_status],
1365
+ )
1366
+
1367
+ refresh_models_btn.click(
1368
+ fn=refresh_models,
1369
+ inputs=[provider, api_key, base_url],
1370
+ outputs=[model, api_status],
1371
+ )
1372
+
1373
+ file_upload.change(
1374
+ fn=copy_uploaded_files, inputs=[file_upload], outputs=[upload_status]
1375
+ )
1376
+
1377
+ refresh_btn.click(fn=list_data_files, outputs=[upload_status])
1378
+
1379
+ run_btn.click(
1380
+ fn=run_ds_star_workflow,
1381
+ inputs=[
1382
+ query_input,
1383
+ provider,
1384
+ model,
1385
+ api_key,
1386
+ base_url,
1387
+ max_iterations,
1388
+ temperature,
1389
+ ],
1390
+ outputs=[status_display, current_step, code_output, result_output],
1391
+ )
1392
+
1393
+ clear_btn.click(
1394
+ fn=lambda: ("⏳ Ready", "Waiting for query...", "", ""),
1395
+ outputs=[status_display, current_step, code_output, result_output],
1396
+ )
1397
+
1398
+ return demo
1399
+
1400
+
1401
+ # ==================== MAIN ====================
1402
+
1403
+ if __name__ == "__main__":
1404
+ demo = create_gradio_app()
1405
+ theme = create_ds_star_theme()
1406
+ demo.launch(share=False, show_error=True, theme=theme, css=CUSTOM_CSS)
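Reviewer sketch (not part of the commit): the streaming loop in `run_ds_star_workflow` keeps the last known code and execution output while it yields progress updates, so a state event that omits a field does not blank the UI. The block below is a minimal standalone illustration of that carry-forward pattern. It assumes plain dicts in place of LangGraph state events, and `stream_status` is a hypothetical name that does not exist in this repository.

```python
from typing import Iterable, Iterator


def stream_status(
    events: Iterable[dict], max_iterations: int
) -> Iterator[tuple[str, str, str]]:
    """Yield (progress, code, result) tuples, carrying forward the last known values.

    Hypothetical helper: it mirrors the loop body of run_ds_star_workflow but
    takes plain dicts instead of LangGraph state events.
    """
    code, result = "", ""
    for event in events:
        iteration = event.get("iteration", 0)
        progress = (
            f"Iteration {iteration}/{max_iterations}"
            if iteration > 0
            else "Starting..."
        )
        # Fall back to the previous value when the key is absent;
        # an explicit None or empty value resets the field to "".
        code = event.get("current_code", code) or ""
        result = event.get("execution_result", result) or ""
        yield progress, code, result
    # Final update reuses the carried values, so it also works for an empty stream.
    yield "Complete", code, result


if __name__ == "__main__":
    demo_events = [
        {"iteration": 0},
        {"iteration": 1, "current_code": "x = 1"},
        {"iteration": 2, "execution_result": "ok"},
    ]
    for update in stream_status(demo_events, max_iterations=5):
        print(update)
```

Reusing the carried values for the final update, rather than reading the loop variable after the loop ends, avoids an unbound name when the stream produces no events.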
data/cards_data.csv ADDED
The diff for this file is too large to render.
main.py ADDED
@@ -0,0 +1,69 @@
1
+ """
2
+ DS-STAR: Multi-Agent System for Data Science Tasks
3
+
4
+ This is the main entry point for the refactored DS-STAR system.
5
+ All agents are modularized and can be tested independently.
6
+
7
+ Usage:
8
+ python main.py
9
+
10
+ Or customize:
11
+ from src.graph import run_ds_star
12
+ from src.config import get_llm
13
+
14
+ llm = get_llm(provider="google", model="gemini-1.5-flash")
15
+ result = run_ds_star("Your question here", llm, max_iterations=20)
16
+ """
17
+
18
+ import sys
19
+
20
+ from src.config import DEFAULT_CONFIG, get_llm
21
+ from src.graph import run_ds_star
22
+
23
+
24
+ def main():
25
+ """
26
+ Main execution function for DS-STAR.
27
+ """
28
+ # Configuration
29
+ query = "What percentage of transactions use credit cards?"
30
+ max_iterations = DEFAULT_CONFIG["max_iterations"]
31
+ provider = DEFAULT_CONFIG["provider"]
32
+ model = DEFAULT_CONFIG["model"]
33
+
34
+ print("Initializing DS-STAR Multi-Agent System...")
35
+ print(f"Provider: {provider}")
36
+ print(f"Model: {model}")
37
+ print()
38
+
39
+ try:
40
+ # Initialize LLM
41
+ llm = get_llm(
42
+ provider=provider, model=model, temperature=DEFAULT_CONFIG["temperature"]
43
+ )
44
+
45
+ # Run DS-STAR workflow
46
+ final_state = run_ds_star(
47
+ query=query,
48
+ llm=llm,
49
+ max_iterations=max_iterations,
50
+ thread_id="ds-star-main-session",
51
+ )
52
+
53
+ if final_state:
54
+ print("\n✅ Workflow completed successfully!")
55
+ return 0
56
+ else:
57
+ print("\n❌ Workflow failed!")
58
+ return 1
59
+
60
+ except Exception as e:
61
+ print(f"\n❌ Fatal error: {str(e)}")
62
+ import traceback
63
+
64
+ traceback.print_exc()
65
+ return 1
66
+
67
+
68
+ if __name__ == "__main__":
69
+ sys.exit(main())
pyproject.toml ADDED
@@ -0,0 +1,27 @@
1
+ [project]
2
+ name = "ds-star"
3
+ version = "0.1.0"
4
+ description = "DS-STAR: Multi-Agent System for Automated Data Science Tasks"
5
+ readme = "README.md"
6
+ requires-python = ">=3.10"
7
+ dependencies = [
8
+ "python-dotenv>=1.0.0",
9
+ "langchain>=0.3.0",
10
+ "langchain-anthropic>=0.3.0",
11
+ "langchain-core>=0.3.0",
12
+ "langchain-google-genai>=2.0.0",
13
+ "langchain-openai>=0.3.0",
14
+ "langgraph>=0.2.0",
15
+ "pandas>=2.0.0",
16
+ "gradio>=5.0.0",
17
+ ]
18
+
19
+ [project.scripts]
20
+ ds-star = "app:create_gradio_app"
21
+
22
+ [project.optional-dependencies]
23
+ dev = [
24
+ "pytest>=7.0.0",
25
+ "black>=24.0.0",
26
+ "ruff>=0.5.0",
27
+ ]
requirements.txt ADDED
@@ -0,0 +1,18 @@
1
+ # DS-STAR Dependencies
2
+ # For Hugging Face Spaces deployment
3
+
4
+ # Core dependencies
5
+ python-dotenv>=1.0.0
6
+ langchain>=0.3.0
7
+ langchain-anthropic>=0.3.0
8
+ langchain-core>=0.3.0
9
+ langchain-google-genai>=2.0.0
10
+ langchain-openai>=0.3.0
11
+ langgraph>=0.2.0
12
+ pandas>=2.0.0
13
+
14
+ # Web interface
15
+ gradio>=5.0.0
16
+
17
+ # Additional utilities
18
+ numpy>=1.20.0
src/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """DS-STAR multi-agent system for data science tasks."""
src/agents/__init__.py ADDED
@@ -0,0 +1,24 @@
1
+ """Agent modules for DS-STAR system."""
2
+
3
+ from .analyzer_agent import analyzer_node, test_analyzer
4
+ from .coder_agent import coder_node, test_coder
5
+ from .finalyzer_agent import finalyzer_node, test_finalyzer
6
+ from .planner_agent import planner_node, test_planner
7
+ from .router_agent import backtrack_node, router_node, test_router
8
+ from .verifier_agent import test_verifier, verifier_node
9
+
10
+ __all__ = [
11
+ "analyzer_node",
12
+ "planner_node",
13
+ "coder_node",
14
+ "verifier_node",
15
+ "router_node",
16
+ "backtrack_node",
17
+ "finalyzer_node",
18
+ "test_analyzer",
19
+ "test_planner",
20
+ "test_coder",
21
+ "test_verifier",
22
+ "test_router",
23
+ "test_finalyzer",
24
+ ]
src/agents/analyzer_agent.py ADDED
@@ -0,0 +1,167 @@
1
+ """
2
+ Analyzer Agent: Analyzes data files and generates descriptions.
3
+
4
+ This agent runs once at the beginning to understand available data.
5
+ """
6
+
7
+ import os
8
+ from pathlib import Path
9
+
10
+ from langchain_core.messages import AIMessage
11
+
12
+ from ..utils.code_execution import execute_with_debug
13
+ from ..utils.formatters import extract_code, gemini_text
14
+ from ..utils.state import DSStarState
15
+
16
+
17
+ def analyzer_node(state: DSStarState) -> dict:
18
+ """
19
+ Analyzer Agent Node: Analyzes all data files in the data/ directory.
20
+
21
+ For each file, generates and executes Python code to:
22
+ - Load the file
23
+ - Print structure, types, and sample data
24
+ - Capture essential information
25
+
26
+ Args:
27
+ state: Current DSStarState
28
+
29
+ Returns:
30
+ Dictionary with updated state fields:
31
+ - data_descriptions: Dict mapping filename to analysis result
32
+ - messages: Agent communication messages
33
+ - next: Next node to visit ("planner" or "__end__")
34
+ """
35
+ print("=" * 60)
36
+ print("DATA ANALYZER AGENT STARTING...")
37
+ print("=" * 60)
38
+
39
+ data_dir = "data/"
40
+ descriptions = {}
41
+
42
+ # Check if data directory exists
43
+ if not os.path.exists(data_dir):
44
+ print(f"Error: {data_dir} directory not found")
45
+ return {
46
+ "data_descriptions": {"error": "Data directory not found"},
47
+ "messages": [AIMessage(content="Error: data/ directory not found")],
48
+ "next": "__end__",
49
+ }
50
+
51
+ # Get list of files
52
+ files = [
53
+ f for f in os.listdir(data_dir) if os.path.isfile(os.path.join(data_dir, f))
54
+ ]
55
+
56
+ if not files:
57
+ print(f"Error: No files found in {data_dir}")
58
+ return {
59
+ "data_descriptions": {"error": "No data files found"},
60
+ "messages": [AIMessage(content="Error: No files in data/ directory")],
61
+ "next": "__end__",
62
+ }
63
+
64
+ print(f"Found {len(files)} files to analyze")
65
+
66
+ # Analyze each file
67
+ for filename in files:
68
+ filepath = os.path.join(data_dir, filename)
69
+ file_ext = Path(filepath).suffix.lower()
70
+
71
+ print(f"\nAnalyzing: {filename}")
72
+
73
+ # Generate analysis script
74
+ analysis_prompt = f"""Generate a Python script to analyze the file: {filepath}
75
+
76
+ File type: {file_ext}
77
+
78
+ Requirements:
79
+ - Load the file using appropriate method for {file_ext} format
80
+ - Print essential information:
81
+ * Data structure and types
82
+ * Column names (for structured data like CSV, Excel)
83
+ * First 3-5 rows/examples
84
+ * Shape/size information
85
+ - Handle common formats: CSV, JSON, Excel, TXT, MD
86
+ - Use pandas for structured data
87
+ - No try-except blocks
88
+ - All files are in 'data/' directory
89
+ - Print output clearly
90
+
91
+ Provide ONLY the Python code in a markdown code block."""
92
+
93
+ try:
94
+ # Get LLM response
95
+ response = state["llm"].invoke(analysis_prompt)
96
+
97
+ # Handle different response formats (Gemini vs OpenAI)
98
+ if hasattr(response, "content") and isinstance(response.content, list):
99
+ # Gemini format
100
+ response_text = gemini_text(response)
101
+ elif hasattr(response, "content"):
102
+ response_text = response.content
103
+ else:
104
+ response_text = str(response)
105
+
106
+ code = extract_code(response_text)
107
+
108
+ # Execute with debugging
109
+ result = execute_with_debug(code, state["llm"], is_analysis=True)
110
+
111
+ descriptions[filename] = result
112
+ print(f"✓ Successfully analyzed {filename}")
113
+
114
+ except Exception as e:
115
+ descriptions[filename] = f"Error analyzing file: {str(e)}"
116
+ print(f"✗ Failed to analyze {filename}: {str(e)}")
117
+
118
+ print("\n" + "=" * 60)
119
+ print(f"ANALYSIS COMPLETE: {len(files)} files processed")
120
+ print("=" * 60)
121
+
122
+ return {
123
+ "data_descriptions": descriptions,
124
+ "messages": [AIMessage(content=f"Analyzed {len(files)} data files")],
125
+ "next": "planner",
126
+ }
127
+
128
+
129
+ # Standalone test function
130
+ def test_analyzer(llm, data_dir: str = "data/"):
131
+ """
132
+ Test the analyzer agent independently.
133
+
134
+ Args:
135
+ llm: LLM instance
136
+ data_dir: Directory containing data files
137
+
138
+ Returns:
139
+ Dictionary with analysis results
140
+ """
141
+ # Create minimal test state
142
+ test_state = {
143
+ "llm": llm,
144
+ "query": "Test query",
145
+ "data_descriptions": {},
146
+ "plan": [],
147
+ "current_code": "",
148
+ "execution_result": "",
149
+ "is_sufficient": False,
150
+ "router_decision": "",
151
+ "iteration": 0,
152
+ "max_iterations": 20,
153
+ "messages": [],
154
+ "next": "analyzer",
155
+ }
156
+
157
+ result = analyzer_node(test_state)
158
+
159
+ print("\n" + "=" * 60)
160
+ print("ANALYZER TEST RESULTS")
161
+ print("=" * 60)
162
+ for filename, description in result["data_descriptions"].items():
163
+ print(f"\n{filename}:")
164
+ print("-" * 60)
165
+ print(description)
166
+
167
+ return result
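The three-way branch on response shape (list content, plain content, anything else) recurs in each agent in this commit. A minimal sketch of how it could be factored into one helper follows. The name `response_to_text` is hypothetical and not part of the repository, and the way list parts are joined is an assumption, since the body of `gemini_text` is not shown here.

```python
from typing import Any


def response_to_text(response: Any) -> str:
    """Return plain text from an LLM response of unknown shape (illustrative helper)."""
    content = getattr(response, "content", None)
    if content is None:
        # Missing (or None) `.content`: fall back to the string form.
        return str(response)
    if isinstance(content, list):
        # List-style content: keep plain strings and the "text" field of dict parts.
        # Assumption: other part types carry no text and can be skipped.
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and "text" in part:
                parts.append(str(part["text"]))
        return "".join(parts)
    return str(content)
```

With a helper like this, each node could reduce its branch to a single call such as `response_text = response_to_text(response)`.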
src/agents/coder_agent.py ADDED
@@ -0,0 +1,174 @@
1
+ """
2
+ Coder Agent: Implements the plan as executable Python code.
3
+
4
+ This agent generates Python code that implements all steps in the current plan.
5
+ """
6
+
7
+ from langchain_core.messages import AIMessage
8
+
9
+ from ..utils.code_execution import execute_with_debug
10
+ from ..utils.formatters import extract_code, format_data_descriptions, format_plan
11
+ from ..utils.state import DSStarState
12
+
13
+
14
+ def coder_node(state: DSStarState) -> dict:
15
+ """
16
+ Coder Agent Node: Generates and executes Python code for the plan.
17
+
18
+ On first call: Generates code implementing all plan steps
19
+ On subsequent calls: Updates code to include new plan steps
20
+
21
+ Args:
22
+ state: Current DSStarState
23
+
24
+ Returns:
25
+ Dictionary with updated state fields:
26
+ - current_code: Generated Python code
27
+ - execution_result: Output from code execution
28
+ - messages: Agent communication messages
29
+ - next: Next node to visit ("verifier")
30
+ """
31
+ print("=" * 60)
32
+ print("CODER AGENT STARTING...")
33
+ print("=" * 60)
34
+
35
+ data_context = format_data_descriptions(state["data_descriptions"])
36
+ plan_text = format_plan(state["plan"])
37
+
38
+ is_initial = state["current_code"] == ""
39
+
40
+ if is_initial:
41
+ print("Generating INITIAL code implementation...")
42
+ prompt = f"""You are an expert Python developer for data science.
43
+
44
+ Available Data Files:
45
+ {data_context}
46
+
47
+ Plan to Implement:
48
+ {plan_text}
49
+
50
+ Task: Write a Python script that implements ALL steps in the plan.
51
+
52
+ Requirements:
53
+ - Use pandas for data manipulation
54
+ - All files are in 'data/' directory
55
+ - Print intermediate results for each step
56
+ - No try-except blocks
57
+ - Clean, readable code
58
+
59
+ Provide ONLY the Python code in a markdown code block."""
60
+ else:
61
+ print(f"Updating code to implement {len(state['plan'])} steps...")
62
+ prompt = f"""You are an expert Python developer for data science.
63
+
64
+ Available Data Files:
65
+ {data_context}
66
+
67
+ Complete Plan:
68
+ {plan_text}
69
+
70
+ Previous Code:
71
+ {state["current_code"]}
72
+
73
+ Task: Update the code to implement the COMPLETE current plan.
74
+ Build upon the previous code, extending it to include all plan steps.
75
+
76
+ Requirements:
77
+ - Use pandas for data manipulation
78
+ - All files are in 'data/' directory
79
+ - Print intermediate and final results
80
+ - No try-except blocks
81
+ - Clean, readable code
82
+
83
+ Provide ONLY the updated Python code in a markdown code block."""
84
+
85
+ try:
86
+ # Get LLM response
87
+ response = state["llm"].invoke(prompt)
88
+
89
+ # Handle different response formats
90
+ if hasattr(response, "content") and isinstance(response.content, list):
91
+ from ..utils.formatters import gemini_text
92
+
93
+ response_text = gemini_text(response)
94
+ elif hasattr(response, "content"):
95
+ response_text = response.content
96
+ else:
97
+ response_text = str(response)
98
+
99
+ code = extract_code(response_text)
100
+
101
+ print("\nGenerated Code:")
102
+ print("-" * 60)
103
+ print(code[:200] + "..." if len(code) > 200 else code)
104
+ print("-" * 60)
105
+
106
+ print("\nExecuting code...")
107
+
108
+ # Execute with debugging
109
+ result = execute_with_debug(
110
+ code, state["llm"], is_analysis=False, data_context=data_context
111
+ )
112
+
113
+ print("\nExecution Result:")
114
+ print("-" * 60)
115
+ print(result[:200] + "..." if len(result) > 200 else result)
116
+ print("-" * 60)
117
+ print("=" * 60)
118
+
119
+ return {
120
+ "current_code": code,
121
+ "execution_result": result,
122
+ "messages": [AIMessage(content="Code executed")],
123
+ "next": "verifier",
124
+ }
125
+
126
+ except Exception as e:
127
+ print(f"\n✗ Coder error: {str(e)}")
128
+ return {
129
+ "messages": [AIMessage(content=f"Coder error: {str(e)}")],
130
+ "next": "__end__",
131
+ }
132
+
133
+
134
+ # Standalone test function
135
+ def test_coder(llm, query: str, data_descriptions: dict, plan: list):
136
+ """
137
+ Test the coder agent independently.
138
+
139
+ Args:
140
+ llm: LLM instance
141
+ query: User query
142
+ data_descriptions: Dict of filename -> description
143
+ plan: List of plan steps to implement
144
+
145
+ Returns:
146
+ Dictionary with coder results
147
+ """
148
+ # Create minimal test state
149
+ test_state = {
150
+ "llm": llm,
151
+ "query": query,
152
+ "data_descriptions": data_descriptions,
153
+ "plan": plan,
154
+ "current_code": "",
155
+ "execution_result": "",
156
+ "is_sufficient": False,
157
+ "router_decision": "",
158
+ "iteration": 0,
159
+ "max_iterations": 20,
160
+ "messages": [],
161
+ "next": "coder",
162
+ }
163
+
164
+ result = coder_node(test_state)
165
+
166
+ print("\n" + "=" * 60)
167
+ print("CODER TEST RESULTS")
168
+ print("=" * 60)
169
+ print("Code:")
170
+ print(result.get("current_code", "No code generated"))
171
+ print("\nExecution Result:")
172
+ print(result.get("execution_result", "No result"))
173
+
174
+ return result
src/agents/finalyzer_agent.py ADDED
@@ -0,0 +1,162 @@
1
+ """
2
+ Finalyzer Agent: Creates final polished solution with clear output.
3
+
4
+ This agent runs when the verifier confirms the plan is sufficient.
5
+ It generates a final version of the code with improved formatting and output.
6
+ """
7
+
8
+ from langchain_core.messages import AIMessage
9
+
10
+ from ..utils.code_execution import execute_code_safely
11
+ from ..utils.formatters import extract_code, format_data_descriptions, gemini_text
12
+ from ..utils.state import DSStarState
13
+
14
+
15
+ def finalyzer_node(state: DSStarState) -> dict:
16
+ """
17
+ Finalyzer Agent Node: Creates final polished solution.
18
+
19
+ Takes the working code and creates a final version with:
20
+ - Clear answer to the original question
21
+ - Proper output formatting
22
+ - Self-contained executable code
23
+
24
+ Args:
25
+ state: Current DSStarState
26
+
27
+ Returns:
28
+ Dictionary with updated state fields:
29
+ - current_code: Final polished code
30
+ - execution_result: Final execution output
31
+ - messages: Agent communication messages
32
+ - next: "__end__" (workflow complete)
33
+ """
34
+ print("=" * 60)
35
+ print("FINALYZER AGENT STARTING...")
36
+ print("=" * 60)
37
+
38
+ data_context = format_data_descriptions(state["data_descriptions"])
39
+
40
+ prompt = f"""You are an expert data analyst creating final solutions.
41
+
42
+ Original Question: {state["query"]}
43
+
44
+ Available Data:
45
+ {data_context}
46
+
47
+ Working Code:
48
+ {state["current_code"]}
49
+
50
+ Execution Result:
51
+ {state["execution_result"]}
52
+
53
+ Task: Create a final version of the code that:
54
+ 1. Clearly prints the answer to the question
55
+ 2. Includes proper formatting of the output
56
+ 3. Is self-contained and executable
57
+ 4. Has clear print statements
58
+
59
+ Provide ONLY the final Python code in a markdown code block."""
60
+
61
+ try:
62
+ # Get LLM response
63
+ response = state["llm"].invoke(prompt)
64
+
65
+ # Handle different response formats
66
+ if hasattr(response, "content") and isinstance(response.content, list):
67
+ response_text = gemini_text(response)
68
+ elif hasattr(response, "content"):
69
+ response_text = response.content
70
+ else:
71
+ response_text = str(response)
72
+
73
+ final_code = extract_code(response_text)
74
+
75
+ print("\nFinal Code Generated:")
76
+ print("-" * 60)
77
+ print(final_code[:300] + "..." if len(final_code) > 300 else final_code)
78
+ print("-" * 60)
79
+
80
+ # Execute final code
81
+ print("\nExecuting final code...")
82
+ success, stdout, stderr = execute_code_safely(final_code)
83
+
84
+ if success:
85
+ final_result = stdout
86
+ print("\n✓ Final execution successful")
87
+ else:
88
+ # If final execution fails, use previous result
89
+ print("\n⚠ Final execution failed, using previous result")
90
+ final_result = state["execution_result"]
91
+
92
+ print("\nFinal Result:")
93
+ print("-" * 60)
94
+ print(final_result[:300] + "..." if len(final_result) > 300 else final_result)
95
+ print("-" * 60)
96
+ print("=" * 60)
97
+ print("SOLUTION COMPLETE ✓")
98
+ print("=" * 60)
99
+
100
+ return {
101
+ "current_code": final_code,
102
+ "execution_result": final_result,
103
+ "messages": [AIMessage(content="Solution finalized")],
104
+ "next": "__end__",
105
+ }
106
+
107
+ except Exception as e:
108
+ # On error, return current state
109
+ print(f"\n✗ Finalyzer error: {str(e)}")
110
+ print("Using current solution as final")
111
+ return {
112
+ "messages": [
113
+ AIMessage(content=f"Finalyzer error: {str(e)}, using current solution")
114
+ ],
115
+ "next": "__end__",
116
+ }
117
+
118
+
119
+ # Standalone test function
120
+ def test_finalyzer(
121
+ llm, query: str, data_descriptions: dict, current_code: str, execution_result: str
122
+ ):
123
+ """
124
+ Test the finalyzer agent independently.
125
+
126
+ Args:
127
+ llm: LLM instance
128
+ query: User query
129
+ data_descriptions: Dict of filename -> description
130
+ current_code: Working code to finalize
131
+ execution_result: Current execution result
132
+
133
+ Returns:
134
+ Dictionary with finalyzer results
135
+ """
136
+ # Create minimal test state
137
+ test_state = {
138
+ "llm": llm,
139
+ "query": query,
140
+ "data_descriptions": data_descriptions,
141
+ "plan": [],
142
+ "current_code": current_code,
143
+ "execution_result": execution_result,
144
+ "is_sufficient": True,
145
+ "router_decision": "",
146
+ "iteration": 0,
147
+ "max_iterations": 20,
148
+ "messages": [],
149
+ "next": "finalyzer",
150
+ }
151
+
152
+ result = finalyzer_node(test_state)
153
+
154
+ print("\n" + "=" * 60)
155
+ print("FINALYZER TEST RESULTS")
156
+ print("=" * 60)
157
+ print("Final Code:")
158
+ print(result.get("current_code", "No code"))
159
+ print("\nFinal Result:")
160
+ print(result.get("execution_result", "No result"))
161
+
162
+ return result
src/agents/planner_agent.py ADDED
@@ -0,0 +1,155 @@
1
+ """
2
+ Planner Agent: Generates the next plan step toward answering the query.
3
+
4
+ This agent generates ONE step at a time based on:
5
+ - The original query
6
+ - Available data files
7
+ - Previously completed steps (if any)
8
+ """
9
+
10
+ from langchain_core.messages import AIMessage
11
+
12
+ from ..utils.formatters import format_data_descriptions, format_plan, gemini_text
13
+ from ..utils.state import DSStarState, PlanStep
14
+
15
+
16
+ def planner_node(state: DSStarState) -> dict:
17
+ """
18
+ Planner Agent Node: Generates the next step in the plan.
19
+
20
+ On first call: Generates initial step to start answering the query
21
+ On subsequent calls: Generates next step based on progress so far
22
+
23
+ Args:
24
+ state: Current DSStarState
25
+
26
+ Returns:
27
+ Dictionary with updated state fields:
28
+ - plan: Updated plan with new step appended
29
+ - messages: Agent communication messages
30
+ - next: Next node to visit ("coder")
31
+ """
32
+ print("=" * 60)
33
+ print("PLANNER AGENT STARTING...")
34
+ print("=" * 60)
35
+
36
+ is_initial = len(state["plan"]) == 0
37
+ data_context = format_data_descriptions(state["data_descriptions"])
38
+
39
+ if is_initial:
40
+ print("Generating INITIAL plan step...")
41
+ prompt = f"""You are an expert data analyst.
42
+
43
+ Question to answer: {state["query"]}
44
+
45
+ Available Data Files:
46
+ {data_context}
47
+
48
+ Task: Generate ONE simple, executable step to start answering this question.
49
+ Examples of good steps:
50
+ - "Load the transactions.csv file"
51
+ - "Read and explore the sales data"
52
+
53
+ Provide ONLY the step description (one sentence) on a single line, with no bullet points and no explanation."""
54
+ else:
55
+ print(f"Generating NEXT step (current plan has {len(state['plan'])} steps)...")
56
+ plan_text = format_plan(state["plan"])
57
+
58
+ prompt = f"""You are an expert data analyst.
59
+
60
+ Question to answer: {state["query"]}
61
+
62
+ Available Data Files:
63
+ {data_context}
64
+
65
+ Current Plan (completed steps):
66
+ {plan_text}
67
+
68
+ Last Execution Result:
69
+ {state["execution_result"][:500]}...
70
+
71
+ Task: Suggest the NEXT step to progress toward answering the question.
72
+ Make it simple and executable (one clear action).
73
+
74
+ Provide ONLY the next step description (one sentence), no explanation."""
75
+
76
+ try:
77
+ # Get LLM response
78
+ response = state["llm"].invoke(prompt)
79
+
80
+ # Handle different response formats
81
+ if hasattr(response, "content") and isinstance(response.content, list):
82
+ response_text = gemini_text(response)
83
+ elif hasattr(response, "content"):
84
+ response_text = response.content
85
+ else:
86
+ response_text = str(response)
87
+
88
+ # Create new step
89
+ new_step = PlanStep(
90
+ step_number=len(state["plan"]), description=response_text.strip()
91
+ )
92
+
93
+ # Add new step to existing plan
94
+ updated_plan = state["plan"] + [new_step]
95
+
96
+ print(
97
+ f"\n✓ Generated step {new_step['step_number'] + 1}: {new_step['description']}"
98
+ )
99
+ print("=" * 60)
100
+
101
+ return {
102
+ "plan": updated_plan,
103
+ "messages": [
104
+ AIMessage(content=f"Added step {new_step['step_number'] + 1}")
105
+ ],
106
+ "next": "coder",
107
+ }
108
+
109
+ except Exception as e:
110
+ print(f"✗ Planner error: {str(e)}")
111
+ return {
112
+ "messages": [AIMessage(content=f"Planner error: {str(e)}")],
113
+ "next": "__end__",
114
+ }
115
+
116
+
117
+ # Standalone test function
118
+ def test_planner(llm, query: str, data_descriptions: dict, existing_plan: list = None):
119
+ """
120
+ Test the planner agent independently.
121
+
122
+ Args:
123
+ llm: LLM instance
124
+ query: User query
125
+ data_descriptions: Dict of filename -> description
126
+ existing_plan: Optional existing plan steps
127
+
128
+ Returns:
129
+ Dictionary with planner results
130
+ """
131
+ # Create minimal test state
132
+ test_state = {
133
+ "llm": llm,
134
+ "query": query,
135
+ "data_descriptions": data_descriptions,
136
+ "plan": existing_plan or [],
137
+ "current_code": "",
138
+ "execution_result": "",
139
+ "is_sufficient": False,
140
+ "router_decision": "",
141
+ "iteration": 0,
142
+ "max_iterations": 20,
143
+ "messages": [],
144
+ "next": "planner",
145
+ }
146
+
147
+ result = planner_node(test_state)
148
+
149
+ print("\n" + "=" * 60)
150
+ print("PLANNER TEST RESULTS")
151
+ print("=" * 60)
152
+ print(f"Updated Plan ({len(result.get('plan', []))} steps):")
153
+ print(format_plan(result.get("plan", [])))
154
+
155
+ return result
src/agents/router_agent.py ADDED
@@ -0,0 +1,220 @@
1
+ """
2
+ Router Agent: Decides how to improve an insufficient plan.
3
+
4
+ When the verifier determines the plan is insufficient, the router decides:
5
+ - "Add Step": Add a new step to the plan
6
+ - "Step N": Backtrack to step N and fix it
7
+ """
8
+
9
+ import re
10
+
11
+ from langchain_core.messages import AIMessage
12
+
13
+ from ..utils.formatters import format_data_descriptions, format_plan, gemini_text
14
+ from ..utils.state import DSStarState
15
+
16
+
17
+ def router_node(state: DSStarState) -> dict:
18
+ """
19
+ Router Agent Node: Decides how to improve the plan.
20
+
21
+ Analyzes the current situation and determines whether to:
22
+ 1. Add a new step to the plan
23
+ 2. Backtrack and fix an existing step
24
+
25
+ Args:
26
+ state: Current DSStarState
27
+
28
+ Returns:
29
+ Dictionary with updated state fields:
30
+ - router_decision: "Add Step" or "Step N"
31
+ - iteration: Incremented iteration count
32
+ - messages: Agent communication messages
33
+ - next: "planner" (add step) or "backtrack" (fix step)
34
+ """
35
+ print("=" * 60)
36
+ print("ROUTER AGENT STARTING...")
37
+ print("=" * 60)
38
+
39
+ data_context = format_data_descriptions(state["data_descriptions"])
40
+ plan_text = format_plan(state["plan"])
41
+
42
+ prompt = f"""You are an expert data analyst router.
43
+
44
+ The current plan is INSUFFICIENT to answer the question.
45
+
46
+ Original Question: {state["query"]}
47
+
48
+ Available Data:
49
+ {data_context}
50
+
51
+ Current Plan:
52
+ {plan_text}
53
+
54
+ Execution Result:
55
+ {state["execution_result"][:500]}
56
+
57
+ Task: Decide how to improve the plan:
58
+ 1. If a current step is WRONG or needs fixing: Answer "Step N" (where N is the step number, e.g., "Step 2")
59
+ 2. If we need to ADD a NEW step: Answer "Add Step"
60
+
61
+ Answer with ONLY: "Step 1", "Step 2", etc. OR "Add Step"
62
+ No explanation needed."""
63
+
64
+ try:
65
+ # Get LLM response
66
+ response = state["llm"].invoke(prompt)
67
+
68
+ # Handle different response formats
69
+ if hasattr(response, "content") and isinstance(response.content, list):
70
+ response_text = gemini_text(response)
71
+ elif hasattr(response, "content"):
72
+ response_text = response.content
73
+ else:
74
+ response_text = str(response)
75
+
76
+ # Parse decision
77
+ response_lower = response_text.strip().lower()
78
+ if "add step" in response_lower:
79
+ decision = "Add Step"
80
+ next_node = "planner"
81
+ else:
82
+ # Try to extract step number
83
+ match = re.search(r"step\s+(\d+)", response_lower)
84
+ if match:
85
+ decision = f"Step {match.group(1)}"
86
+ next_node = "backtrack"
87
+ else:
88
+ # Default to adding new step
89
+ decision = "Add Step"
90
+ next_node = "planner"
91
+
92
+ print(f"\nRouter Decision: {decision}")
93
+ print(
94
+ f"Next Action: {'Backtrack' if next_node == 'backtrack' else 'Add New Step'}"
95
+ )
96
+ print("=" * 60)
97
+
98
+ return {
99
+ "router_decision": decision,
100
+ "messages": [AIMessage(content=f"Router: {decision}")],
101
+ "iteration": state["iteration"] + 1,
102
+ "next": next_node,
103
+ }
104
+
105
+ except Exception as e:
106
+ # On error, default to adding new step
107
+ print(f"\n✗ Router error: {str(e)}")
108
+ print("Defaulting to 'Add Step'")
109
+ return {
110
+ "router_decision": "Add Step",
111
+ "messages": [AIMessage(content=f"Router error, adding step: {str(e)}")],
112
+ "iteration": state["iteration"] + 1,
113
+ "next": "planner",
114
+ }
115
+
116
+
117
+ def backtrack_node(state: DSStarState) -> dict:
118
+ """
119
+ Backtrack Node: Truncates plan to remove incorrect steps.
120
+
121
+ When router identifies a wrong step, this node:
122
+ 1. Parses the step number from router_decision
123
+ 2. Truncates the plan to remove that step and all subsequent steps
124
+ 3. Routes back to planner to regenerate from that point
125
+
126
+ Args:
127
+ state: Current DSStarState
128
+
129
+ Returns:
130
+ Dictionary with updated state fields:
131
+ - plan: Truncated plan
132
+ - messages: Agent communication messages
133
+ - next: "planner" to regenerate from truncation point
134
+ """
135
+ print("=" * 60)
136
+ print("BACKTRACK NODE ACTIVATING...")
137
+ print("=" * 60)
138
+
139
+ try:
140
+ # Extract step number from router decision
141
+ match = re.search(r"step\s+(\d+)", state["router_decision"].lower())
142
+ if match:
143
+ step_num = int(match.group(1))
144
+ else:
145
+ # If parsing fails, just add new step
146
+ print("Failed to parse step number, adding new step instead")
147
+ return {
148
+ "messages": [
149
+ AIMessage(content="Backtrack parsing failed, adding new step")
150
+ ],
151
+ "next": "planner",
152
+ }
153
+
154
+ # Truncate plan to steps before the wrong one
155
+ # Keep list indices 0 to (step_num - 2), i.e. steps 1 to (step_num - 1) in human counting
156
+ truncated_plan = state["plan"][: step_num - 1] if step_num > 1 else []
157
+
158
+ print(
159
+ f"Truncating plan from {len(state['plan'])} to {len(truncated_plan)} steps"
160
+ )
161
+ print(f"Removed step {step_num} and beyond")
162
+ print("=" * 60)
163
+
164
+ # Return the truncated plan (replaces entire plan, not appends)
165
+ return {
166
+ "plan": truncated_plan,
167
+ "messages": [AIMessage(content=f"Backtracked to step {step_num - 1}")],
168
+ "next": "planner",
169
+ }
170
+
171
+ except Exception as e:
172
+ print(f"✗ Backtrack error: {str(e)}")
173
+ return {
174
+ "messages": [AIMessage(content=f"Backtrack error: {str(e)}, continuing")],
175
+ "next": "planner",
176
+ }
177
+
178
+
179
+ # Standalone test function
180
+ def test_router(
181
+ llm, query: str, data_descriptions: dict, plan: list, execution_result: str
182
+ ):
183
+ """
184
+ Test the router agent independently.
185
+
186
+ Args:
187
+ llm: LLM instance
188
+ query: User query
189
+ data_descriptions: Dict of filename -> description
190
+ plan: Current plan steps
191
+ execution_result: Result from code execution
192
+
193
+ Returns:
194
+ Dictionary with router results
195
+ """
196
+ # Create minimal test state
197
+ test_state = {
198
+ "llm": llm,
199
+ "query": query,
200
+ "data_descriptions": data_descriptions,
201
+ "plan": plan,
202
+ "current_code": "",
203
+ "execution_result": execution_result,
204
+ "is_sufficient": False,
205
+ "router_decision": "",
206
+ "iteration": 0,
207
+ "max_iterations": 20,
208
+ "messages": [],
209
+ "next": "router",
210
+ }
211
+
212
+ result = router_node(test_state)
213
+
214
+ print("\n" + "=" * 60)
215
+ print("ROUTER TEST RESULTS")
216
+ print("=" * 60)
217
+ print(f"Decision: {result.get('router_decision', 'unknown')}")
218
+ print(f"Next Node: {result.get('next', 'unknown')}")
219
+
220
+ return result
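The decision parsing in `router_node` and the truncation in `backtrack_node` can be exercised without an LLM once they are pulled out as pure functions. The sketch below mirrors the logic shown above under that assumption. `parse_router_decision` and `truncate_plan` are hypothetical names, and plan steps are represented as plain dicts for illustration.

```python
import re
from typing import List, Tuple


def parse_router_decision(response_text: str) -> Tuple[str, str]:
    """Map raw router output to (decision, next_node)."""
    text = response_text.strip().lower()
    if "add step" in text:
        return "Add Step", "planner"
    match = re.search(r"step\s+(\d+)", text)
    if match:
        return f"Step {match.group(1)}", "backtrack"
    # Unparseable output defaults to adding a new step.
    return "Add Step", "planner"


def truncate_plan(plan: List[dict], decision: str) -> List[dict]:
    """Drop the step named in `decision` (1-based) and everything after it."""
    match = re.search(r"step\s+(\d+)", decision.lower())
    if not match:
        # No step number: leave the plan unchanged.
        return plan
    step_num = int(match.group(1))
    return plan[: step_num - 1] if step_num > 1 else []
```

For a three-step plan, a decision of "Step 2" keeps only the first step, so the planner regenerates from step 2 onward.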
src/agents/verifier_agent.py ADDED
@@ -0,0 +1,143 @@
1
+ """
2
+ Verifier Agent: Checks if the current plan and implementation sufficiently answer the query.
3
+
4
+ This agent evaluates whether the work done so far is enough to answer the original question.
5
+ """
6
+
7
+ from langchain_core.messages import AIMessage
8
+
9
+ from ..utils.formatters import format_plan, gemini_text
10
+ from ..utils.state import DSStarState
11
+
12
+
13
+ def verifier_node(state: DSStarState) -> dict:
14
+ """
15
+ Verifier Agent Node: Determines if plan sufficiently answers the query.
16
+
17
+ Analyzes:
18
+ - Original query
19
+ - Current plan
20
+ - Code implementation
21
+ - Execution results
22
+
23
+ Args:
24
+ state: Current DSStarState
25
+
26
+ Returns:
27
+ Dictionary with updated state fields:
28
+ - is_sufficient: Boolean indicating if work is complete
29
+ - messages: Agent communication messages
30
+ - next: "finalyzer" if sufficient, "router" if not
31
+ """
32
+ print("=" * 60)
33
+ print("VERIFIER AGENT STARTING...")
34
+ print("=" * 60)
35
+
36
+ plan_text = format_plan(state["plan"])
37
+
38
+ prompt = f"""You are an expert data analyst verifier.
39
+
40
+ Original Question: {state["query"]}
41
+
42
+ Current Plan:
43
+ {plan_text}
44
+
45
+ Implementation Code:
46
+ {state["current_code"]}
47
+
48
+ Execution Result:
49
+ {state["execution_result"][:1000]}
50
+
51
+ Task: Verify if this plan and implementation are SUFFICIENT to fully answer the question.
52
+
53
+ Consider:
54
+ - Does the plan address all aspects of the question?
55
+ - Does the execution result contain the answer?
56
+ - Is any additional analysis needed?
57
+
58
+ Answer with ONLY one word: "Yes" or "No"
59
+ - "Yes" if sufficient to answer the question
60
+ - "No" if more analysis is needed"""
61
+
62
+ try:
63
+ # Get LLM response
64
+ response = state["llm"].invoke(prompt)
65
+
66
+ # Handle different response formats
67
+ if hasattr(response, "content") and isinstance(response.content, list):
68
+ response_text = gemini_text(response)
69
+ elif hasattr(response, "content"):
70
+ response_text = response.content
71
+ else:
72
+ response_text = str(response)
73
+
74
+ response_lower = response_text.strip().lower()
75
+ is_sufficient = "yes" in response_lower
76
+
77
+ status = "SUFFICIENT ✓" if is_sufficient else "INSUFFICIENT ✗"
78
+ print(f"\nVerification Result: {status}")
79
+ print("=" * 60)
80
+
81
+ next_node = "finalyzer" if is_sufficient else "router"
82
+
83
+ return {
84
+ "is_sufficient": is_sufficient,
85
+ "messages": [
86
+ AIMessage(
87
+ content=f"Verification: {'Sufficient' if is_sufficient else 'Insufficient'}"
88
+ )
89
+ ],
90
+ "next": next_node,
91
+ }
92
+
93
+ except Exception as e:
94
+ # On error, assume insufficient and continue
95
+ print(f"\n✗ Verifier error: {str(e)}")
96
+ print("Defaulting to insufficient, continuing...")
97
+ return {
98
+ "is_sufficient": False,
99
+ "messages": [AIMessage(content=f"Verifier error: {str(e)}, continuing...")],
100
+ "next": "router",
101
+ }
102
+
103
+
104
+ # Standalone test function
105
+ def test_verifier(llm, query: str, plan: list, code: str, execution_result: str):
106
+ """
107
+ Test the verifier agent independently.
108
+
109
+ Args:
110
+ llm: LLM instance
111
+ query: User query
112
+ plan: List of plan steps
113
+ code: Generated code
114
+ execution_result: Result from code execution
115
+
116
+ Returns:
117
+ Dictionary with verifier results
118
+ """
119
+ # Create minimal test state
120
+ test_state = {
121
+ "llm": llm,
122
+ "query": query,
123
+ "data_descriptions": {},
124
+ "plan": plan,
125
+ "current_code": code,
126
+ "execution_result": execution_result,
127
+ "is_sufficient": False,
128
+ "router_decision": "",
129
+ "iteration": 0,
130
+ "max_iterations": 20,
131
+ "messages": [],
132
+ "next": "verifier",
133
+ }
134
+
135
+ result = verifier_node(test_state)
136
+
137
+ print("\n" + "=" * 60)
138
+ print("VERIFIER TEST RESULTS")
139
+ print("=" * 60)
140
+ print(f"Is Sufficient: {result.get('is_sufficient', False)}")
141
+ print(f"Next Node: {result.get('next', 'unknown')}")
142
+
143
+ return result
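The verifier treats any response containing the substring "yes" as sufficient, so a reply such as "No. Yes would require more data" or one mentioning "yesterday" would also count. A stricter variant, sketched below as an assumption rather than the author's method, takes the first whole-word "yes" or "no" and treats everything else as insufficient. `parse_verdict` is a hypothetical name.

```python
import re


def parse_verdict(response_text: str) -> bool:
    """Return True only if the first whole-word yes/no in the reply is 'yes'."""
    match = re.search(r"\b(yes|no)\b", response_text.strip().lower())
    # No match at all is treated as insufficient, matching the error fallback above.
    return bool(match) and match.group(1) == "yes"
```

Defaulting to `False` keeps the workflow iterating through the router instead of finalizing on an ambiguous answer.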
src/config/__init__.py ADDED
@@ -0,0 +1,5 @@
1
+ """Configuration package."""
2
+
3
+ from .llm_config import DEFAULT_CONFIG, get_llm
4
+
5
+ __all__ = ["get_llm", "DEFAULT_CONFIG"]
src/config/llm_config.py ADDED
@@ -0,0 +1,90 @@
1
+ """
2
+ Configuration for DS-STAR system.
3
+ Centralizes LLM setup and system parameters.
4
+ """
5
+
6
+ import os
7
+ from typing import Optional
8
+
9
+ from dotenv import load_dotenv
10
+
11
+ load_dotenv()
12
+
13
+
14
+ def get_llm(
15
+ provider: str = "google",
16
+ model: Optional[str] = None,
17
+ api_key: Optional[str] = None,
18
+ temperature: float = 0,
19
+ base_url: Optional[str] = None,
20
+ ):
21
+ """
22
+ Get configured LLM instance.
23
+
24
+ Args:
25
+ provider: LLM provider ("google", "openai", "anthropic")
26
+ model: Model name (uses default if not specified)
27
+ api_key: API key (uses environment variable if not specified)
28
+ temperature: Temperature for generation
29
+ base_url: Custom base URL for OpenAI-compatible APIs
30
+
31
+ Returns:
32
+ Configured LLM instance
33
+ """
34
+ if provider == "google":
35
+ from langchain_google_genai import ChatGoogleGenerativeAI
36
+
37
+ default_model = "gemini-flash-latest"
38
+ api_key = api_key or os.getenv("GOOGLE_API_KEY", "")
39
+
40
+ return ChatGoogleGenerativeAI(
41
+ model=model or default_model,
42
+ temperature=temperature,
43
+ google_api_key=api_key,
44
+ )
45
+
46
+ elif provider == "openai":
47
+ from langchain_openai import ChatOpenAI
48
+
49
+ default_model = "gpt-4"
50
+ api_key = api_key or os.getenv("OPENAI_API_KEY")
51
+
52
+ # Use provided base_url, then env var, then default
53
+ effective_base_url = base_url or os.getenv(
54
+ "LLM_BASE_URL",
55
+ "https://api.openai.com/v1",
56
+ )
57
+
58
+ return ChatOpenAI(
59
+ model=model or default_model,
60
+ temperature=temperature,
61
+ api_key=api_key,
62
+ base_url=effective_base_url,
63
+ )
64
+
65
+ elif provider == "anthropic":
66
+ from langchain_anthropic import ChatAnthropic
67
+
68
+ default_model = "claude-3-5-sonnet-20241022"
69
+ api_key = api_key or os.getenv("ANTHROPIC_API_KEY")
70
+
71
+ return ChatAnthropic(
72
+ model=model or default_model,
73
+ temperature=temperature,
74
+ api_key=api_key,
75
+ )
76
+
77
+ else:
78
+ raise ValueError(
79
+ f"Unknown provider: {provider}. Choose from: google, openai, anthropic"
80
+ )
81
+
82
+
83
+ # Default configuration
84
+ DEFAULT_CONFIG = {
85
+ "max_iterations": 20,
86
+ "provider": "openai",
87
+ "model": "google/gemini-2.5-flash",
88
+ "temperature": 0,
89
+ "data_dir": "data/",
90
+ }
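In the `openai` branch of `get_llm`, the base URL resolves in order: explicit argument, then `LLM_BASE_URL`, then the public OpenAI endpoint. One detail worth noting is that `os.getenv` returns an empty string, not the default, when the variable is set but blank (as in a `.env` copied unchanged from `.env.example`). The sketch below shows a variant that treats a blank value as unset. `resolve_base_url` and `DEFAULT_OPENAI_BASE_URL` are hypothetical names introduced for illustration.

```python
import os
from typing import Optional

DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(base_url: Optional[str] = None) -> str:
    """Pick the base URL: explicit argument, then LLM_BASE_URL, then the default."""
    # `or` chaining skips both None and empty strings, so a blank
    # LLM_BASE_URL falls through to the default endpoint.
    return base_url or os.getenv("LLM_BASE_URL") or DEFAULT_OPENAI_BASE_URL
```

Whether the blank case matters in practice depends on how the environment file is filled in; the ordering itself is unchanged from the code above.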
src/graph.py ADDED
@@ -0,0 +1,232 @@
+ """
+ DS-STAR Graph: Connects all agents into a workflow.
+
+ This module builds the LangGraph StateGraph that orchestrates the multi-agent system.
+ """
+
+ from typing import Literal
+
+ from langgraph.checkpoint.memory import MemorySaver
+ from langgraph.graph import END, StateGraph
+
+ from .agents.analyzer_agent import analyzer_node
+ from .agents.coder_agent import coder_node
+ from .agents.finalyzer_agent import finalyzer_node
+ from .agents.planner_agent import planner_node
+ from .agents.router_agent import backtrack_node, router_node
+ from .agents.verifier_agent import verifier_node
+ from .utils.state import DSStarState
+
+ # ==================== CONDITIONAL ROUTING FUNCTIONS ====================
+
+
+ def route_after_analyzer(state: DSStarState) -> Literal["planner", "__end__"]:
+     """
+     Route after analyzer based on success.
+
+     If analyzer found errors, end workflow.
+     Otherwise, proceed to planner.
+     """
+     if "error" in state.get("data_descriptions", {}):
+         return "__end__"
+     return state.get("next", "planner")
+
+
+ def route_after_planner(state: DSStarState) -> Literal["coder", "__end__"]:
+     """Route after planner to coder."""
+     return state.get("next", "coder")
+
+
+ def route_after_coder(state: DSStarState) -> Literal["verifier", "__end__"]:
+     """Route after coder to verifier."""
+     return state.get("next", "verifier")
+
+
+ def route_after_verifier(
+     state: DSStarState,
+ ) -> Literal["router", "finalyzer", "__end__"]:
+     """
+     Route after verifier based on sufficiency and iteration count.
+
+     If max iterations reached, go to finalyzer.
+     If sufficient, go to finalyzer.
+     Otherwise, go to router to decide next action.
+     """
+     # Check max iterations
+     if state["iteration"] >= state["max_iterations"]:
+         print(f"\n⚠ Max iterations ({state['max_iterations']}) reached, finalizing...")
+         return "finalyzer"
+
+     return state.get("next", "router")
+
+
+ def route_after_router(state: DSStarState) -> Literal["planner", "backtrack"]:
+     """
+     Route after router based on decision.
+
+     If router says "Add Step", go to planner.
+     If router says "Step N", go to backtrack.
+     """
+     return state.get("next", "planner")
+
+
+ # ==================== GRAPH BUILDER ====================
+
+
+ def build_ds_star_graph(llm, max_iterations: int = 20):
+     """
+     Constructs the LangGraph workflow for DS-STAR.
+
+     The workflow follows this pattern:
+     1. Analyzer: Analyze data files (runs once)
+     2. Planner: Generate next plan step
+     3. Coder: Implement plan as code
+     4. Verifier: Check if sufficient
+     5. If insufficient:
+        a. Router: Decide to add step or backtrack
+        b. Backtrack (optional): Remove wrong steps
+        c. Go back to Planner
+     6. If sufficient:
+        Finalyzer: Create polished final solution
+
+     Args:
+         llm: LLM instance (e.g., ChatOpenAI, ChatGoogleGenerativeAI)
+         max_iterations: Maximum refinement iterations (default: 20)
+
+     Returns:
+         Compiled LangGraph application with checkpointing
+     """
+     # Initialize graph with state schema
+     workflow = StateGraph(DSStarState)
+
+     # Add all agent nodes
+     workflow.add_node("analyzer", analyzer_node)
+     workflow.add_node("planner", planner_node)
+     workflow.add_node("coder", coder_node)
+     workflow.add_node("verifier", verifier_node)
+     workflow.add_node("router", router_node)
+     workflow.add_node("backtrack", backtrack_node)
+     workflow.add_node("finalyzer", finalyzer_node)
+
+     # Set entry point
+     workflow.set_entry_point("analyzer")
+
+     # Add conditional edges with proper routing
+     workflow.add_conditional_edges(
+         "analyzer", route_after_analyzer, {"planner": "planner", "__end__": END}
+     )
+
+     workflow.add_conditional_edges(
+         "planner", route_after_planner, {"coder": "coder", "__end__": END}
+     )
+
+     workflow.add_conditional_edges(
+         "coder", route_after_coder, {"verifier": "verifier", "__end__": END}
+     )
+
+     workflow.add_conditional_edges(
+         "verifier",
+         route_after_verifier,
+         {"router": "router", "finalyzer": "finalyzer", "__end__": END},
+     )
+
+     workflow.add_conditional_edges(
+         "router", route_after_router, {"planner": "planner", "backtrack": "backtrack"}
+     )
+
+     workflow.add_edge("backtrack", "planner")
+     workflow.add_edge("finalyzer", END)
+
+     # Add memory/checkpointing
+     memory = MemorySaver()
+
+     # Compile graph
+     app = workflow.compile(checkpointer=memory)
+
+     return app
+
+
+ def create_initial_state(query: str, llm, max_iterations: int = 20) -> DSStarState:
+     """
+     Create initial state for the DS-STAR workflow.
+
+     Args:
+         query: User's question to answer
+         llm: LLM instance
+         max_iterations: Maximum refinement iterations
+
+     Returns:
+         Initial DSStarState dictionary
+     """
+     return {
+         "query": query,
+         "data_descriptions": {},
+         "plan": [],
+         "current_code": "",
+         "execution_result": "",
+         "is_sufficient": False,
+         "router_decision": "",
+         "iteration": 0,
+         "max_iterations": max_iterations,
+         "messages": [],
+         "next": "analyzer",
+         "llm": llm,
+     }
+
+
+ def run_ds_star(
+     query: str, llm, max_iterations: int = 20, thread_id: str = "ds-star-1"
+ ):
+     """
+     Run the complete DS-STAR workflow.
+
+     Args:
+         query: User's question to answer
+         llm: LLM instance
+         max_iterations: Maximum refinement iterations
+         thread_id: Unique thread ID for checkpointing
+
+     Returns:
+         Final state after workflow completion
+     """
+     print("=" * 60)
+     print("DS-STAR MULTI-AGENT SYSTEM")
+     print("=" * 60)
+     print(f"Query: {query}")
+     print(f"Max Iterations: {max_iterations}")
+     print("=" * 60)
+
+     # Build graph
+     app = build_ds_star_graph(llm, max_iterations)
+
+     # Create initial state
+     initial_state = create_initial_state(query, llm, max_iterations)
+
+     # Run with checkpointing
+     config = {"configurable": {"thread_id": thread_id}}
+
+     try:
+         # Execute the workflow
+         final_state = app.invoke(initial_state, config)
+
+         # Display results
+         print("\n" + "=" * 60)
+         print("FINAL SOLUTION")
+         print("=" * 60)
+         print("\nGenerated Code:")
+         print("-" * 60)
+         print(final_state["current_code"])
+         print("\n" + "-" * 60)
+         print("Execution Result:")
+         print("-" * 60)
+         print(final_state["execution_result"])
+         print("=" * 60)
+
+         return final_state
+
+     except Exception as e:
+         print(f"\n✗ Error during execution: {str(e)}")
+         import traceback
+
+         traceback.print_exc()
+         return None
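The control flow wired up in `build_ds_star_graph` can be traced without LangGraph. The sketch below is a toy walker over plain dictionaries, not the project's implementation: the stub nodes, the `walk` and `initial` helpers, and the assumption that the verifier increments `iteration` and sets `next` are all hypothetical. Only `route_after_verifier` mirrors the routing rule shown in this file.

```python
END = "__end__"


def route_after_verifier(state):
    # Same rule as in graph.py: the iteration cap wins over the node's own choice
    if state["iteration"] >= state["max_iterations"]:
        return "finalyzer"
    return state.get("next", "router")


def follow_next(state):
    # Default router: go wherever the last node asked to go
    return state["next"]


# Stub nodes (assumed behaviour): each returns a partial state update
def analyzer(state):
    return {"next": "planner"}


def planner(state):
    step = f"step {len(state['plan']) + 1}"
    return {"plan": state["plan"] + [step], "next": "coder"}


def coder(state):
    return {"next": "verifier"}


def verifier(state):
    sufficient = len(state["plan"]) >= 2  # toy criterion: two steps are enough
    return {
        "iteration": state["iteration"] + 1,
        "is_sufficient": sufficient,
        "next": "finalyzer" if sufficient else "router",
    }


def router(state):
    return {"next": "planner"}  # always "Add Step" in this toy


def finalyzer(state):
    return {"next": END}


NODES = {
    "analyzer": analyzer,
    "planner": planner,
    "coder": coder,
    "verifier": verifier,
    "router": router,
    "finalyzer": finalyzer,
}
ROUTERS = {"verifier": route_after_verifier}


def initial(max_iterations):
    return {"plan": [], "iteration": 0, "max_iterations": max_iterations, "next": "analyzer"}


def walk(state, start="analyzer"):
    """Run nodes until END and return the order in which they were visited."""
    state = dict(state)
    visited = []
    current = start
    while current != END:
        visited.append(current)
        state.update(NODES[current](state))
        current = ROUTERS.get(current, follow_next)(state)
    return visited


if __name__ == "__main__":
    print(walk(initial(5)))
    print(walk(initial(1)))
```

With a cap of 1 the walk goes to `finalyzer` after the first verification even though the stub verifier asked for the router, which is the behaviour the max-iterations guard is meant to give.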
src/utils/__init__.py ADDED
@@ -0,0 +1,16 @@
+ """Utility modules for DS-STAR system."""
+
+ from .code_execution import execute_code_safely, execute_with_debug
+ from .formatters import extract_code, format_data_descriptions, format_plan, gemini_text
+ from .state import DSStarState, PlanStep
+
+ __all__ = [
+     "DSStarState",
+     "PlanStep",
+     "extract_code",
+     "format_data_descriptions",
+     "format_plan",
+     "gemini_text",
+     "execute_code_safely",
+     "execute_with_debug",
+ ]
src/utils/code_execution.py ADDED
@@ -0,0 +1,129 @@
+ """
+ Code execution utilities with debugging capabilities.
+ """
+
+ import os
+ import subprocess
+ import sys
+ import tempfile
+ from typing import Tuple
+
+
+ def execute_code_safely(code: str, timeout: int = 30) -> Tuple[bool, str, str]:
+     """
+     Execute Python code in a separate subprocess and capture its output.
+
+     The subprocess is limited by a timeout but is not sandboxed: the code
+     runs with the same permissions as the calling process.
+
+     Args:
+         code: Python code to execute
+         timeout: Maximum execution time in seconds
+
+     Returns:
+         Tuple of (success: bool, stdout: str, stderr: str)
+     """
+     temp_file = None
+
+     try:
+         # Write code to a uniquely named temporary file in the working
+         # directory so that concurrent runs do not overwrite each other
+         fd, temp_file = tempfile.mkstemp(suffix=".py", dir=".")
+         with os.fdopen(fd, "w", encoding="utf-8") as f:
+             f.write(code)
+
+         # Execute with subprocess
+         result = subprocess.run(
+             [sys.executable, temp_file], capture_output=True, text=True, timeout=timeout
+         )
+
+         success = result.returncode == 0
+         return success, result.stdout, result.stderr
+
+     except subprocess.TimeoutExpired:
+         return False, "", f"Execution timed out after {timeout} seconds"
+     except Exception as e:
+         return False, "", f"Execution error: {str(e)}"
+     finally:
+         # Clean up temp file
+         if temp_file and os.path.exists(temp_file):
+             try:
+                 os.remove(temp_file)
+             except Exception as e:
+                 print(f"Warning: Failed to remove temp file: {str(e)}")
+
+
+ def execute_with_debug(
+     code: str, llm, is_analysis: bool, data_context: str = "", max_attempts: int = 3
+ ) -> str:
+     """
+     Execute code with automatic debugging via LLM.
+
+     If execution fails, the LLM is asked to fix the error.
+     The code is executed at most max_attempts times.
+
+     Args:
+         code: Python code to execute
+         llm: LLM instance for debugging
+         is_analysis: Whether this is data analysis stage (simpler prompts)
+         data_context: Context about available data files
+         max_attempts: Maximum number of execution attempts
+
+     Returns:
+         Execution output or error message
+     """
+     from .formatters import extract_code, gemini_text
+
+     stderr = ""
+
+     for attempt in range(max_attempts):
+         success, stdout, stderr = execute_code_safely(code)
+
+         if success:
+             return stdout if stdout else "Code executed successfully (no output)"
+
+         # A fix produced after the last attempt would never be executed
+         if attempt == max_attempts - 1:
+             break
+
+         # Debug the error
+         print(f"  Debug attempt {attempt + 1}/{max_attempts}")
+
+         if is_analysis:
+             debug_prompt = f"""Fix this Python code error:
+ {code}
+
+ Error:
+ {stderr}
+
+ Requirements:
+ - Fix the error
+ - Keep the same functionality
+ - No try-except blocks
+ - All files are in 'data/' directory
+ - Provide ONLY the corrected code in a markdown code block"""
+         else:
+             debug_prompt = f"""Fix this Python code error:
+
+ Available Data:
+ {data_context}
+
+ Code with error:
+ {code}
+
+ Error:
+ {stderr}
+
+ Requirements:
+ - Fix the error using data context
+ - Keep the same functionality
+ - No try-except blocks
+ - Provide ONLY the corrected code in a markdown code block"""
+
+         try:
+             response = llm.invoke(debug_prompt)
+
+             # Handle Gemini response format
+             if hasattr(response, "content") and isinstance(response.content, list):
+                 # Gemini returns a list of content parts
+                 response_text = gemini_text(response)
+             elif hasattr(response, "content"):
+                 response_text = response.content
+             else:
+                 response_text = str(response)
+
+             code = extract_code(response_text)
+         except Exception as e:
+             return f"Debugging failed: {str(e)}"
+
+     return f"Failed after {max_attempts} attempts. Last error:\n{stderr}"
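The run-then-repair loop in `execute_with_debug` can be exercised without an LLM. The following is a minimal sketch under stated assumptions: `run_snippet`, `run_with_repair`, and `toy_fixer` are hypothetical names, the snippet is passed with `python -c` instead of a temp file, and a plain callable stands in for the model call.

```python
import subprocess
import sys


def run_snippet(code, timeout=30):
    """Run code in a child interpreter; return (success, stdout, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, timeout=timeout
    )
    return result.returncode == 0, result.stdout, result.stderr


def run_with_repair(code, fixer, max_attempts=3):
    """Execute code, asking `fixer` for a corrected version after each failure."""
    stderr = ""
    for attempt in range(max_attempts):
        ok, stdout, stderr = run_snippet(code)
        if ok:
            return stdout
        # Only request a fix if there is still an attempt left to run it
        if attempt < max_attempts - 1:
            code = fixer(code, stderr)
    return f"Failed after {max_attempts} attempts. Last error:\n{stderr}"


def toy_fixer(code, stderr):
    # Stand-in for the LLM: repairs one known typo and ignores the traceback
    return code.replace("pritn", "print")


if __name__ == "__main__":
    print(run_with_repair("pritn('ok')", toy_fixer))
```

Injecting the fixer as a callable keeps the loop testable; in the real module that role is played by `llm.invoke` followed by `extract_code`.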
src/utils/formatters.py ADDED
@@ -0,0 +1,89 @@
+ """
+ Utility functions for formatting and extracting data.
+ """
+
+
+ def gemini_text(res) -> str:
+     """
+     Extract text from Gemini API response.
+
+     Args:
+         res: Gemini API response object whose content is a list of parts
+
+     Returns:
+         Concatenated text from all text parts
+     """
+     # Message content lists may mix plain strings and typed dict parts
+     return "".join(
+         part if isinstance(part, str) else part["text"]
+         for part in res.content
+         if isinstance(part, str) or part.get("type") == "text"
+     )
+
+
+ def extract_code(response: str) -> str:
+     """
+     Extract code from markdown code blocks.
+     Handles ```python and ``` formats.
+
+     Args:
+         response: Text response potentially containing code blocks
+
+     Returns:
+         Extracted code string
+     """
+     # Try to find python code block first
+     if "```python" in response:
+         parts = response.split("```python", 1)
+         if len(parts) > 1:
+             code_part = parts[1].split("```", 1)
+             if len(code_part) > 0:
+                 return code_part[0].strip()
+
+     # Try generic code block
+     elif "```" in response:
+         parts = response.split("```", 1)
+         if len(parts) > 1:
+             code_part = parts[1].split("```", 1)
+             if len(code_part) > 0:
+                 # Remove language identifier if present
+                 code = code_part[0].strip()
+                 # Remove first line if it's a language identifier
+                 lines = code.split("\n")
+                 if lines and lines[0].strip() in ["python", "py", "python3"]:
+                     return "\n".join(lines[1:]).strip()
+                 return code
+
+     # If no code blocks found, return as is
+     return response.strip()
+
+
+ def format_data_descriptions(descriptions: dict) -> str:
+     """
+     Format data descriptions dictionary into readable string.
+
+     Args:
+         descriptions: Dict mapping filename to description
+
+     Returns:
+         Formatted string with file descriptions
+     """
+     if not descriptions:
+         return "No data files analyzed yet."
+
+     formatted_parts = []
+     for filename, description in descriptions.items():
+         formatted_parts.append(f"## File: {filename}\n{description}\n")
+
+     return "\n".join(formatted_parts)
+
+
+ def format_plan(plan: list) -> str:
+     """
+     Format plan steps into readable string.
+
+     Args:
+         plan: List of PlanStep dictionaries
+
+     Returns:
+         Formatted plan string
+     """
+     if not plan:
+         return "No plan steps yet."
+
+     return "\n".join([f"{i + 1}. {step['description']}" for i, step in enumerate(plan)])
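For comparison with the split-based `extract_code`, here is a regex-based variant. It is a swapped-in technique, not the code in this commit: `extract_first_code_block` is a hypothetical name, and unlike `extract_code` it returns the first fenced block regardless of language rather than preferring a `python` block.

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests cleanly inside a fence

# Opening fence, optional language tag, newline, lazily matched body, closing fence
_BLOCK = re.compile(
    FENCE + r"[ \t]*([\w+-]*)[ \t]*\n(.*?)" + FENCE,
    re.DOTALL,
)


def extract_first_code_block(text):
    """Return the body of the first fenced block, or the stripped text if none."""
    match = _BLOCK.search(text)
    if match is None:
        return text.strip()
    return match.group(2).strip()


if __name__ == "__main__":
    reply = "Here is the fix:\n" + FENCE + "python\nprint(1)\n" + FENCE + "\nDone."
    print(extract_first_code_block(reply))
```

Because the language tag is captured by the pattern, no separate step is needed to drop a leading `python` line.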
src/utils/state.py ADDED
@@ -0,0 +1,56 @@
+ """
+ State schema for DS-STAR multi-agent system.
+ Defines the centralized state structure shared across all agents.
+ """
+
+ import operator
+ from typing import Annotated, List, TypedDict
+
+ from langchain_core.messages import AIMessage, HumanMessage
+
+
+ class PlanStep(TypedDict):
+     """Individual plan step with number and description"""
+
+     step_number: int
+     description: str
+
+
+ class DSStarState(TypedDict):
+     """
+     Centralized state for DS-STAR pipeline.
+
+     This state is passed between all agents in the graph.
+     Uses reducer pattern for message accumulation.
+     """
+
+     # User input
+     query: str
+
+     # Data file descriptions (from Stage 1 - Analyzer)
+     data_descriptions: dict[str, str]
+
+     # Current plan - list of completed steps
+     # NO REDUCER - we need full control for backtracking
+     plan: List[PlanStep]
+
+     # Code and execution
+     current_code: str
+     execution_result: str
+
+     # Verification and routing
+     is_sufficient: bool
+     router_decision: str  # "Add Step" or "Step N"
+
+     # Iteration tracking
+     iteration: int
+     max_iterations: int
+
+     # Messages for agent communication (accumulated with reducer)
+     messages: Annotated[List[HumanMessage | AIMessage], operator.add]
+
+     # Next node routing (for internal control flow)
+     next: str
+
+     # LLM instance (shared across all agents)
+     llm: object
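The comments in `DSStarState` distinguish a field with a reducer (`messages`) from one without (`plan`). The sketch below illustrates that difference with the standard library only. It is a simplified imitation of reducer semantics, not LangGraph's implementation, and `ToyState` and `apply_update` are hypothetical names.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints


class ToyState(TypedDict):
    plan: List[str]                                # no reducer: updates overwrite
    messages: Annotated[List[str], operator.add]   # reducer: updates are appended


def apply_update(state, update, schema=ToyState):
    """Merge a node's partial update into the state, honouring annotated reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types expose their extra arguments as __metadata__
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:
            reducer = metadata[0]
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    state = {"plan": ["load data", "wrong step"], "messages": ["planned"]}
    update = {"plan": ["load data"], "messages": ["backtracked"]}
    print(apply_update(state, update))
```

Overwriting `plan` is what lets a backtrack step return a shorter list; with an additive reducer the removed step could never be dropped.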
tests/test_complete_workflow.py ADDED
@@ -0,0 +1,112 @@
+ """
+ Comprehensive test for DS-STAR workflow.
+
+ This test runs the complete multi-agent system to verify:
+ 1. All agents are properly connected
+ 2. The graph routing works correctly
+ 3. The workflow can complete successfully
+ """
+
+ import os
+ import sys
+
+ from dotenv import load_dotenv
+
+ load_dotenv()
+ LLM_MODEL = os.getenv("LLM_MODEL", "google/gemini-2.5-flash")
+ LLM_API_KEY = os.getenv("LLM_API_KEY", "")
+
+ # Add parent directory to path
+ sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
+
+ from src.config import get_llm
+ from src.graph import run_ds_star
+
+
+ def test_complete_workflow():
+     """Test the complete DS-STAR workflow."""
+     print("=" * 60)
+     print("COMPREHENSIVE DS-STAR WORKFLOW TEST")
+     print("=" * 60)
+
+     # Configuration
+     query = "What percentage of transactions use credit cards?"
+     max_iterations = 10  # Reduced for testing
+
+     print(f"\nTest Query: {query}")
+     print(f"Max Iterations: {max_iterations}")
+     print()
+
+     try:
+         # Initialize LLM
+         print(f"Initializing LLM ({LLM_MODEL})...")
+         llm = get_llm(
+             provider="openai",
+             model=LLM_MODEL,
+             temperature=0,
+             api_key=LLM_API_KEY,
+         )
+
+         print("✓ LLM initialized")
+         print()
+
+         # Run workflow
+         print("Starting DS-STAR workflow...")
+         print("=" * 60)
+
+         final_state = run_ds_star(
+             query=query,
+             llm=llm,
+             max_iterations=max_iterations,
+             thread_id="test-session",
+         )
+
+         # Verify results
+         print("\n" + "=" * 60)
+         print("TEST VERIFICATION")
+         print("=" * 60)
+
+         if final_state is None:
+             print("❌ FAILED: Workflow returned None")
+             return False
+
+         # Check that we got results
+         checks = [
+             ("Data descriptions", len(final_state.get("data_descriptions", {})) > 0),
+             ("Plan generated", len(final_state.get("plan", [])) > 0),
+             ("Code generated", len(final_state.get("current_code", "")) > 0),
+             ("Execution result", len(final_state.get("execution_result", "")) > 0),
+         ]
+
+         all_passed = True
+         for check_name, passed in checks:
+             status = "✓" if passed else "✗"
+             print(f"{status} {check_name}: {'PASS' if passed else 'FAIL'}")
+             all_passed = all_passed and passed
+
+         print("\n" + "=" * 60)
+         if all_passed:
+             print("✅ ALL TESTS PASSED")
+             print("=" * 60)
+             return True
+         else:
+             print("❌ SOME TESTS FAILED")
+             print("=" * 60)
+             return False
+
+     except Exception as e:
+         print(f"\n❌ TEST FAILED WITH EXCEPTION: {str(e)}")
+         import traceback
+
+         traceback.print_exc()
+         return False
+
+
+ def main():
+     """Run the test."""
+     success = test_complete_workflow()
+     return 0 if success else 1
+
+
+ if __name__ == "__main__":
+     sys.exit(main())
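`test_complete_workflow` reports failure by returning `False`, which works for the `main()` entry point; when the same function is collected by pytest, a returned `False` does not fail the test. A hedged sketch of an assertion-friendly check follows. `failed_checks` is a hypothetical helper, and the four keys are the ones verified in the test above.

```python
def failed_checks(final_state):
    """Return labels of expected outputs that are missing or empty."""
    if final_state is None:
        return ["workflow returned None"]
    required = {
        "Data descriptions": "data_descriptions",
        "Plan generated": "plan",
        "Code generated": "current_code",
        "Execution result": "execution_result",
    }
    # An absent key, empty dict, empty list, or empty string all count as missing
    return [label for label, key in required.items() if not final_state.get(key)]


if __name__ == "__main__":
    example_state = {
        "data_descriptions": {"payments.csv": "columns: id, method"},
        "plan": [{"step_number": 1, "description": "load data"}],
        "current_code": "print(1)",
        "execution_result": "",
    }
    print(failed_checks(example_state))
```

A test body could then end with `assert failed_checks(final_state) == []`, so the failing labels show up in the assertion message.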
uv.lock ADDED
The diff for this file is too large to render.