Commit · 73445ba
Parent(s): 5d24c5c

fix readme

README.md CHANGED
@@ -5,7 +5,7 @@ colorFrom: "yellow"
 colorTo: "purple"
 sdk: "gradio"
 python_version: "3.10"
-start_command: "
 short_description: "A Gradio RAG app for querying the poetry of Allama Iqbal."
 tags:
 - rag
@@ -17,15 +17,18 @@ tags:

 # Iqbal Poetry RAG System

-A Retrieval-Augmented Generation (RAG) system for exploring and querying the poetry of Allama Iqbal. This project leverages vector search and large language models (LLMs) to answer questions about Iqbal's poetry, providing relevant poem excerpts as context.

 ---

 ## Hugging Face Spaces Ready

 This project is ready to be deployed as a [Hugging Face Space](https://huggingface.co/spaces). The configuration block above (in YAML) tells Hugging Face how to launch the app:
 - **sdk**: Uses Gradio for the web interface.
-- **app_file**: Entry point for the app (`app
 - **python_version**: Uses Python 3.10.
 - **short_description**: Shown in the Space's thumbnail.
 - **tags**: For discoverability.
@@ -36,7 +39,7 @@ To deploy, simply upload this repository to your Hugging Face account as a new S

 ## Features

-- **Semantic Search**: Retrieve the most relevant poems for a given question using vector embeddings.
 - **LLM-Powered Answers**: Generate answers using a language model, grounded in retrieved poem context.
 - **Gradio Interface**: User-friendly web interface powered by [Gradio](https://gradio.app/).
 - **Plug-and-Play Dataset**: The poetry dataset is already included in the repository, with all paths set up for immediate use.
@@ -52,6 +55,16 @@ To deploy, simply upload this repository to your Hugging Face account as a new S

 - Python 3.9+
 - [uv](https://github.com/astral-sh/uv) (a fast Python package installer, drop-in replacement for pip)

 ### 1. Clone the repository

@@ -76,7 +89,7 @@ The poetry dataset is already included in the repository, and all file paths are
 To launch the Gradio app locally:

 ```bash
-python app
 ```

 This will start a Gradio web interface in your browser, where you can enter your questions about Iqbal's poetry and receive contextually grounded answers.
@@ -88,33 +101,35 @@ This will start a Gradio web interface in your browser, where you can enter your
 ```
 iqbal_poetry_rag/
 │
-├──
-│   ├── RAGSystem.py
-│   ├──
-│   └── config.py
 │
 ├── rag/
-│   ├── vector_store.py
-│   ├── retriever.py
-│   └── llm.py
 │
 ├── utils/
-│   ├── error_handling.py
-│   └── feedback_logger.py
 │
-├── dataset
-│   └── poems.json               # Iqbal's poetry dataset (already included)
 │
-├── requirements.txt
-
 ```

 ---

 ## Configuration

-Edit `
-
 - `SCORE_THRESHOLD`: Minimum similarity score for retrieved poems.
 - `JSON_FILE_PATH`: Path to your poems data file (already set to the included dataset).

@@ -5,7 +5,7 @@ colorFrom: "yellow"
 colorTo: "purple"
 sdk: "gradio"
 python_version: "3.10"
+start_command: "python app.py"
 short_description: "A Gradio RAG app for querying the poetry of Allama Iqbal."
 tags:
 - rag
@@ -17,15 +17,18 @@ tags:

 # Iqbal Poetry RAG System

+A Retrieval-Augmented Generation (RAG) system for exploring and querying the poetry of Allama Iqbal. This project leverages vector search and large language models (LLMs) to answer questions about Iqbal's poetry, providing relevant poem excerpts as context.
+
+Note: On first run you will need to set up the vector embeddings store, so setup and initialization can take a few hours depending on the performance of your PC.

 ---

 ## Hugging Face Spaces Ready

+### In Progress:
 This project is ready to be deployed as a [Hugging Face Space](https://huggingface.co/spaces). The configuration block above (in YAML) tells Hugging Face how to launch the app:
 - **sdk**: Uses Gradio for the web interface.
+- **app_file**: Entry point for the app (`app.py`).
 - **python_version**: Uses Python 3.10.
 - **short_description**: Shown in the Space's thumbnail.
 - **tags**: For discoverability.
@@ -36,7 +39,7 @@ To deploy, simply upload this repository to your Hugging Face account as a new S

 ## Features

+- **Semantic Search**: Retrieve the most relevant poems and their themes for a given question using vector embeddings.
 - **LLM-Powered Answers**: Generate answers using a language model, grounded in retrieved poem context.
 - **Gradio Interface**: User-friendly web interface powered by [Gradio](https://gradio.app/).
 - **Plug-and-Play Dataset**: The poetry dataset is already included in the repository, with all paths set up for immediate use.
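The Semantic Search feature in this hunk comes down to ranking poems by embedding similarity and dropping weak matches. A minimal sketch of that idea, assuming cosine similarity and toy three-dimensional vectors. The `retrieve` helper, its parameters, and the sample data are hypothetical and are not taken from the repository's `rag/retriever.py`:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, poems, score_threshold=0.5, top_k=3):
    """Return up to top_k (title, score) pairs scoring at least score_threshold."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in poems.items()]
    kept = [(title, score) for title, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy embeddings (hypothetical); real vectors would come from an embedding model.
POEMS = {
    "poem_a": [1.0, 0.0, 0.0],
    "poem_b": [0.6, 0.8, 0.0],
    "poem_c": [0.0, 0.0, 1.0],
}

results = retrieve([1.0, 0.0, 0.0], POEMS, score_threshold=0.5)
```

Here `poem_c` is orthogonal to the query and falls below the threshold, so only the two closer poems would be handed to the language model as context.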
@@ -52,6 +55,16 @@ To deploy, simply upload this repository to your Hugging Face account as a new S

 - Python 3.9+
 - [uv](https://github.com/astral-sh/uv) (a fast Python package installer, drop-in replacement for pip)
+- Hugging Face account (https://huggingface.co/) (to use pretrained models)
+- Ollama (https://ollama.com/) (to create vector embeddings)
+
+```bash
+# install Ollama
+curl -sSfL https://ollama.ai/install.sh | sh
+
+# pull a model
+ollama pull llama3
+```

 ### 1. Clone the repository

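The Ollama prerequisite added in this hunk is there to create vector embeddings. As a rough sketch of what such a call can look like, the snippet below builds a request for the embeddings endpoint of Ollama's local REST API. It assumes Ollama is running on its default port and uses `llama3` only because that is the model pulled above. The helper names are hypothetical, and the repository's `rag/embeddings.py` may do this differently, for example through a client library:

```python
import json
import urllib.request

# Assumptions: Ollama is running locally on its default port (11434), and the
# model name is only a placeholder for whichever model was pulled.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text, model="llama3"):
    """Build (but do not send) a POST request for Ollama's embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, model="llama3"):
    """Send the request and return the embedding as a list of floats."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as resp:
        return json.loads(resp.read())["embedding"]
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running Ollama server.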
@@ -76,7 +89,7 @@ The poetry dataset is already included in the repository, and all file paths are
 To launch the Gradio app locally:

 ```bash
+python app.py
 ```

 This will start a Gradio web interface in your browser, where you can enter your questions about Iqbal's poetry and receive contextually grounded answers.
@@ -88,33 +101,35 @@ This will start a Gradio web interface in your browser, where you can enter your
 ```
 iqbal_poetry_rag/
 │
+├── interface/
+│   ├── RAGSystem.py             # Main RAG system class
+│   ├── gradio_interface.py      # Gradio app and its interface
+│   └── config.py                # Configuration (thresholds, file paths, etc.)
 │
 ├── rag/
+│   ├── vector_store.py          # Vector store initialization and building
+│   ├── retriever.py             # Retriever configuration
+│   ├── llm.py                   # LLM initialization and prompt management
+│   └── embeddings.py            # Embedding functionality for the RAG system (uses Ollama)
 │
 ├── utils/
+│   ├── error_handling.py        # Error handling decorators
+│   └── feedback_logger.py       # (Optional) Feedback logging
 │
+├── data/                        # Iqbal's poetry dataset (already included)
 │
+├── requirements.txt             # Project dependencies
+├── app.py                       # Entry point for the app
+└── README.md                    # This file
 ```

 ---

 ## Configuration

+Edit `interface/config.py` to set:
+- `HUGGING_FACE_TOKEN`: Your personal Hugging Face token (this can be set up using dotenv: create a `.env` file in the home folder and store it as
+  `HUGGING_FACE_TOKEN = <YOUR_TOKEN>`)
 - `SCORE_THRESHOLD`: Minimum similarity score for retrieved poems.
 - `JSON_FILE_PATH`: Path to your poems data file (already set to the included dataset).

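The three settings listed in this hunk could be gathered in one place along the following lines. This is only a sketch: the `Settings` class, the `load_settings` helper, and the default values are hypothetical and do not reflect the actual contents of `interface/config.py`. If the token is kept in a `.env` file as described, and assuming the dotenv mentioned there is the python-dotenv package, calling `load_dotenv()` first would copy the entry into `os.environ`:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    hugging_face_token: str
    score_threshold: float
    json_file_path: str


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    token = env.get("HUGGING_FACE_TOKEN", "")
    if not token:
        raise RuntimeError("HUGGING_FACE_TOKEN is not set; add it to your .env file")
    return Settings(
        hugging_face_token=token,
        # The defaults below are placeholders, not the repository's values.
        score_threshold=float(env.get("SCORE_THRESHOLD", "0.5")),
        json_file_path=env.get("JSON_FILE_PATH", "path/to/poems.json"),
    )
```

Failing early on a missing token gives a clearer error than a rejected request later, when the model is first loaded.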