KaiquanMah committed on
Commit ca01ef5 · verified · 1 Parent(s): 021d8cd

Update README.md

Files changed (1)
  1. README.md +22 -2
README.md CHANGED
@@ -44,7 +44,7 @@ requirements.txt
  # Design Considerations and Constraints
  * A Streamlit app
  * A free LLM API which allows us to test without worrying about running out of credits - gemini-2.5-flash-preview-04-17-thinking
- * A way to send relevant information to the LLM
+ * A way to send relevant information/grounding context to the LLM
  * We tried sending the full dataset in a dataframe containing approx. 4.1m tokens. This failed because Gemini only accepts a maximum of 1m tokens as input
  * We tried splitting the full dataset into 5 equal chunks (worked) and uploaded each chunk to the chat interface (worked), but we were unable to use all 5 chunks (i.e. the full dataset) in the chat (failed)
  * Since we were unable to send the huge dataset to Gemini, we explored sending a summary of the dataset instead
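As a rough illustration of the summary approach in the bullets above, the sketch below prepares a compact dataframe summary that stays well under Gemini's 1m-token input limit. The file path, summary fields and token estimate are assumptions for illustration, not the repo's actual code.

```python
import pandas as pd

# Hypothetical dataset path - the file in the repo may be named differently
df = pd.read_csv("data/dataset.csv")

# Instead of the raw ~4.1m-token dataframe, build a compact textual summary:
# shape, column dtypes, basic statistics and a few sample rows.
summary_parts = [
    f"Rows: {len(df)}, Columns: {len(df.columns)}",
    "Column dtypes:\n" + df.dtypes.to_string(),
    "Numeric summary:\n" + df.describe().to_string(),
    "Sample rows:\n" + df.head(5).to_string(),
]
grounding_summary = "\n\n".join(summary_parts)

# Rough size check (~4 characters per token) before using the summary
# as grounding context in the system prompt.
print(f"Approximate tokens: {len(grounding_summary) // 4}")
```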
@@ -69,4 +69,24 @@ requirements.txt
 
  # Architecture
  * With the above design considerations and constraints, we created the chatbot with the following architecture
-
+ ![Architecture_Diagram_Yair_Kai](documentation/Architecture_Diagram_Yair_Kai.png)
+ * Preparation steps
+   * For the dataset and model
+     * Download the dataset
+     * Exploratory data analysis
+     * Prepare grounding information
+     * Prepare the system prompt
+     * Gemini API chat configuration (sketched below)
+     * Tool call - GoogleSearch
+   * For the Streamlit app
+     * Configure the chat interface
+ * During normal usage of the chatbot
+   * The user loads the chat interface and picks a question from the 'Example questions' section
+   * The user types their question in the 'Ask a question' box and submits it to Gemini (our LLM)
+   * The system prompt (which contains the dataframe summary, i.e. the grounding information) is also sent to Gemini
+   * Gemini can
+     * Analyse the user's question and respond directly, sending the response back to our Streamlit app, which processes it and shows it in the chat box
+     * Alternatively, assess whether a tool call is required
+       * If Gemini decides that a tool call is needed to answer an 'out of domain' question, it makes one or more calls to the GoogleSearch tool
+       * Gemini then formulates a response and sends it to our Streamlit app, which processes it and shows it in the chat box
+       * We also extract the search links and show them below the LLM's answer so that users can verify the LLM's response if they wish (see the usage sketch below)
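The 'Gemini API chat configuration' and 'Tool call - GoogleSearch' preparation steps above could look roughly like this minimal sketch, assuming the google-genai Python SDK; the placeholder summary, prompt text and model id are illustrative rather than the repo's actual code.

```python
from google import genai
from google.genai import types

# Placeholder for the dataframe summary prepared as grounding information
grounding_summary = "Rows: 120000, Columns: 12\n<column dtypes, statistics, sample rows>"

# Reads GEMINI_API_KEY from the environment
client = genai.Client()

# The system prompt carries the grounding information (the dataframe summary)
system_prompt = (
    "You are a helpful assistant answering questions about the dataset "
    "summarised below.\n\n" + grounding_summary
)

# Chat session configured with the system prompt and the GoogleSearch tool,
# which Gemini may call for 'out of domain' questions
chat = client.chats.create(
    model="gemini-2.5-flash-preview-04-17",  # illustrative id for the preview model named in the README
    config=types.GenerateContentConfig(
        system_instruction=system_prompt,
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
```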
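Continuing from that configuration, the 'During normal usage' flow - user question in, answer plus extracted search links out - might be sketched in Streamlit as follows. The widget labels, session handling and grounding-link extraction are assumptions; the real app may differ.

```python
import streamlit as st

# `chat` is the Gemini chat session from the configuration sketch above;
# a real Streamlit app would typically keep it in st.session_state across reruns.

st.title("Dataset chatbot")
st.markdown("**Example questions:** e.g. 'Which column has the most missing values?'")

# 'Ask a question' box
user_question = st.chat_input("Ask a question")

if user_question:
    with st.chat_message("user"):
        st.markdown(user_question)

    # The system prompt (with the dataframe summary) was set at chat creation,
    # so only the user's question is sent each turn.
    response = chat.send_message(user_question)

    with st.chat_message("assistant"):
        st.markdown(response.text)

        # If Gemini used the GoogleSearch tool, surface the search links
        # below the answer so users can verify the response.
        grounding = getattr(response.candidates[0], "grounding_metadata", None)
        if grounding and grounding.grounding_chunks:
            links = [c.web.uri for c in grounding.grounding_chunks if c.web]
            if links:
                st.markdown("**Sources:**\n" + "\n".join(f"- {u}" for u in links))
```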