add hints for placing visual input and thinking control

#2
Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -121,6 +121,37 @@ Users can control the thinking mode by appending `/no_think` to queries:
  - **Non-thinking mode query**:
  *"Identify the text in the image. /no_think"*
 
+ ❗️Important: The `/no_think` command must be the very last part of the user message. No user content, such as an image or video, may come after `/no_think`.
+
+ #### Placing Visual Input
+ For prompts with a single image or video, always place the visual media before the text. For example:
+
+ ✅ Good:
+ ```
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {"type": "image", "image": image_path},
+             {"type": "text", "text": "Describe the image. /no_think"},
+         ],
+     }
+ ]
+ ```
+
+ ❌ Bad:
+ ```
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {"type": "text", "text": "Describe the image. /no_think"},
+             {"type": "image", "image": image_path},
+         ],
+     }
+ ]
+ ```
+
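The two rules above (visual input first, `/no_think` last) can be enforced mechanically. The following is a minimal sketch, not part of this PR or of the model's own tooling: `build_user_message` and `follows_guidelines` are hypothetical helper names, and the `{"type": "video", "video": ...}` entry shape is an assumption made by analogy with the image entry shown in the examples.

```python
from typing import Any, Dict, List, Optional

NO_THINK = "/no_think"


def build_user_message(
    text: str,
    media_path: Optional[str] = None,
    media_type: str = "image",
    no_think: bool = False,
) -> Dict[str, Any]:
    """Build one user message with the visual input first and the text last."""
    if media_type not in ("image", "video"):
        raise ValueError("media_type must be 'image' or 'video'")

    content: List[Dict[str, str]] = []
    if media_path is not None:
        # Visual media goes before the text (video shape is an assumption).
        content.append({"type": media_type, media_type: media_path})

    prompt = text.strip()
    if no_think and not prompt.endswith(NO_THINK):
        # The tag is appended so that it is the very last part of the message.
        prompt = f"{prompt} {NO_THINK}"
    content.append({"type": "text", "text": prompt})
    return {"role": "user", "content": content}


def follows_guidelines(message: Dict[str, Any]) -> bool:
    """Return True if visuals precede all text and /no_think ends the message."""
    content = message["content"]

    # Rule 1: no image or video entry may appear after a text entry.
    seen_text = False
    for item in content:
        if item["type"] == "text":
            seen_text = True
        elif seen_text:
            return False

    # Rule 2: /no_think may only appear at the end of the final text entry.
    texts = [item["text"] for item in content if item["type"] == "text"]
    for index, value in enumerate(texts):
        if NO_THINK not in value:
            continue
        is_last = index == len(texts) - 1
        if not (is_last and value.rstrip().endswith(NO_THINK)):
            return False
    return True


messages = [build_user_message("Describe the image.", "cat.png", no_think=True)]
```

Here `"cat.png"` is a placeholder path. Running `follows_guidelines` on the "Bad" layout shown earlier returns `False`, because the image entry comes after the text that carries `/no_think`.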
  ---
 
  ## I. Introduction