ChristopherMarais committed
Commit 143a646 · verified · 1 Parent(s): 3738e5d

Update README.md
Files changed (1)
  1. README.md +44 -322
README.md CHANGED
@@ -1,344 +1,66 @@
  ---
  license: mit
  ---
- # Evaluation Report for yolov11x_bb_detect_model
-
- **Tasks:** Single-class Object Detection, Feature extraction
-
- ## Evaluation Notes
-
- This model does not identify individual species but detects a single category of object.
-
- The evaluation was performed on a single-class basis using the text prompt: **'bark_beetle'**.
-
- ### Mantel Correlation Explanation
-
- The Mantel R statistic is calculated by comparing the distances between clustering centroids of different species to their phylogenetic distances. This helps determine whether the model's learned feature representations correlate with the evolutionary relationships between species.
-
- ## Object Classification Performance
-
- **mAP@[.5:.95]:** 0.933
-
- ### mAP per IoU Threshold
-
- | IoU Threshold | mAP      |
- |:--------------|---------:|
- | mAP@0.50      | 0.988415 |
- | mAP@0.55      | 0.987142 |
- | mAP@0.60      | 0.986385 |
- | mAP@0.65      | 0.985255 |
- | mAP@0.70      | 0.984068 |
- | mAP@0.75      | 0.981365 |
- | mAP@0.80      | 0.977609 |
- | mAP@0.85      | 0.970177 |
- | mAP@0.90      | 0.943694 |
- | mAP@0.95      | 0.521599 |
-
- ### Average Precision per Class (at last IoU threshold)
-
- | Class       | AP       |
- |:------------|---------:|
- | bark_beetle | 0.521599 |
-
- ### Classification Metrics per IoU Threshold
-
- #### IoU Threshold: iou_0.50
-
- - **Accuracy:** 0.992
- - **Balanced Accuracy:** 0.992
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.992
- - **Macro F1 Score:** 0.996
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16340
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.991505  0.995734  16480.0
- micro avg           1.0  0.991505  0.995734  16480.0
- macro avg           1.0  0.991505  0.995734  16480.0
- weighted avg        1.0  0.991505  0.995734  16480.0
- ```
-
- #### IoU Threshold: iou_0.55
-
- - **Accuracy:** 0.990
- - **Balanced Accuracy:** 0.990
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.990
- - **Macro F1 Score:** 0.995
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16315
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.989988  0.994969  16480.0
- micro avg           1.0  0.989988  0.994969  16480.0
- macro avg           1.0  0.989988  0.994969  16480.0
- weighted avg        1.0  0.989988  0.994969  16480.0
- ```
-
- #### IoU Threshold: iou_0.60
-
- - **Accuracy:** 0.989
- - **Balanced Accuracy:** 0.989
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.989
- - **Macro F1 Score:** 0.994
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16294
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.988714  0.994325  16480.0
- micro avg           1.0  0.988714  0.994325  16480.0
- macro avg           1.0  0.988714  0.994325  16480.0
- weighted avg        1.0  0.988714  0.994325  16480.0
- ```
-
- #### IoU Threshold: iou_0.65
-
- - **Accuracy:** 0.987
- - **Balanced Accuracy:** 0.987
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.987
- - **Macro F1 Score:** 0.994
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16269
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.987197  0.993557  16480.0
- micro avg           1.0  0.987197  0.993557  16480.0
- macro avg           1.0  0.987197  0.993557  16480.0
- weighted avg        1.0  0.987197  0.993557  16480.0
- ```
-
- #### IoU Threshold: iou_0.70
-
- - **Accuracy:** 0.986
- - **Balanced Accuracy:** 0.986
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.986
- - **Macro F1 Score:** 0.993
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16251
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.986104  0.993004  16480.0
- micro avg           1.0  0.986104  0.993004  16480.0
- macro avg           1.0  0.986104  0.993004  16480.0
- weighted avg        1.0  0.986104  0.993004  16480.0
- ```
-
- #### IoU Threshold: iou_0.75
-
- - **Accuracy:** 0.984
- - **Balanced Accuracy:** 0.984
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.984
- - **Macro F1 Score:** 0.992
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16209
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.983556  0.99171   16480.0
- micro avg           1.0  0.983556  0.99171   16480.0
- macro avg           1.0  0.983556  0.99171   16480.0
- weighted avg        1.0  0.983556  0.99171   16480.0
- ```
-
- #### IoU Threshold: iou_0.80
-
- - **Accuracy:** 0.980
- - **Balanced Accuracy:** 0.980
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.980
- - **Macro F1 Score:** 0.990
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16153
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.980158  0.989979  16480.0
- micro avg           1.0  0.980158  0.989979  16480.0
- macro avg           1.0  0.980158  0.989979  16480.0
- weighted avg        1.0  0.980158  0.989979  16480.0
- ```
-
- #### IoU Threshold: iou_0.85
-
- - **Accuracy:** 0.973
- - **Balanced Accuracy:** 0.973
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.973
- - **Macro F1 Score:** 0.987
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                16042
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.973422  0.986532  16480.0
- micro avg           1.0  0.973422  0.986532  16480.0
- macro avg           1.0  0.973422  0.986532  16480.0
- weighted avg        1.0  0.973422  0.986532  16480.0
- ```
-
- #### IoU Threshold: iou_0.90
-
- - **Accuracy:** 0.950
- - **Balanced Accuracy:** 0.950
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.950
- - **Macro F1 Score:** 0.974
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                15648
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.949515  0.974104  16480.0
- micro avg           1.0  0.949515  0.974104  16480.0
- macro avg           1.0  0.949515  0.974104  16480.0
- weighted avg        1.0  0.949515  0.974104  16480.0
- ```
-
- #### IoU Threshold: iou_0.95
-
- - **Accuracy:** 0.619
- - **Balanced Accuracy:** 0.619
- - **Macro Precision:** 1.000
- - **Macro Recall:** 0.619
- - **Macro F1 Score:** 0.764
- - **Cohen's Kappa:** 0.000
- - **Matthews Corrcoef:** 0.000
-
- ##### Confusion Matrix
-
- ```
- Predicted Label      0
- True Label
- 0                10196
- ```
-
- ##### Classification Report
-
- ```
-               precision    recall  f1-score  support
- 0                   1.0  0.618689  0.764432  16480.0
- micro avg           1.0  0.618689  0.764432  16480.0
- macro avg           1.0  0.618689  0.764432  16480.0
- weighted avg        1.0  0.618689  0.764432  16480.0
- ```
-
- ## Embedding Quality
-
- ### Internal Cluster Validation
-
- | Silhouette_Score | Davies-Bouldin_Index | Calinski-Harabasz_Index |
- |-----------------:|---------------------:|------------------------:|
- | 0.659463         | 0.337135             | 44803.3                 |
-
- ### External Cluster Validation
-
- | ARI        | NMI       | Cluster_Purity |
- |-----------:|----------:|---------------:|
- | 0.00523866 | 0.0704749 | 0.0848908      |
-
- ### Mantel Correlation
-
- | r          | p_value | n_items |
- |-----------:|--------:|--------:|
- | -0.0414017 | 0.784   | 32      |
  ---
  license: mit
  ---
+ # Model Card: yolov11x_bb_detect_model
+
+ ## Model Details
+ - **Model Name:** `yolov11x_bb_detect_model`
+ - **Model Type:** Single-Class Object Detection and Feature Extractor
+ - **Description:** This model detects the presence of bark beetles in images. It identifies and places a bounding box around each target but does not classify different species of bark beetle. It operates under the single class label: **'bark_beetle'**.
+
+ ---
+ ## Evaluation Datasets
+
+ To understand the model's capabilities, its performance was tested on two types of datasets:
+
+ - **In-Distribution (ID):** This dataset contains images that are **similar to the data the model was trained on**. Performance on this dataset shows how well the model performs on familiar types of images.
+ - **Out-of-Distribution (OOD):** This dataset contains images of species that are **intentionally different from those in the training data**, to test how well the model generalizes.
+
+ ---
+
+ ## Performance
+
+ ### Object Detection
+ The model's ability to correctly identify and locate bark beetles is measured by **mean Average Precision (mAP)**. This metric evaluates both the accuracy of the bounding-box placement and the classification confidence. The score is averaged over multiple Intersection over Union (IoU) thresholds, from 50% overlap (`0.50`) to 95% overlap (`0.95`), providing a comprehensive view of prediction accuracy. A higher mAP score indicates better performance.
+
+ | Dataset | mAP (0.50 : 0.95) | Notes |
+ | :--- | :--- | :--- |
+ | **In-Distribution (ID)** | 🟩 0.9485 | Excellent detection performance on images similar to the training data. |
+ | **Out-of-Distribution (OOD)** | 🟦 0.9271 | Retains strong performance on novel species, indicating good generalization. |
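A prediction counts as a true positive at a given threshold only if its box overlaps a ground-truth box by at least that IoU fraction. A minimal sketch of the IoU computation (the `(x1, y1, x2, y2)` corner format is an assumption for illustration; the evaluation pipeline's actual box format is not specified here):

```python
def iou(box_a, box_b):
    # Boxes are assumed to be (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # The intersection area is zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes offset by one unit overlap in a 1x1 square: IoU = 1 / (4 + 4 - 1)
print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # 0.1429
```

Sweeping the acceptance threshold from 0.50 to 0.95, as in the table above, shows how performance degrades as tighter box placement is demanded.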
 
+
+ <br>
+
+ ### Feature Extraction (Embedding Performance)
+ The model can also convert images into numerical representations (embeddings). The quality of these embeddings is evaluated by how well they group similar species together in feature space.
+
+ #### Internal Cluster Validation
+ These metrics measure the quality of the clusters formed by the embeddings without referring to ground-truth labels. They assess how dense and well separated the clusters are.
+
+ | Metric | ID Score | OOD Score | Interpretation |
+ | :--- | :--- | :--- | :--- |
+ | **Silhouette Score** | 0.6000 | 0.4412 | Measures how similar an object is to its own cluster compared to others. **Higher is better (closer to 1)**. The ID embeddings form better-defined clusters. |
+ | **Davies-Bouldin Index** | 0.3823 | 0.2859 | Measures the average similarity between each cluster and its most similar one. **Lower is better (closer to 0)**. The OOD embeddings show less overlap between clusters. |
+ | **Calinski-Harabasz Index** | 1504.67 | 824.437 | Measures the ratio of between-cluster dispersion to within-cluster dispersion. **Higher is better**. The ID embeddings form denser, more separated clusters. |
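All three internal scores are available in scikit-learn; a small sketch on synthetic data (the embedding matrix and cluster labels below are made up for illustration, not the model's actual embeddings):

```python
import numpy as np
from sklearn.metrics import (
    silhouette_score,
    davies_bouldin_score,
    calinski_harabasz_score,
)

rng = np.random.default_rng(0)
# Stand-in "embeddings": two well-separated Gaussian blobs in 8 dimensions.
emb = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 8)),
    rng.normal(5.0, 0.5, size=(50, 8)),
])
labels = np.array([0] * 50 + [1] * 50)

print(f"Silhouette:        {silhouette_score(emb, labels):.3f}")      # higher is better
print(f"Davies-Bouldin:    {davies_bouldin_score(emb, labels):.3f}")  # lower is better
print(f"Calinski-Harabasz: {calinski_harabasz_score(emb, labels):.1f}")  # higher is better
```

Because the toy blobs barely overlap, the silhouette comes out high and the Davies-Bouldin index low; the real ID/OOD embeddings sit in between, as the table shows.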
+
+ #### External Cluster Validation
+ These metrics evaluate clustering performance by comparing the results to the true species labels.
+
+ | Metric | ID Score | OOD Score | Interpretation |
+ | :--- | :--- | :--- | :--- |
+ | **Adjusted Rand Index (ARI)** | 0.1131 | 0.0049 | Measures the similarity between true and predicted labels, correcting for chance. **Higher is better (closer to 1)**. |
+ | **Normalized Mutual Info (NMI)** | 0.4576 | 0.2666 | Measures the agreement between the clustering and the true labels. **Higher is better (closer to 1)**. |
+ | **Cluster Purity** | 0.3051 | 0.1249 | Measures the extent to which clusters contain a single class. **Higher is better (closer to 1)**. |
+
+ **Conclusion:** The external validation scores are low for both datasets, indicating that the model's feature representations do **not** effectively separate different bark beetle species on their own.
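ARI and NMI ship with scikit-learn; purity can be derived from the contingency matrix as the fraction of samples whose cluster's majority class matches their own. A sketch with toy labels (not the model's actual clustering output):

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from sklearn.metrics.cluster import contingency_matrix

def cluster_purity(true_labels, pred_labels):
    # Rows of the contingency matrix are true classes, columns are clusters;
    # purity sums each cluster's majority-class count and divides by n.
    cm = contingency_matrix(true_labels, pred_labels)
    return cm.max(axis=0).sum() / cm.sum()

true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [1, 1, 0, 0, 0, 0, 2, 2, 2]

print(round(adjusted_rand_score(true, pred), 3))
print(round(normalized_mutual_info_score(true, pred), 3))
print(round(cluster_purity(true, pred), 3))
```

Note that ARI and NMI are invariant to relabeling of the clusters, so a perfect clustering scores 1.0 even if the cluster IDs differ from the class IDs.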
 
 
+
+ #### Phylogenetic Correlation (Mantel Test)
+ This test determines whether the model's learned features correlate with the evolutionary relationships (phylogeny) between bark beetle species.
+
+ - **Mantel R-statistic:** Ranges from -1 to 1. A positive value means species that are close in the model's feature space are also close evolutionarily. A value near zero indicates no correlation.
+ - **p-value:** Indicates the statistical significance of the result. A p-value below 0.05 typically suggests a significant correlation.
+
+ | Dataset | Mantel R-statistic | p-value | Interpretation |
+ | :--- | :--- | :--- | :--- |
+ | **In-Distribution (ID)** | 0.0451 | 0.6860 | There is **no statistically significant correlation** between the model's feature embeddings and the species' evolutionary history. |
+ | **Out-of-Distribution (OOD)** | 0.0631 | 0.4460 | There is **no statistically significant correlation** for the OOD data either. |
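The Mantel statistic itself is just a Pearson correlation between the upper triangles of two distance matrices, with significance obtained by permuting one matrix's rows and columns together. A minimal permutation sketch (the evaluation's actual implementation is not specified here; dedicated routines such as `skbio.stats.distance.mantel` exist):

```python
import numpy as np

def mantel_test(d1, d2, n_perm=999, seed=0):
    """Pearson r between the upper triangles of two square distance matrices,
    with a two-sided permutation p-value."""
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(d1.shape[0], k=1)
    x = d1[iu]
    r_obs = np.corrcoef(x, d2[iu])[0, 1]
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(d1.shape[0])
        # Permute rows and columns of d2 together so it stays a valid distance matrix.
        r_perm = np.corrcoef(x, d2[np.ix_(p, p)][iu])[0, 1]
        if abs(r_perm) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

With unrelated matrices the statistic hovers near zero and the p-value stays large, which is the pattern reported in the table above.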