Flower Color Predictor using AutoGluon
This repository contains an AutoGluon TabularPredictor trained to classify flower colors from their physical dimensions.
Dataset
The model was trained on the scottymcgee/flowers dataset, using the synthetic split (augmented) for training and the original split (original) for final evaluation.
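As a rough sketch, the two splits could be pulled with the Hugging Face datasets library. The split names come from the description above. The helper name load_flower_splits is hypothetical, and the assumption that the dataset loads without a configuration name has not been verified here.

```python
TRAIN_SPLIT = "augmented"  # synthetic split, used for training
EVAL_SPLIT = "original"    # original split, used for final evaluation


def load_flower_splits(repo_id="scottymcgee/flowers"):
    """Return (train_df, eval_df) as pandas DataFrames.

    Hypothetical helper: assumes the dataset exposes the two splits
    under exactly these names and needs no configuration name.
    """
    # Imported lazily so the module can be imported without `datasets`.
    from datasets import load_dataset

    train_df = load_dataset(repo_id, split=TRAIN_SPLIT).to_pandas()
    eval_df = load_dataset(repo_id, split=EVAL_SPLIT).to_pandas()
    return train_df, eval_df
```

Calling load_flower_splits() requires network access and the datasets package.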
Evaluation Results
The final performance of the best model on the original dataset is as follows:
- Accuracy: 1.0000
- Weighted F1: 1.0000
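For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how accuracy and weighted F1 are defined. It is not the code that produced the numbers above, and the toy labels are made up for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Toy labels (illustrative only, not taken from the dataset).
y_true = ["red", "red", "blue", "white"]
y_pred = ["red", "blue", "blue", "white"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

A score of 1.0000 on both metrics means every sample in the evaluation split was classified correctly.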
Files in this Repository
- autogluon_predictor.pkl: The trained TabularPredictor, pickled using cloudpickle.
- autogluon_predictor_dir.zip: The zipped native AutoGluon predictor directory for portability.
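The two files could be loaded roughly as follows. This is a sketch: the helper names are hypothetical, and the layout inside the zip is not documented here, so the directory passed to TabularPredictor.load may need to point at a nested folder. Unpickling generally requires an AutoGluon version compatible with the one used for training, and pickle files should only be loaded from trusted sources.

```python
import zipfile
from pathlib import Path


def load_pickled_predictor(pkl_path):
    """Load the predictor that was pickled with cloudpickle."""
    import cloudpickle  # third-party; imported lazily

    with open(pkl_path, "rb") as f:
        return cloudpickle.load(f)


def extract_predictor_dir(zip_path, out_dir):
    """Unzip the native predictor directory and return the output path."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    return Path(out_dir)


def load_native_predictor(predictor_dir):
    """Load a predictor from an extracted AutoGluon directory."""
    from autogluon.tabular import TabularPredictor  # imported lazily

    return TabularPredictor.load(str(predictor_dir))


# Example usage (paths assume the files sit in the working directory):
# predictor = load_pickled_predictor("autogluon_predictor.pkl")
# extracted = extract_predictor_dir("autogluon_predictor_dir.zip", "predictor_dir")
# predictor = load_native_predictor(extracted)
```

The native directory is usually the more portable option, since it does not depend on pickle compatibility of the whole Python object.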
Potential Errors
Because the accuracy is perfect, I suspect data leakage. The augmented data was created directly from the original data (by adding noise or small variations), so each evaluation sample has near-duplicates in the training set. The model may therefore be recognizing patterns it has already been shown rather than generalizing to new information. In other words, the scores above cannot rule out overfitting, where a model learns the training data so well that it fails to perform on new, unseen data.
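One way to probe this suspicion is to measure how close each evaluation row lies to its nearest training row. The sketch below is illustrative only: the function names, the tolerance, and the toy measurements are assumptions, and real features would likely need scaling before distances are compared.

```python
import math


def nearest_train_distance(row, train_rows):
    """Euclidean distance from one row to its closest training row."""
    return min(math.dist(row, t) for t in train_rows)


def near_duplicate_fraction(eval_rows, train_rows, tol):
    """Share of evaluation rows within `tol` of some training row."""
    hits = sum(nearest_train_distance(r, train_rows) <= tol for r in eval_rows)
    return hits / len(eval_rows)


# Toy numeric features (made up for illustration).
train_rows = [(5.1, 3.0), (5.05, 3.02), (7.9, 1.2)]
eval_rows = [(5.08, 3.01), (2.0, 9.0)]
print(near_duplicate_fraction(eval_rows, train_rows, tol=0.1))
```

A fraction close to 1 would support the leakage explanation, since the evaluation set would then consist almost entirely of points the model has effectively already seen.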