Shoe Length Regression Model - LightGBMXT

This model is the LightGBMXT model trained by AutoGluon, which showed strong performance on the test set (df_orig) in our experiment predicting shoe length.

Task Description

This is a regression task that predicts the Actual Measured Shoe Length (Actual_length) based on various shoe features. The features used include numerical values (US size, Shoe size (mm)) and encoded text features (type, color, and brand embeddings).
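As a rough illustration of how numerical features and text embeddings can end up in one tabular row, here is a minimal sketch. It is not taken from the training notebook: the helper names, the `<name>_emb_<i>` column naming, the exact column labels, and the `all-MiniLM-L6-v2` encoder are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def expand_embedding(prefix: str, vector: Sequence[float]) -> Dict[str, float]:
    """Flatten one embedding vector into named numeric columns (e.g. brand_emb_0)."""
    return {f"{prefix}_emb_{i}": float(v) for i, v in enumerate(vector)}


def build_feature_row(us_size: float, size_mm: float,
                      embeddings: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Combine the two numerical features with the flattened text embeddings."""
    # Column labels are assumed to match the feature names in the task description.
    row = {"US size": float(us_size), "Shoe size (mm)": float(size_mm)}
    for name, vector in embeddings.items():
        row.update(expand_embedding(name, vector))
    return row


def embed_texts(texts: List[str], model_name: str = "all-MiniLM-L6-v2"):
    """Encode strings with Sentence-Transformers; the model name is an assumption."""
    from sentence_transformers import SentenceTransformer  # imported lazily
    return SentenceTransformer(model_name).encode(texts)
```

With real data, the vectors passed to `build_feature_row` would come from `embed_texts` applied to the type, color, and brand strings.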

Dataset

The model was trained and evaluated using the maryzhang/hw1-24679-tabular-dataset from the Hugging Face Datasets library. Specifically, the augmented split (augmented_ds) was used for training and validation (via AutoGluon's internal splitting), and the original split (original_ds) was used for final testing/leaderboard evaluation.
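The train/test arrangement described above might look roughly like the following sketch. The split names `augmented` and `original` are assumptions inferred from the variable names, the function names are hypothetical, and the text-embedding step is omitted, so this is not the exact training script.

```python
def load_splits(repo_id: str = "maryzhang/hw1-24679-tabular-dataset"):
    """Return (train_df, test_df) as pandas DataFrames."""
    from datasets import load_dataset  # imported lazily

    ds = load_dataset(repo_id)
    # Assumed split names: "augmented" for training, "original" for testing.
    return ds["augmented"].to_pandas(), ds["original"].to_pandas()


def train_and_score(train_df, test_df, label: str = "Actual_length"):
    """Fit AutoGluon on the training split and score every model on the test split."""
    from autogluon.tabular import TabularPredictor  # imported lazily

    predictor = TabularPredictor(
        label=label,
        eval_metric="root_mean_squared_error",
    ).fit(train_df)  # AutoGluon holds out its own validation data internally
    leaderboard = predictor.leaderboard(test_df)  # includes score_test and score_val
    return predictor, leaderboard
```

Calling `train_and_score(*load_splits())` would run the whole flow, provided both DataFrames already contain the feature columns the model expects.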

Evaluation

The primary evaluation metric for this regression task was Root Mean Squared Error (RMSE). Lower RMSE values indicate better predictive accuracy.

Based on the AutoGluon leaderboard from our training run:

The LightGBMXT model achieved a test score of -0.579697 on the original_ds. AutoGluon reports error metrics negated so that higher is always better, so this corresponds to an RMSE of approximately 0.580 mm. This score reflects the model's performance on data that was not used for training.
AutoGluon's internally selected best model (chosen by validation performance, typically a WeightedEnsemble_L2) may have a different score, but we chose to upload the LightGBMXT model specifically because of its test set performance.
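To make the relationship between the leaderboard score and RMSE concrete, here is a small self-contained sketch. The function names and the example lengths are hypothetical; only the score -0.579697 comes from the leaderboard above.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between measured and predicted lengths."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def leaderboard_score_to_rmse(score: float) -> float:
    """AutoGluon negates error metrics on its leaderboard; undo the sign flip."""
    return -score


# Hypothetical measurements, each prediction off by exactly 1 unit.
example_rmse = rmse([250.0, 260.0], [251.0, 259.0])
reported_rmse = leaderboard_score_to_rmse(-0.579697)
```

Here `reported_rmse` is 0.579697, which rounds to the 0.580 quoted above.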

Model Details

Model Type: LightGBM with extra trees enabled (AutoGluon's LightGBMXT variant), trained via AutoGluon Tabular
Libraries Used: AutoGluon, LightGBM, Sentence-Transformers, scikit-learn, pandas, numpy, datasets, matplotlib
Training Environment: Google Colab

Usage

This repository contains the saved files for the LightGBMXT model from an AutoGluon TabularPredictor training run.

To load and use this specific model (assuming you have AutoGluon installed):

# Load the full predictor, then request predictions from LightGBMXT by name.
# AutoGluon applies its own feature preprocessing inside predict(), so the model
# is selected through the predictor rather than called directly.
from autogluon.tabular import TabularPredictor

predictor_path = "/path/where/you/downloaded/this/repo"  # Point to the downloaded repo directory
predictor = TabularPredictor.load(predictor_path)
predictions = predictor.predict(your_new_data, model="LightGBMXT")  # your_new_data: pandas DataFrame

# TabularPredictor.load expects a local directory, not a Hub repo ID. If the full
# predictor was uploaded, download the repo first and load the returned path:
# from huggingface_hub import snapshot_download
# predictor_path = snapshot_download(repo_id="kaitongg/shoe-length-predictor-lightgbmxt")
# predictor = TabularPredictor.load(predictor_path)
# predictions = predictor.predict(your_new_data, model="LightGBMXT")

# Note: your_new_data needs the same feature columns as the training data, including
# the type, color, and brand embeddings, so any feature engineering done before
# training must be repeated on new data.
