
Support of SMOTE with cross-validation #1200

@petrosDemetrakopoulos


Hello, congrats on the project. It is truly awesome.

I have a small enhancement to suggest:
Sometimes we have to deal with imbalanced datasets, and we employ the Synthetic Minority Over-sampling Technique (SMOTE).
However, oversampling must be applied only to the training set, not to the validation/test sets; otherwise synthetic points derived from validation samples leak into training and inflate the scores.
Thus, in a cross-validation scenario, SMOTE needs to be applied at each split to the training fold only, never to the validation fold.

This functionality is not currently supported by FLAML.
I would suggest that a SMOTE over-sampler object be passed in the automl_settings and then applied to the train split at each split.

The imbalanced-learn package provides this functionality through its own scikit-learn-compatible Pipeline, which applies samplers only during fitting.

Labels: enhancement, help wanted
