Support of SMOTE with cross-validation

Hello, congrats on the project. It is truly awesome.

I have a small enhancement to suggest:
Sometimes we have to deal with imbalanced datasets, and we employ the "Synthetic Minority Over-sampling Technique" (SMOTE).
However, oversampling must be applied **only to the training set**, not to the validation/test sets.
Thus, in a cross-validation scenario, SMOTE must be applied at each split to the training portion only, never to the validation portion.
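
To make the requirement concrete, here is a minimal, standard-library-only sketch of the per-split procedure. It is not FLAML code and not the real SMOTE algorithm: `smote_like_oversample` and `stratified_kfold_indices` are hypothetical helpers written for this illustration, the oversampler is a simplified stand-in that interpolates between a minority row and its nearest minority neighbor, and the toy data assumes a binary problem.

```python
import random


def smote_like_oversample(X, y, minority_label, rng):
    """Simplified stand-in for SMOTE (binary case, hypothetical helper).

    Adds synthetic minority rows, each a random interpolation between a
    minority row and its nearest minority neighbor, until classes balance.
    """
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_needed = (len(X) - len(minority)) - len(minority)
    X_new, y_new = list(X), list(y)
    if len(minority) < 2 or n_needed <= 0:
        return X_new, y_new
    for _ in range(n_needed):
        base = rng.choice(minority)
        # Nearest other minority row by squared Euclidean distance.
        neighbor = min(
            (m for m in minority if m is not base),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
        )
        t = rng.random()
        X_new.append([a + t * (b - a) for a, b in zip(base, neighbor)])
        y_new.append(minority_label)
    return X_new, y_new


def stratified_kfold_indices(y, n_splits, rng):
    """Yield (train_idx, val_idx) pairs that keep class ratios per fold."""
    folds = [[] for _ in range(n_splits)]
    for label in sorted(set(y)):
        idx = [i for i, value in enumerate(y) if value == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_splits].append(i)
    for k, val_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, val_idx


rng = random.Random(0)
# Toy imbalanced data: 90 majority rows (label 0), 10 minority rows (label 1).
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

fold_summaries = []
for train_idx, val_idx in stratified_kfold_indices(y, 5, rng):
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_val = [X[i] for i in val_idx]
    y_val = [y[i] for i in val_idx]

    # Oversample the training portion only; the validation portion is untouched.
    X_res, y_res = smote_like_oversample(X_train, y_train, 1, rng)

    # A model would be fit on (X_res, y_res) and scored on (X_val, y_val) here.
    fold_summaries.append(
        {
            "train_minority_after": sum(1 for label in y_res if label == 1),
            "train_majority_after": sum(1 for label in y_res if label == 0),
            "val_size": len(y_val),
            "val_minority": sum(y_val),
        }
    )
```

The point of the sketch is the ordering: the split happens first and the oversampler only ever sees the training indices, so no synthetic row derived from a validation sample can leak into training.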

This functionality is not currently supported by FLAML.
I would suggest that the SMOTE over-sampler object be passed in the `automl_settings` and then applied to the training portion at each split.

The imbalanced-learn package already provides this behaviour through its custom sklearn-like Pipelines.

Support of SMOTE with cross-validation #1200
