
Add (built-in) multi-target support  #1301

@azmyrajab

Description

Hi!

Thank you for providing this useful AutoML package.

Several of the main tree libraries (e.g. XGBoost 2.0, CatBoost) support fitting regressors against multiple targets (see here and here). With recent advances such as multi_output_tree, these multi-output regression/classification models tend to be pretty useful.
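
For context, here is a minimal sketch (with a synthetic 2-column target) of how both libraries already accept a 2-d y natively. The `multi_strategy` and `loss_function` settings are the libraries' own documented options, not FLAML parameters:

```python
# Illustrative only: native multi-target regression in XGBoost >= 2.0 and CatBoost
# on a synthetic 2-column target. Nothing here touches FLAML.
import numpy as np
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
Y = np.column_stack([
    X[:, 0] + rng.normal(size=500),
    X[:, 1] - X[:, 2] + rng.normal(size=500),
])  # shape (500, 2): two targets per sample

# XGBoost >= 2.0: "multi_output_tree" grows one tree per round that predicts all targets
xgb = XGBRegressor(tree_method="hist", multi_strategy="multi_output_tree", n_estimators=50)
xgb.fit(X, Y)
print(xgb.predict(X).shape)  # (500, 2)

# CatBoost: multi-target regression via the MultiRMSE objective
cb = CatBoostRegressor(loss_function="MultiRMSE", iterations=50, verbose=False)
cb.fit(X, Y)
print(cb.predict(X).shape)   # (500, 2)
```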

Would it be possible to modify the AutoML class so that .fit() accepts a 2-d "y" target array-like for regression tasks?
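
Concretely, the hoped-for usage would look something like the sketch below. This is purely illustrative of the requested behaviour (it does not run today, since FLAML's input validation expects a 1-d y), and the estimator_list/metric shown are just assumptions about sensible defaults:

```python
# Hypothetical, requested usage -- NOT currently supported by FLAML.
import numpy as np
from flaml import AutoML

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
Y = rng.normal(size=(1000, 3))  # 2-d target: three outputs per sample

automl = AutoML()
automl.fit(
    X_train=X,
    y_train=Y,                               # <-- the 2-d y this issue asks to allow
    task="regression",
    estimator_list=["xgboost", "catboost"],  # assumed: learners with native multi-output support
    time_budget=60,
    metric="rmse",
)
Y_pred = automl.predict(X)                   # expected shape: (1000, 3)
```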

I think the implementation would require:

  • modifying asserts / input checks to accept a 2-d multi-target y
  • if multi-output input is detected: assert that the estimator list supports it (e.g. XGBoost & CatBoost work, whereas LightGBM does not yet have built-in support outside of wrapping with sklearn's MultiOutputRegressor). Then modify default parameters to accommodate multi-target as needed, e.g. for CatBoost regression the default objective needs to be "MultiRMSE" instead of "RMSE". These are minor changes to default parametrization, if any.
  • ensuring splitters split a 2-d y correctly and pass it as-is to the underlying model(s)
  • for cross-validation scores: defining a default policy, e.g. averaging per-target scores to produce the final tuning/validation score (a minimal sketch follows this list)
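
For the last point, one possible default aggregation policy is sketched below: score each target column separately with the chosen metric and take an unweighted mean to get a single scalar for tuning. The function name is made up for illustration and is not an existing FLAML internal:

```python
# Sketch of a possible default policy for multi-target validation scores:
# compute the metric per target column, then average across targets.
import numpy as np
from sklearn.metrics import mean_squared_error


def per_target_average_rmse(y_true, y_pred):
    """Average RMSE over target columns; reduces a 2-d y to one scalar."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.ndim == 1:  # single-target case: behaviour unchanged
        return mean_squared_error(y_true, y_pred) ** 0.5
    per_target = [
        mean_squared_error(y_true[:, j], y_pred[:, j]) ** 0.5
        for j in range(y_true.shape[1])
    ]
    return float(np.mean(per_target))
```

A weighted mean (or per-target reporting plus a scalar aggregate) would be an equally reasonable policy when targets sit on very different scales.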

Doing this abstracts away the single- vs. multi-output implementation details for users and lets FLAML "just work" in either case.

Thanks for considering,
Azmy
