
Conversation


@lizhuoq lizhuoq commented May 5, 2024

Why are these changes needed?

Previously, for multi-output tasks using the holdout eval_method, the validation set could not be set manually. This commit adds a new multioutput_train_size parameter that allows the validation set to be specified manually for multi-output tasks.

import pandas as pd
from flaml import AutoML
from sklearn.multioutput import MultiOutputRegressor

# X_train, X_val, y_train, y_val are pandas DataFrames prepared beforehand
model = MultiOutputRegressor(
    AutoML(
        task="regression",
        time_budget=1,
        eval_method="holdout",
        multioutput_train_size=len(X_train),
    )
)
# The first len(X_train) rows of the concatenated data are used for
# training, and the remaining rows for validation.
model.fit(
    pd.concat([X_train, X_val]),
    pd.concat([y_train, y_val]),
)
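The split semantics of the new parameter can be sketched as follows. This is an illustrative helper, not FLAML's actual implementation: it assumes the feature simply treats the first multioutput_train_size rows of the concatenated data as training data and the remainder as validation data.

```python
import numpy as np


def split_by_train_size(X, y, train_size):
    """Illustrative split: first `train_size` rows train, remainder validation."""
    X_train, X_val = X[:train_size], X[train_size:]
    y_train, y_val = y[:train_size], y[train_size:]
    return X_train, y_train, X_val, y_val


# toy data: 10 samples, 2 targets
X = np.arange(20).reshape(10, 2)
y = np.arange(20).reshape(10, 2)

X_tr, y_tr, X_va, y_va = split_by_train_size(X, y, train_size=8)
print(X_tr.shape, X_va.shape)  # (8, 2) (2, 2)
```

Passing multioutput_train_size=len(X_train), as in the snippet above, therefore recovers exactly the original train/validation partition from the concatenated data.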



lizhuoq commented May 5, 2024

@microsoft-github-policy-service agree


@prdai prdai left a comment


LGTM!


Copilot AI left a comment


Pull request overview

This pull request adds support for manually setting a validation set for multi-output tasks when using the "holdout" evaluation method. Previously, users could not manually specify a validation set for multi-output regression tasks. The new multioutput_train_size parameter allows users to concatenate training and validation data and specify where to split them.

Changes:

  • Added multioutput_train_size parameter to AutoML class for manual validation set specification
  • Implemented _train_val_split method to split concatenated training/validation data
  • Added test case demonstrating the new functionality with MultiOutputRegressor

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File descriptions:

  • flaml/automl/automl.py — Added documentation and implementation for the multioutput_train_size parameter, including the split logic in the fit method
  • test/automl/test_regression.py — Added a test_multioutput_train_size function to demonstrate usage of the new feature


Comment on lines 233 to 253
def test_multioutput_train_size():
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import MultiOutputRegressor, RegressorChain

    # create regression data
    X, y = make_regression(n_targets=3)

    # split into train and test data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

    # train the model
    model = MultiOutputRegressor(
        AutoML(task="regression", time_budget=1, eval_method="holdout", multioutput_train_size=len(X_train))
    )
    model.fit(np.concatenate([X_train, X_val], axis=0), np.concatenate([y_train, y_val], axis=0))

    # predict
    print(model.predict(X_test))

Copilot AI Jan 19, 2026


The test function lacks assertions to verify the new multioutput_train_size feature works as expected. Consider adding assertions to validate that the model was trained successfully and that the validation split was performed correctly. For example, you could check that the model produces reasonable predictions or verify internal state that confirms the train/validation split occurred.
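One way the suggestion could be addressed is a helper like the one below. This is a sketch only: it assumes the trained estimator exposes a standard predict interface and checks output shape and finiteness, not FLAML's internal split state. The MeanModel stand-in is hypothetical, used here so the sketch runs without FLAML installed.

```python
import numpy as np


def check_predictions(model, X_test, y_test):
    """Assertions suggested for test_multioutput_train_size (illustrative sketch)."""
    pred = np.asarray(model.predict(X_test))
    # one row per test sample, one column per target
    assert pred.shape == np.asarray(y_test).shape
    # no NaN/inf predictions
    assert np.isfinite(pred).all()


# stand-in estimator: always predicts the per-target mean of the training labels
class MeanModel:
    def fit(self, X, y):
        self.mean_ = np.asarray(y).mean(axis=0)
        return self

    def predict(self, X):
        return np.tile(self.mean_, (len(X), 1))


X = np.random.rand(20, 4)
y = np.random.rand(20, 3)
check_predictions(MeanModel().fit(X[:16], y[:16]), X[16:], y[16:])
```

In the actual test, the same checks could be run on the fitted MultiOutputRegressor in place of the stand-in model.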

thinkall and others added 3 commits January 20, 2026 21:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>