@Qiaochu-Song
Collaborator

  • Add an estimator for TextPredictor.
  • Add a test for TextPredictor estimator.

Contributor

@liususan091219 left a comment

Why is the training data passed through kwargs? It should be passed in via X_train.

Contributor

@liususan091219 left a comment

Rename your estimator to MultiModalEstimator

flaml/ml.py Outdated
ARIMA,
SARIMAX,
TransformersEstimator,
AGTextPredictorEstimator,

Contributor

Suggested change
- AGTextPredictorEstimator,
+ MultiModalEstimator,

Contributor

Please update all occurrences

Collaborator Author

fixed

Contributor

Please update the commit.

flaml/model.py Outdated
from autogluon.text import TextPredictor

super().__init__(task, **params)
self.estimator_class = TextPredictor

Contributor

Why?

Collaborator Author

I can remove this and initialize the model with TextPredictor instead. Is that better?

Contributor

yes

flaml/model.py Outdated
}
return search_space_dict

def _init_fix_args(self, automl_fit_kwargs: dict=None):

Contributor

Why do we need this function? Can we simply remove it?

Collaborator Author

If we have an AGArgs dataclass in utils and just use the default settings, we can remove this function and simply set self.ag_args = AGArgs() in MultimodalEstimator.fit(). Does that make sense?

Contributor

Yes, you can implement this, and define a similar init_hf_args if you need to check user input validity.
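
A minimal sketch of what this suggestion could look like, assuming the three keys read from fix_args elsewhere in this PR (output_dir, label_column, dataset_name) become dataclass fields. The default values and the from_user_kwargs helper are hypothetical and only illustrate how user input could be validated; they are not taken from the PR.

```python
from dataclasses import dataclass, fields


@dataclass
class AGArgs:
    # Field names mirror the fix_args keys used in this PR;
    # the default values are placeholders for illustration only.
    output_dir: str = "ag_output"
    label_column: str = "label"
    dataset_name: str = "dataset"

    @classmethod
    def from_user_kwargs(cls, user_kwargs=None):
        """Build AGArgs from user input, rejecting unknown keys (hypothetical helper)."""
        user_kwargs = user_kwargs or {}
        known = {f.name for f in fields(cls)}
        unknown = set(user_kwargs) - known
        if unknown:
            raise ValueError(f"Unknown AGArgs keys: {sorted(unknown)}")
        return cls(**user_kwargs)
```

With this shape, fit() could call AGArgs() for the default settings, or AGArgs.from_user_kwargs(...) when the user supplies overrides that need checking.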

score = automl.model.estimator.evaluate(test_dataset)
print(f"Inference on test set complete, {metric}: {score}")
del automl
gc.collect()
\ No newline at end of file

Contributor

Add a newline at the end of the file.

Collaborator Author

fixed

"gpu_per_trial": 0,
"max_iter": 2,
"time_budget": 50,
"task": "mm_multi",

Contributor

rename mm_multi -> multimodal-classification

flaml/model.py Outdated
# train_data = self._kwargs["train_data"]
import pandas as pd
train_data = pd.concat([X_train, y_train], axis=1)
tuning_data = pd.concat([X_train, y_train], axis=1)

Contributor

You mean X_val, y_val?

Collaborator Author

I will remove this line since the tuning data is not necessary anymore.

flaml/model.py Outdated

self.fix_args = fix_args

def _init_hp_config(self, text_backbone: str, multimodal_fusion_strategy: str):

Contributor

Please define cfg by defining a function inside of flaml/nlp/utils.py:class AGArgs, then remove this function.

Collaborator Author

This _init_hp_config uses the AGArgs and self.params to build the hyperparameters dictionary for the TextPredictor. If it is removed, we still need to assemble this dictionary inside MultimodalEstimator.fit(). Do you think it is better to drop this function and have this part inside .fit()?

Contributor

Move this function into a method of AGArgs, because AGArgs is for managing the config for AG.
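
One way this could be structured is sketched below, assuming AGArgs holds the two arguments of _init_hp_config (text_backbone and multimodal_fusion_strategy) and merges them with the tuned self.params. The method name to_hp_config, the default strings, and the flat dictionary keys are all hypothetical; the real TextPredictor hyperparameters layout is not shown in this thread and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class AGArgs:
    # Placeholder defaults; real backbone and fusion names depend on AutoGluon.
    text_backbone: str = "backbone-placeholder"
    multimodal_fusion_strategy: str = "fusion-placeholder"

    def to_hp_config(self, params: dict) -> dict:
        """Combine the fixed args with tuned params into one dict (hypothetical layout)."""
        cfg = {
            "text_backbone": self.text_backbone,
            "multimodal_fusion_strategy": self.multimodal_fusion_strategy,
        }
        # Values sampled by the search (e.g. self.params) are layered on top.
        cfg.update(params)
        return cfg
```

The estimator's fit() would then only need something like hyperparameters = self.ag_args.to_hp_config(self.params), keeping the config assembly next to the config definition.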

flaml/data.py Outdated
)
SEQREGRESSION = "seq-regression"
- REGRESSION = ("regression", SEQREGRESSION)
+ REGRESSION = ("regression", "mm_regression", SEQREGRESSION)

Contributor

Rename "mm_regression" -> "multimodal-regression", and define a static variable for it.
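
A minimal sketch of the requested change, following the pattern already used for SEQREGRESSION in the diff above. The constant name MM_REGRESSION and the is_regression helper are assumptions for illustration, not names from the PR.

```python
SEQREGRESSION = "seq-regression"
# Hypothetical constant name for the renamed task string.
MM_REGRESSION = "multimodal-regression"
REGRESSION = ("regression", MM_REGRESSION, SEQREGRESSION)


def is_regression(task: str) -> bool:
    """Return True if the task name is one of the regression tasks."""
    return task in REGRESSION
```

Referring to the constant instead of the literal keeps any later rename to a single line.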

flaml/data.py Outdated
SEQCLASSIFICATION,
MULTICHOICECLASSIFICATION,
TOKENCLASSIFICATION,
"mm_multi",

Contributor

Can you automatically detect "mm_multi" and "mm_binary" so we don't need these two values anymore?
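
One possible way to do the detection is to count the distinct labels in the training target, so a single classification task name can cover both cases. The sketch below is an illustration under that assumption; the function name and the returned strings are hypothetical, and it does not handle missing labels.

```python
def infer_classification_kind(labels) -> str:
    """Return "binary" or "multiclass" based on the number of distinct labels."""
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("Classification needs at least two distinct labels.")
    return "binary" if n_classes == 2 else "multiclass"
```

Because it only iterates over the labels, the same helper accepts a list, a NumPy array, or a pandas Series such as y_train.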

flaml/model.py Outdated

# train_data = self._kwargs["train_data"]
import pandas as pd
train_data = pd.concat([X_train, y_train], axis=1)

Contributor

Please use the estimator._join method. See TransformersEstimator._join.

Collaborator Author

fixed
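
For context, a join helper of this kind might look roughly like the sketch below. It is not the actual TransformersEstimator._join implementation; the function name and the label_column default are assumptions. The point it illustrates is aligning the labels to the feature index before joining, which a bare pd.concat([X_train, y_train], axis=1) does not guarantee when the two indexes differ.

```python
import pandas as pd


def join_features_and_label(X: pd.DataFrame, y, label_column: str = "label") -> pd.DataFrame:
    """Attach labels to the feature frame as one extra column (hypothetical helper)."""
    # Re-index the labels on X's index so rows pair up by position,
    # even if y arrived with a different index or as a plain list.
    y = pd.Series(list(y), index=X.index, name=label_column)
    return X.join(y)
```

The same helper could be applied to (X_val, y_val) if separate tuning data were ever needed, which avoids the copy-paste slip flagged earlier in this review.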

flaml/model.py Outdated
save_dir = self.fix_args["output_dir"]
label_column = self.fix_args["label_column"]
dataset_name = self.fix_args["dataset_name"]
ag_model_save_dir = os.path.join(save_dir, f"{dataset_name}_ag_text_multimodal_{text_backbone}\

Contributor

OK. Can you use the original directory save_dir instead of the modified directory ag_model_save_dir, so users know where to find the saved model?

@Qiaochu-Song Qiaochu-Song changed the title MxTextPredictor TextPredictor May 20, 2022
@thinkall thinkall added the wontfix This will not be worked on label Jan 20, 2026