Description
Discussed in #1224
Originally posted by LeonardoEssence September 21, 2023
I was running some of the AutoML examples from the documentation, and the code for every time series example kept failing with a pandas KeyError. See below:
Traceback (most recent call last):
  File "/mnt/uni_variate_time_series_flaml.py", line 30, in <module>
    automl.fit(dataframe=train_df, # training data
  File "/opt/conda/lib/python3.9/site-packages/flaml/automl/automl.py", line 1663, in fit
    task.validate_data(
  File "/opt/conda/lib/python3.9/site-packages/flaml/automl/task/time_series_task.py", line 167, in validate_data
    data = TimeSeriesDataset(
  File "/opt/conda/lib/python3.9/site-packages/flaml/automl/time_series/ts_data.py", line 57, in __init__
    self.frequency = pd.infer_freq(train_data[time_col].unique())
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/frame.py", line 3505, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
    raise KeyError(key) from err
KeyError: 'index'
I dug into the code and found what I believe is a small bug in the TimeSeriesTask class, where validate_data constructs a TimeSeriesDataset on line 167 of time_series_task.py.
TimeSeriesDataset expects a data frame that contains both the training data and the timestamp column. However, the code on line 165 concatenates only Xt and yt, leaving out the time column.
I propose changing line 165 from df_t = pd.concat([Xt, yt], axis=1) to df_t = pd.concat([pre_data.all_data[pre_data.time_col], Xt, yt], axis=1). That worked for me, though I'm not 100% sure it matches the intended functionality. As it stands, the code does not work.
Is anybody else seeing the same thing, or can anyone offer some suggestions?
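To make the failure easier to reproduce without FLAML installed, here is a minimal pandas-only sketch of what I think is happening. The data frame, the column names "feature" and "target", and the variable all_data are made up for illustration (all_data stands in for pre_data.all_data); only the two pd.concat patterns and the pd.infer_freq call mirror the lines quoted above.

```python
import pandas as pd

# Hypothetical stand-ins for the objects named in the issue; the column
# names "feature" and "target" are assumptions made for this sketch.
time_col = "index"
all_data = pd.DataFrame(
    {
        "index": pd.date_range("2023-01-01", periods=5, freq="D"),
        "feature": [1.0, 2.0, 3.0, 4.0, 5.0],
        "target": [10.0, 11.0, 12.0, 13.0, 14.0],
    }
)
Xt = all_data[["feature"]]
yt = all_data["target"]

# Pattern from line 165 as it is today: the time column is left out,
# so looking it up afterwards raises KeyError, as in the traceback.
df_without_time = pd.concat([Xt, yt], axis=1)
try:
    df_without_time[time_col]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Proposed pattern: include the time column in the concatenation,
# so the frequency can be inferred from it.
df_with_time = pd.concat([all_data[time_col], Xt, yt], axis=1)
inferred = pd.infer_freq(df_with_time[time_col].unique())

print("lookup failed without time column:", lookup_failed)
print("inferred frequency with time column:", inferred)
```

This only shows that the concatenated frame lacks the time column unless it is added explicitly. It does not confirm that this is how the maintainers intend validate_data to behave.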