I'm trying to forecast multiple time series. For this I have a Prophet function that makes a prediction for a certain date in the future.
When I run it with multiprocessing, I get this error: DataFrame constructor not properly called!
I noticed that it happens because I'm not creating the future dataframe with make_future_dataframe(periods=…).
How can I make it work?
My goal is to compute the percentage difference between two dates separated by one month, two months, etc. With a for loop I can compute the average (per group) percentage difference between two dates. Is it possible to do the same using multiprocessing?
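For concreteness, the calculation described above can be sketched as follows. The function names and the sample values are hypothetical and only illustrate the percentage difference and its average over a group; they are not taken from the real data.

```python
def pct_diff(earlier, later):
    """Percentage change from the value at the earlier date to the later one."""
    if earlier == 0:
        raise ZeroDivisionError("percentage change is undefined for a zero base value")
    return (later - earlier) / earlier * 100.0


def mean_pct_diff(pairs):
    """Average percentage change over (earlier, later) pairs, e.g. one pair per ticker."""
    changes = [pct_diff(a, b) for a, b in pairs]
    return sum(changes) / len(changes)


# Hypothetical forecast values for two tickers at the two dates being compared.
pairs = [(100.0, 110.0), (50.0, 45.0)]
print(mean_pct_diff(pairs))
```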
def train_and_forecast(group):
    prevmonth = datetime.date(2021, 9, 1)
    timedelta = relativedelta(months=6)
    # Initiate the model
    m = Prophet(weekly_seasonality=False, yearly_seasonality=False, changepoint_range=0.8, changepoint_prior_scale=0.8)
    if group.shape[0] > 2:
        # Fit the model
        m.fit(group)
        # Make predictions
        future_date = pd.DataFrame(prevmonth+timedelta)
        #future = m.make_future_dataframe(periods=30)
        global forecast
        forecast = m.predict(future_date)[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]
        #forecast = m.predict(future)[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]
        forecast['ticker'] = group['ticker'].iloc[0]
        # Return the forecasted results
        return forecast[['ds', 'ticker', 'yhat', 'yhat_upper', 'yhat_lower']], m
#def MultiProcessing(prev):
from multiprocessing import Pool, cpu_count
# Progress bar
from tqdm import tqdm
# Start time
start_time = time()
# Get time series data for each ticker and save in a list
series = [groups_by_ticker.get_group(ticker) for ticker in ticker_list]
#multiprocess_forecast=pd.DataFrame()
# Create a pool process with the number of worker processes being the number of CPUs
p = Pool(cpu_count())
timedelta = relativedelta(months=6)
# Make predictions for each ticker and save the results to a list
predictions = list(tqdm(p.imap(train_and_forecast, series), total=len(series)))
# DATE
# Terminate the pool process
p.close()
# Tell the pool to wait till all the jobs are finished before exit
p.join()
# Concatenate results
#multiprocess_forecast.append(predictions) #= pd.concat(predictions)
# Get the time used for the forecast
testtime = time() - start_time
print(testtime)
Instead of
future_date = pd.DataFrame(prevmonth+timedelta)
create a future dataframe using the make_future_dataframe() function (docs) in Prophet, like:
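A minimal sketch of that suggestion, assuming daily data and a model `m` that has already been fitted. The helper name `forecast_for_date` is hypothetical; it relies only on Prophet's `make_future_dataframe(periods=..., freq=..., include_history=...)` and `predict`. The original line fails because `pd.DataFrame` is given a bare date rather than columnar data, whereas `predict` expects a frame with a `ds` column, which `make_future_dataframe` builds.

```python
import pandas as pd


def forecast_for_date(m, history, target_date):
    """Forecast one future date with a fitted Prophet model `m`.

    `history` is the frame the model was fitted on (it has a `ds` column).
    Assumes daily frequency; adjust `freq` and the period count otherwise.
    """
    last_date = pd.to_datetime(history["ds"]).max()
    target = pd.Timestamp(target_date)
    # Number of daily steps from the end of the history to the target date.
    periods = (target - last_date).days
    if periods <= 0:
        raise ValueError("target_date must be after the last date in history")
    # Future frame with a `ds` column covering last_date + 1 day .. target.
    future = m.make_future_dataframe(periods=periods, freq="D", include_history=False)
    forecast = m.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]
    # Keep only the row for the requested date.
    return forecast[forecast["ds"] == target]


# Usage inside train_and_forecast, after m.fit(group):
# forecast = forecast_for_date(m, group, prevmonth + timedelta)
```

Because the helper returns a plain DataFrame, the worker can still attach the `ticker` column and return the result from each process as before.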