Get a column value five days after a specific date

32 Views Asked by At

I am building a logistic regression model, and for the target variable, I would like to get the Close price 5 days after an event has occurred.

import simfin as sf

# Load historical data
sf.set_data_dir('simfin_data')
sf.set_api_key(api_key='')

prices = sf.load(dataset='shareprices', variant='daily', market='us')
df = prices.loc[prices['Ticker'] == 'TSLA']
df.index = pd.to_datetime(df['Date'])
df = df[['High', 'Low', 'Open', 'Close', 'Volume']]


# Set Target Variable
load_target = sf.load(dataset='income', variant='quarterly', market='us')
load_target = load_target.loc[load_target['Ticker'] == 'TSLA']
event_close = df.loc[load_target['Publish Date'], 'Close']

# Get the Close price 5 days after 'Publish Date'
df['target'] = 
1

There are 1 best solutions below

1
AyyBeeShafi On
# Get the Close price 5 days after 'Publish Date'
df['target'] = df['Close'].shift(-5)

# Align the 'target' values with the corresponding 'Publish Date'
df['target'] = df['target'].loc[load_target['Publish Date']]

# Drop rows where 'Publish Date' is NaN (not present in share prices dataset)
df = df.dropna(subset=['target'])

Note that shifting the 'Close' prices by 5 rows might result in NaN values for the last 5 rows in the 'target' column. You can drop or fill these NaN values based on your preference and the requirements of your logistic regression model.