Calculate the predicted value based on coefficients and a constant in Python

97 Views Asked by Mostafa Bouzari At 14 September 2023 at 19:08

I have the coefficients and the constant (alpha). I want to multiply the values and add them together as in the example below. (It has to be done for 300,000 rows.)

Prediction = constant + (valOfRow1 * col1) + (-valOfRow1 * col2) + (-valOfRow1 * col3) + (valOfRow1 * col4) + (valOfRow1 * col5)

Prediction = 222 + (555-07 * col1) + (-555-07 * col2) + (-66* col3) + (55* col4) + (777* col5)

I have a one-row DataFrame that contains the coefficients and the constant, like this:

	col1	col2	col3	col4	col5	constant
	2.447697e-07	-5.214072e-07	-0.000003	0.000006	555	222

and another DataFrame with exactly the same column names, but holding the monthly values:

col1	col2	col3	col4	col5
16711	17961	0	20	55
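As a sanity check, the formula can be evaluated directly on the sample row above. This is only a minimal sketch: the names coefficients, values and feature_cols are made up for illustration, and the numbers are the ones from the two tables.

```python
import pandas as pd

# One-row frame with the coefficients and the constant (sample values from above)
coefficients = pd.DataFrame({
    "col1": [2.447697e-07],
    "col2": [-5.214072e-07],
    "col3": [-0.000003],
    "col4": [0.000006],
    "col5": [555],
    "constant": [222],
})

# Monthly values, same column names but without the constant
values = pd.DataFrame({
    "col1": [16711], "col2": [17961], "col3": [0], "col4": [20], "col5": [55],
})

feature_cols = ["col1", "col2", "col3", "col4", "col5"]
coef_row = coefficients.loc[0, feature_cols]   # Series indexed by column name
constant = coefficients.loc[0, "constant"]

# Multiply each column by its coefficient (matched by name), sum per row, add the constant
prediction = constant + (values[feature_cols] * coef_row).sum(axis=1)
print(prediction.iloc[0])  # roughly 30746.9948
```

A hand-computed value like this is a convenient reference when comparing against Excel.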

I already tried sorting the columns and then taking their product with df.dot:

selected_columns = selected_columns.sort_index(axis=1)
# in the mean_coefficients DataFrame the 21st column (starting from 0) is the constant, so I use the other columns
selected_columns['predicted_Mcap']=selected_columns.dot(mean_coefficients.iloc[:,0:20])+mean_coefficients['const']

The reason I use mean_coefficients.iloc[:,0:20] is that I don't want to include const in the multiplication; it just needs to be added at the end.
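One thing worth ruling out is that a positional slice picks the wrong columns, for example after a sort changes where const sits. A possible sketch, using tiny made-up frames in place of the real ones, separates const by name instead of by position:

```python
import pandas as pd

# Toy stand-ins for the real frames (values are invented for illustration)
mean_coefficients = pd.DataFrame({"const": [10.0], "b": [2.0], "a": [1.0]})
selected_columns = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

# Split off the constant by name, so its position in the frame does not matter
const = mean_coefficients["const"].iloc[0]
coefs = mean_coefficients.drop(columns="const").iloc[0]  # Series indexed by column name

# Make sure both sides refer to exactly the same set of columns
feature_cols = selected_columns.columns
mismatch = feature_cols.symmetric_difference(coefs.index)
if len(mismatch) > 0:
    raise ValueError(f"columns do not match: {list(mismatch)}")

# Compute first, assign afterwards, so the new column is not treated as a feature
predicted = selected_columns[feature_cols].dot(coefs.loc[feature_cols]) + const
selected_columns["predicted_Mcap"] = predicted
print(selected_columns)
```

Whether this explains the difference from Excel depends on the actual column layout, which is not shown here.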

So I calculated the predicted value, but when I checked it in Excel the value wasn't the same.

Am I calculating it right?


There are 2 best solutions below

Mostafa Bouzari On 15 September 2023 at 08:40 BEST ANSWER

As mentioned in the df.dot() documentation, the column names of the DataFrame and the index of other must contain the same values, as they will be aligned prior to the multiplication. Otherwise you'll get

ValueError: matrices are not aligned

So you have 2 options:

Option 1: use df.dot() with the transposed DataFrame (.T). The column names then become the index, so the frame is ready for matrix multiplication. Remember that the column names in both DataFrames have to be the same; even one extra column raises the error.

selected_columns['predicted_MCAP']=selected_columns.dot(mean_coefficients.iloc[:,1:21].T) + mean_coefficients['const']

Option 2: to work around the alignment requirement, pass a NumPy array instead, so that pandas does not try to match labels. The array's shape and column order still have to be compatible with the DataFrame.

result = df1.dot(df2.values)
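Both options, and the error itself, can be reproduced on a toy example. The sketch below uses invented two-column frames rather than the real 20-column data, and drops const by name instead of by position, which is an assumption about how the frames are laid out.

```python
import pandas as pd

# Toy frames (invented values); the real data has more columns
mean_coefficients = pd.DataFrame({"const": [222.0], "col1": [2.0], "col2": [-1.0]})
selected_columns = pd.DataFrame({"col1": [1.0, 3.0], "col2": [4.0, 5.0]})

coefs = mean_coefficients.drop(columns="const")  # shape (1, 2), index is [0]
const = mean_coefficients["const"].iloc[0]

# Without the transpose, the data columns (col1, col2) do not match the index [0]
try:
    selected_columns.dot(coefs)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)  # matrices are not aligned

# Option 1: transpose, so the coefficient names become the index and get aligned
pred_aligned = selected_columns.dot(coefs.T)[0] + const

# Option 2: plain NumPy array; no label alignment, so the order is fixed explicitly
coef_array = coefs[selected_columns.columns].to_numpy().ravel()
pred_numpy = selected_columns.dot(coef_array) + const

print(pred_aligned.tolist(), pred_numpy.tolist())
```

With the array version, a column order mismatch gives a wrong number rather than an error, which is why the columns are reordered before converting.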

Rodrigo Lucchesi On 14 September 2023 at 20:35

Check if this method can solve your task:

import pandas as pd

# Load the coefficients and variables data frames
# (copy the matching table to the clipboard before running each line)
df_coefficients = pd.read_clipboard()
df_variables = pd.read_clipboard()


def predict(df_coefficients: pd.DataFrame, df_variables: pd.DataFrame) -> pd.DataFrame:
    """
    Predicts the value of the dependent variable based on the values of the independent variables.
    :param df_coefficients: DataFrame with the coefficients of the independent variables.
    :param df_variables: DataFrame with the values of the independent variables.
    :return: DataFrame with one 'prediction' column holding the predicted values.
    """
    result = []
    # Split off the constants; drop() returns a copy, so the caller's DataFrame is left untouched
    constants = df_coefficients['constant']
    coefficients = df_coefficients.drop(columns=['constant'])

    # Iterate over the rows of the coefficients DataFrame and calculate the prediction
    for pos, val in enumerate(constants):
        # The product is aligned by column name, then summed across the row
        prediction: float = val + (coefficients.iloc[pos] * df_variables.iloc[pos]).sum()
        print(f'prediction {pos}: {prediction}')
        result.append(prediction)
    return pd.DataFrame({'prediction': result})


result = predict(
    df_coefficients=df_coefficients, 
    df_variables=df_variables
)
result

prediction: 30746.99484535174

Best!
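For the 300,000-row case from the question, where a single coefficient row applies to every data row, a vectorized variant may be more practical than a Python loop. The following is a sketch under that assumption; predict_vectorized and its constant_col parameter are hypothetical names, not part of the answer above.

```python
import numpy as np
import pandas as pd


def predict_vectorized(df_coefficients: pd.DataFrame,
                       df_variables: pd.DataFrame,
                       constant_col: str = "constant") -> pd.Series:
    """Apply the first coefficient row to every row of df_variables."""
    constant = df_coefficients[constant_col].iloc[0]
    coefs = df_coefficients.drop(columns=constant_col).iloc[0]

    # Both frames must describe the same set of columns
    if set(coefs.index) != set(df_variables.columns):
        raise ValueError("coefficient and variable columns differ")

    # Put the coefficients in the same order as the data columns
    coefs = coefs.reindex(df_variables.columns)

    # One matrix-vector product for all rows, then add the constant
    values = df_variables.to_numpy(dtype=float) @ coefs.to_numpy(dtype=float) + constant
    return pd.Series(values, index=df_variables.index, name="prediction")


# Usage with the sample row from the question, repeated three times
cols = ["col1", "col2", "col3", "col4", "col5"]
df_coefficients = pd.DataFrame(
    [[2.447697e-07, -5.214072e-07, -0.000003, 0.000006, 555, 222]],
    columns=cols + ["constant"],
)
df_variables = pd.DataFrame(np.tile([16711, 17961, 0, 20, 55], (3, 1)), columns=cols)

result = predict_vectorized(df_coefficients, df_variables)
print(result)
```

The inputs are not modified, and the result keeps the index of df_variables so it can be assigned back as a new column.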
