How to iteratively add tuple elements to a dataframe as new columns?

I am using the statsmodels.stats.multitest.multipletests function to correct p-values that I have stored in a dataframe:

import pandas as pd
import statsmodels.stats.multitest as multi

p_value_df = pd.DataFrame({"id": [123456, 456789], "p-value": [0.098, 0.05]})

for _, row in p_value_df.iterrows():
    p_value = row["p-value"]
    print(p_value)
    results = multi.multipletests(
        p_value,
        alpha=0.05,
        method="bonferroni",
        maxiter=1,
        is_sorted=False,
        returnsorted=False,
    )
    print(results)

which prints one tuple per row: [image of the output not available]

I would really like to add each of the elements of the tuple output as a new column in the p_value_df and am a bit stuck.

I've attempted to convert the results to a list and use zip(*tuples_converted_to_list), but because some of the elements are plain floats rather than arrays, this throws an error.

Additionally, I'd like to pull the array elements so that array([False]) is just False.

Can anyone recommend a strategy for doing this?

Best answer, by Timeless

I would use a list comprehension to build a nested list of the multipletests results (one inner list per p-value), pass it to the DataFrame constructor, and finally join the result with the original p_value_df:

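import numpy as np
import pandas as pd
import statsmodels.stats.multitest as multi

def fn(pval):
    # Run the correction on a single p-value, as in the question.
    return multi.multipletests(
        pval, alpha=0.05, method="bonferroni",
        maxiter=1, is_sorted=False, returnsorted=False,
    )

# One inner list per p-value; unwrap single-element arrays
# so that array([False]) becomes just False.
rows = [
    [e[0] if isinstance(e, np.ndarray) and e.size == 1 else e
     for e in fn(pval)] for pval in p_value_df["p-value"]
]

# Names of the four elements returned by multipletests, in order.
cols = ["reject", "pvals_corrected", "alphacSidak", "alphacBonf"]

# join aligns on the index; this relies on p_value_df having
# the default 0..n-1 index.
out = p_value_df.join(pd.DataFrame(rows, columns=cols))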

Output:

print(out)

       id  p-value  reject  pvals_corrected  alphacSidak  alphacBonf
0  123456    0.098   False            0.098         0.05        0.05
1  456789    0.050    True            0.050         0.05        0.05
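
Note that in this output pvals_corrected is identical to the raw p-value: each call to multipletests sees a single test, so Bonferroni multiplies by 1. If the goal is to correct all p-values in the column as one family (an assumption about the intent, the question does not say so), a minimal sketch is to pass the whole column in a single call and assign the returned arrays directly. The variable names below are illustrative only:

```python
import pandas as pd
import statsmodels.stats.multitest as multi

p_value_df = pd.DataFrame({"id": [123456, 456789], "p-value": [0.098, 0.05]})

# One call over the whole column: both p-values are treated as one family,
# so the Bonferroni correction multiplies each p-value by the number of tests.
reject, pvals_corrected, alphac_sidak, alphac_bonf = multi.multipletests(
    p_value_df["p-value"].to_numpy(), alpha=0.05, method="bonferroni"
)

# reject and pvals_corrected are arrays with one entry per row;
# the two corrected alpha levels are scalars and are broadcast to every row.
out = p_value_df.assign(
    reject=reject,
    pvals_corrected=pvals_corrected,
    alphacSidak=alphac_sidak,
    alphacBonf=alphac_bonf,
)
print(out)
```

With two tests, the corrected p-values become 0.196 and 0.1, and neither is rejected at alpha=0.05. No loop or array unwrapping is needed, because the returned arrays already line up with the rows.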