Is there a Python function to identify Boolean columns in a large dataset with 30+ columns?
The beneficiary summary file has several chronic illness columns for each member. These are Boolean fields. 1) Convert these columns into a single categorical variable, concatenating multiple True diagnoses. 2) If a member has 3 or more chronic conditions, categorise them as "Multiple".
This is the link to the data set
These are the chronic illness columns:
SP_ALZHDMTA
SP_CHF
SP_CHRNKIDN
SP_CNCR
SP_COPD
SP_DEPRESSN
SP_DIABETES
SP_ISCHMCHT
SP_OSTEOPRS
SP_RA_OA
SP_STRKETIA
I'm assuming that the value 2 corresponds to having the illness and 1 otherwise. The boolean values of all illnesses can be combined into a single integer column by assigning a unique bit position to each illness. You then set the bit for each illness that a given row has, and combine those bits using the bitwise OR (|) operator. Meanwhile, you can keep a count of the number of illnesses for each row in a separate column. In total, there are 2^11 = 2048 possible combinations of illnesses, so each value is an integer in the range 0-2047.
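Here is a minimal sketch of that idea with pandas and NumPy. It keeps my assumption that 2 means "has the illness" (change `HAS_ILLNESS` if your data is coded the other way round). The function name `encode_illnesses`, the output column names `ILLNESS_CODE`, `ILLNESS_COUNT` and `ILLNESS_CATEGORY`, the "+" separator and the "None" label for members with no conditions are all made up for illustration. The last column is one possible way to get the categorical variable from the question, with 3 or more conditions labelled "Multiple".

```python
import numpy as np
import pandas as pd

ILLNESS_COLS = [
    "SP_ALZHDMTA", "SP_CHF", "SP_CHRNKIDN", "SP_CNCR", "SP_COPD",
    "SP_DEPRESSN", "SP_DIABETES", "SP_ISCHMCHT", "SP_OSTEOPRS",
    "SP_RA_OA", "SP_STRKETIA",
]

# Assumption: 2 = has the illness, 1 = does not. Verify against the data dictionary.
HAS_ILLNESS = 2


def encode_illnesses(df, cols=ILLNESS_COLS, positive=HAS_ILLNESS):
    """Return a copy of df with a bitmask code, a count and a category label."""
    # Boolean matrix: one row per member, one column per illness.
    flags = df[cols].eq(positive).to_numpy()

    # Each illness gets its own bit position; OR the bits together per row.
    code = np.zeros(len(df), dtype=np.int64)
    for bit in range(len(cols)):
        code |= flags[:, bit].astype(np.int64) << bit

    def label(row_flags):
        names = [c for c, f in zip(cols, row_flags) if f]
        if len(names) >= 3:
            return "Multiple"
        return "+".join(names) if names else "None"

    out = df.copy()
    out["ILLNESS_CODE"] = code                  # integer in 0..2**len(cols) - 1
    out["ILLNESS_COUNT"] = flags.sum(axis=1)    # number of illnesses per row
    out["ILLNESS_CATEGORY"] = [label(r) for r in flags]
    return out
```

You would call it as `result = encode_illnesses(df)` after loading the beneficiary file into `df`. To check whether a given row has a particular illness from the code alone, test its bit, e.g. `result["ILLNESS_CODE"] & (1 << ILLNESS_COLS.index("SP_COPD"))` is non-zero for members with COPD.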