I'm using the AI Fairness 360 package to get fairness metrics on a dataset. I've already converted the data to a StandardDataset instance. If I understand correctly, this will change all values of protected attributes to 1 or 0: 1 meaning "belongs to a privileged group for this attribute", and 0 meaning "belongs to an unprivileged group for this attribute".
When calculating fairness metrics, I need to create a BinaryLabelDatasetMetric instance for which I need to specify which combinations of protected attributes I consider my privileged/unprivileged groups. But why do I need to provide the attribute values that are privileged/unprivileged? After converting to a StandardDataset all privileged values are 1 and unprivileged are 0. Am I missing something? Because if not, just coding it as always 1 is much easier.
So in summary, my question is: can the values for protected attributes in a StandardDataset ever be anything other than 1 or 0? If yes, in what case? (If no, it seems the API could be simplified a lot, by just requiring the names of the protected attributes and not the values.)
Yes. They can also be other values in the original dataset, but they will be converted to 0 and 1 regardless once transformed by the StandardDataset. From the source:

You can also check out the example here to see that in action by altering the gender attribute (e.g., replacing 0 and 1 with 'female' and 'male' right before passing to StandardDataset).

Having to specify the values seems unnecessary if the privileged class is already coded as 1 and the unprivileged class as 0. But if this is not the case, requiring that encoding would mean a user needs to manipulate the original dataset, which is not desirable.
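To make the conversion described above concrete, here is a minimal pure-Python sketch of the idea. It is not AIF360's implementation: `binarize_protected` is a hypothetical helper written only for illustration, and the `gender` values are made up. It assumes the privileged classes are given as a list of raw values, and maps those to 1 and everything else to 0.

```python
def binarize_protected(values, privileged_classes):
    """Hypothetical helper (not part of AIF360): map each raw value
    to 1.0 if it is listed as privileged, otherwise to 0.0."""
    privileged = set(privileged_classes)
    return [1.0 if v in privileged else 0.0 for v in values]


# Made-up raw column that uses strings instead of 0/1.
gender = ["female", "male", "male", "female"]
encoded = binarize_protected(gender, privileged_classes=["male"])
print(encoded)

# After such a conversion, the group specifications would refer to the
# encoded values, in the attribute-name-to-value style the question describes.
privileged_groups = [{"gender": 1.0}]
unprivileged_groups = [{"gender": 0.0}]
```

With an encoding like this, the privileged group is always the one coded as 1. The group dictionaries still name each attribute explicitly, which is what allows groups to be defined over combinations of attributes.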