discrimination depends on both the actual class label and on the sensitive value, but who is being discriminated against is decided at random, i.e., independently of the other attribute values. We now show how to find likely latent class labels, i.e., how to discover who is likely being discriminated against.
Finding likely latent values
We need to find good values to assign to the latent attribute in every row from the
data-set. Essentially, this is a problem of finding two groups (or clusters) of rows:
the ones that should have gotten a positive label, and those that should have gotten
a negative label. We now briefly describe the standard approach of expectation
maximization (EM) that is commonly used to find such clusters. The reader is referred to (Bishop, 2006) for a more detailed description of this algorithm.
Given a model M with a latent attribute L, the goal of the expectation maximization algorithm is to set the parameters of M such that they maximize the likelihood of the data-set, i.e., the probability of the data-set given the model. Unfortunately, since L is unobserved, the parameters involving L can be set in many different ways. Searching all of these settings for the optimal one is a hopeless task. Instead, expectation maximization optimizes these settings by fitting them to the data-set (the M-step), then calculates the expected values of the latent attribute given
those settings (the E-step), incorporates these back into the data-set, and iterates.
This is a greedy procedure that converges to a local optimum of the likelihood
function. Typically, random restarts are applied (randomizing the initial values of
the latent variable) in order to find better latent values.
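As a concrete illustration of the E-step/M-step iteration, the sketch below runs EM for two clusters over rows of binary attributes, with the soft cluster membership playing the role of the latent attribute L. It is a minimal sketch only: the Bernoulli-mixture model, the data encoding, and the function name are assumptions made here for illustration, not the exact model used in this chapter.

    # Minimal EM sketch: two latent clusters over rows of 0/1 attributes.
    import numpy as np

    def em_two_clusters(X, n_iter=100, seed=0):
        """X: (n_rows, n_attrs) array of 0/1 values. Returns P(L = 1 | row)."""
        rng = np.random.default_rng(seed)
        resp = rng.random(X.shape[0])      # random restart: initial latent values
        for _ in range(n_iter):
            # M-step: fit the parameters to the (soft-)completed data-set.
            pi = resp.mean()                                           # P(L = 1)
            t1 = (resp[:, None] * X).sum(0) / resp.sum()               # P(attr = 1 | L = 1)
            t0 = ((1 - resp)[:, None] * X).sum(0) / (1 - resp).sum()   # P(attr = 1 | L = 0)
            # E-step: expected value of the latent attribute for every row.
            log1 = np.log(pi) + (X * np.log(t1 + 1e-9)
                                 + (1 - X) * np.log(1 - t1 + 1e-9)).sum(1)
            log0 = np.log(1 - pi) + (X * np.log(t0 + 1e-9)
                                     + (1 - X) * np.log(1 - t0 + 1e-9)).sum(1)
            resp = 1.0 / (1.0 + np.exp(log0 - log1))
        return resp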
Using prior information
For the problem of finding the actual discrimination-free class labels, we can do far better than simply running EM and hoping that the solution it finds corresponds
to discrimination-free labels. For starters, it makes no sense to modify the labels of
rows with favored sensitive values and negative class labels. The same holds for
rows with discriminated sensitive values and positive class labels. Modifying
these can only result in more discrimination, so we fix the latent values of these
rows to be identical to the class labels in the data-set and remove them from the
E-step of the EM algorithm.
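In code, fixing these rows amounts to clamping their latent values after every E-step rather than re-estimating them. The sketch below assumes the encoding S == 1 for the favored sensitive value and C == 1 for the positive class label; both conventions are illustrative choices, not prescribed by the chapter.

    # Clamp the latent values of rows whose labels must not change
    # (assumed encoding: S == 1 favored group, C == 1 positive class label).
    import numpy as np

    def clamp_fixed_rows(resp, C, S):
        resp = resp.copy()
        resp[(S == 1) & (C == 0)] = 0.0  # favored and negative: keep the negative label
        resp[(S == 0) & (C == 1)] = 1.0  # discriminated and positive: keep the positive label
        return resp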
Another improvement over blindly applying EM is to incorporate prior knowledge of the distribution P(C | L, S). In fact, since the ultimate goal is to achieve
zero discrimination, we can pre-compute this entire distribution. We show how to
do this using an example.
Example
Suppose we have a data-set consisting of 100 rows of people, distributed according to the following occurrence counts:
                 Low income    High income
    Female           30             20
    Male             10             40
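For instance, the positive-label rate of each group can be read off directly from these counts; the short computation below assumes that "high income" plays the role of the positive class label. A discrimination-free labelling would make the two rates equal, i.e., drive their difference to zero.

    # Observed acceptance rates implied by the counts above (high income = positive).
    p_female = 20 / (30 + 20)           # 0.4
    p_male = 40 / (10 + 40)             # 0.8
    discrimination = p_male - p_female  # 0.4; a discrimination-free labelling makes this 0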