- The design, implementation and testing of classification and clustering
  techniques that produce non-discriminatory models of the data by design;
  i.e., the techniques will be constrained in such a way that they can only
  produce models that are non-discriminatory. These techniques will offer
  a safe alternative to the current techniques which only offer a false
  comfort of providing unbiased solutions.
- Based upon research in the previous two topics, we may discover
  situations and new mechanisms through which discrimination can
  take place. These discoveries can be profitably used to update current
  legislation.
In this book the first two orthogonal directions are discussed. The first direction
concerns the detection of discrimination in a dataset, whereas the second concerns
avoiding discrimination. For both directions, the development of adequate
technological solutions is a necessity to implement discrimination control in
practice. As detailed in Chapter 3, the application of data mining techniques may
lead to subtle forms of indirect discrimination, even if there was no direct
intention to discriminate. Because data mining is a very active research domain
that continues to develop, there will be an increasing need for
sophisticated discrimination detection techniques. The situation can be compared
to that of spam email; without proper spam filters and techniques to investigate the
originator of spam, legislation outlawing spam has little power. But even with
spam filters, spam detection remains a moving target. As soon as new detection
techniques are developed, spammers change their strategy to fool the filters. A
similar race can be expected in discrimination detection; legislation alone will not
suffice to stop discriminatory practices in large-scale profiling by companies or
governmental institutions. As discrimination detection techniques improve,
profiling software will exploit increasingly subtle and ingenious ways to
circumvent restrictions imposed by technological solutions. This will happen not
for the purpose of discriminating per se, but for reasons of predictive accuracy or
efficiency: as long as sensitive attributes such as ethnicity serve as a proxy and
indirectly provide otherwise inaccessible information relevant to the profiling
task, it will remain attractive to use sensitive information either directly or indirectly.
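
The indirect route can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not taken from the text: the toy records, the attribute names `group` and `district`, and the decision rule `blind_rule`, which never reads the sensitive attribute but relies on an attribute correlated with it.

```python
# Hypothetical toy data: "group" is the sensitive attribute and "district"
# is an attribute that happens to correlate with it.
def make_records(group, district, count):
    return [{"group": group, "district": district} for _ in range(count)]


records = (
    make_records("A", "north", 8) + make_records("A", "south", 2)
    + make_records("B", "north", 2) + make_records("B", "south", 8)
)


def blind_rule(record):
    """Illustrative decision rule that never looks at record["group"]."""
    return record["district"] == "north"


def acceptance_rates(records, rule):
    """Return the fraction of positive decisions per sensitive group."""
    totals, positives = {}, {}
    for record in records:
        group = record["group"]
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(rule(record))
    return {group: positives[group] / totals[group] for group in totals}


rates = acceptance_rates(records, blind_rule)
print(rates)
```

In this toy data the rule accepts 80% of group A and 20% of group B, although the sensitive attribute was never consulted. Removing the attribute from the input is therefore not enough to rule out indirect use.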
In Chapter 5 several techniques to detect discrimination in decision-making
records are proposed. The chapter sketches the idea of a discrimination “audit”,
aimed at post-factum identification of discriminatory context. Such audits will
be important in discrimination law enforcement. As explained in Chapter 8,
however, in practical applications it will be very hard to assess which part of the
discrimination results from an acceptable use of informative attributes, and which
part represents unjustified or illegal discrimination, i.e., discrimination that cannot
be justified by objective arguments or supported by a legal basis.
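
A minimal sketch of what one audit step might compute is given here. It is not the method of Chapter 5 or Chapter 8; the measure (a difference in acceptance rates), the toy decision records, and the names `rate_difference`, `rate_difference_by_stratum` and `programme` are all assumptions chosen for illustration. The sketch contrasts the overall difference between two groups with the difference inside each value of an informative attribute.

```python
from collections import defaultdict


def make_decisions(group, programme, total, accepted):
    """Build hypothetical decision records; the first `accepted` are positive."""
    return [
        {"group": group, "programme": programme, "accepted": i < accepted}
        for i in range(total)
    ]


def positive_rate(rows, group):
    """Fraction of positive decisions among the rows of one group."""
    selected = [row for row in rows if row["group"] == group]
    if not selected:
        raise ValueError(f"no records for group {group!r}")
    return sum(row["accepted"] for row in selected) / len(selected)


def rate_difference(rows, favoured, protected):
    """Acceptance rate of the favoured group minus that of the protected group."""
    return positive_rate(rows, favoured) - positive_rate(rows, protected)


def rate_difference_by_stratum(rows, favoured, protected, key):
    """Same difference, computed separately for each value of `key`."""
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    return {
        value: rate_difference(subset, favoured, protected)
        for value, subset in strata.items()
    }


# Toy data: programme "x" accepts 75% of applicants, programme "y" 25%,
# regardless of group, but the groups apply to the programmes unevenly.
decisions = (
    make_decisions("A", "x", 8, 6) + make_decisions("A", "y", 4, 1)
    + make_decisions("B", "x", 4, 3) + make_decisions("B", "y", 8, 2)
)

overall = rate_difference(decisions, "A", "B")
per_programme = rate_difference_by_stratum(decisions, "A", "B", "programme")
print(round(overall, 3), per_programme)
```

In this toy data the overall difference is about 0.17, yet within each programme the difference is zero, so the gap is fully accounted for by the programme applied to. Whether such an attribute counts as an acceptable explanation is exactly the assessment described above, and the computation itself cannot settle it.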
In addition to technology for detecting discrimination in post-factum data, tools
have to be developed that allow unbiased models to be learned. Several such
techniques are detailed in this book; for instance, in Chapters 12, 13, and 14. None
of these techniques, however, can guarantee that the model built will stand the test
of judicial trial. Unlike some of the technological solutions in anonymization and