Discrimination-aware data mining
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 560-568
Year of Publication: 2008
ISBN:978-1-60558-193-4
Authors
Dino Pedreschi  Università di Pisa, Pisa, Italy
Salvatore Ruggieri  Università di Pisa, Pisa, Italy
Franco Turini  Università di Pisa, Pisa, Italy
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1401959

ABSTRACT

In the context of civil rights law, discrimination refers to unfair or unequal treatment of people based on membership in a category or a minority, without regard to individual merit. Rules extracted from databases by data mining techniques, such as classification or association rules, can be discriminatory in this sense when used for decision tasks such as benefit or credit approval. In this paper, the notion of discriminatory classification rules is introduced and studied. Providing a guarantee of non-discrimination is shown to be a nontrivial task. A naive approach, such as removing all discriminatory attributes, is shown to be insufficient when other background knowledge is available. Our approach leads to a precise formulation of the redlining problem, along with a formal result relating discriminatory rules to apparently safe ones by means of background knowledge. An empirical assessment of the results on the German credit dataset is also provided.
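
To make the redlining concern concrete, the following is a minimal Python sketch, not the authors' formulation. The abstract does not name a measure, so the sketch assumes a simple ratio of rule confidences as a stand-in; the helper names (`confidence`, `lift_ratio`), the attribute names (`group`, `zip`, `credit`), and the ten toy records are all invented for illustration and are not drawn from the German credit dataset.

```python
from typing import Dict, List

Record = Dict[str, str]


def confidence(records: List[Record], premise: Record, outcome: Record) -> float:
    """Fraction of the records matching `premise` that also match `outcome`."""
    covered = [r for r in records if all(r.get(k) == v for k, v in premise.items())]
    if not covered:
        raise ValueError("premise matches no records")
    hits = [r for r in covered if all(r.get(k) == v for k, v in outcome.items())]
    return len(hits) / len(covered)


def lift_ratio(records: List[Record], context: Record, extra: Record,
               outcome: Record) -> float:
    """Assumed measure: conf(context + extra -> outcome) / conf(context -> outcome)."""
    base = confidence(records, context, outcome)
    if base == 0:
        raise ValueError("outcome never occurs in the given context")
    return confidence(records, {**context, **extra}, outcome) / base


# Invented toy data: "group" plays the role of a protected attribute,
# "zip" is an apparently neutral attribute that happens to track it.
TOY_RECORDS: List[Record] = (
    [{"group": "B", "zip": "Z1", "credit": "no"}] * 4
    + [{"group": "A", "zip": "Z1", "credit": "yes"}]
    + [{"group": "B", "zip": "Z2", "credit": "yes"}]
    + [{"group": "A", "zip": "Z2", "credit": "yes"}] * 3
    + [{"group": "A", "zip": "Z2", "credit": "no"}]
)

DENIED = {"credit": "no"}

# How much more often is credit denied for group B than overall?
group_lift = lift_ratio(TOY_RECORDS, {}, {"group": "B"}, DENIED)
# The same question for a rule that never mentions the protected attribute.
zip_lift = lift_ratio(TOY_RECORDS, {}, {"zip": "Z1"}, DENIED)
# Background knowledge: how strongly does the ZIP code indicate the group?
proxy_strength = confidence(TOY_RECORDS, {"zip": "Z1"}, {"group": "B"})

print(f"lift of group=B on denial: {group_lift:.2f}")
print(f"lift of zip=Z1 on denial:  {zip_lift:.2f}")
print(f"conf(zip=Z1 -> group=B):   {proxy_strength:.2f}")
```

On this toy data, the rule over `zip` raises the denial rate as much as the rule over `group` does, and the ZIP code identifies the group in four out of five records. Dropping the `group` column would therefore leave the disparity in place, which is the situation the abstract describes when background knowledge is available.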

