| Detection and prediction of distance-based outliers |
| Source
|
Symposium on Applied Computing
Proceedings of the 2005 ACM symposium on Applied computing
Santa Fe, New Mexico
SESSION: Data mining (DM)
Pages: 537 - 542
Year of Publication: 2005
ISBN:1-58113-964-0
|
ABSTRACT
In this paper we present an unsupervised distance-based outlier detection method designed to learn a model over the objects contained in a data set. The learned model, called a solving set, is a small subset of the data set that is used to classify new, unseen objects as outliers or not. We provide an algorithm that computes a solving set with sub-quadratic time requirements, and we give experimental evidence that the computed solving set is small and that the false positive rate, that is, the fraction of new objects misclassified as outliers when the solving set is used instead of the overall data set, is negligible.
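The idea of classifying new objects against a small subset instead of the full data set can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: the outlier score is taken to be the sum of the distances to the k nearest neighbours, an object is flagged when that score reaches a fixed threshold, and the subset is drawn at random as a stand-in for a solving set. The names `knn_weight`, `is_outlier`, `k` and `threshold` are hypothetical, and the sketch does not reproduce the paper's solving set algorithm. Under these assumptions, a score computed against any subset with at least k objects can never be lower than the score computed against the full data set, so the subset can only introduce false positives, which is the error the abstract measures.

```python
import heapq
import math
import random


def knn_weight(query, reference, k):
    """Sum of the distances from `query` to its k nearest points in `reference`.

    Assumed score definition; the abstract does not specify one.
    """
    dists = (math.dist(query, p) for p in reference)
    return sum(heapq.nsmallest(k, dists))


def is_outlier(query, reference, k, threshold):
    """Flag `query` when its score against `reference` reaches `threshold`."""
    return knn_weight(query, reference, k) >= threshold


random.seed(0)
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]

# Random subset used purely for illustration; it is NOT a solving set
# computed by the paper's algorithm.
subset = random.sample(data, 50)

k = 5
new_objects = [(0.1, -0.2), (6.0, 6.0)]
for q in new_objects:
    full_score = knn_weight(q, data, k)
    subset_score = knn_weight(q, subset, k)
    # Removing reference points can only push the k nearest neighbours
    # further away, so the subset score is an upper bound on the full score.
    print(q, round(full_score, 3), round(subset_score, 3))
```

A random subset generally overestimates scores by a wide margin, so it would misclassify many ordinary objects; the abstract's claim is that a carefully chosen solving set keeps that overestimate small enough for the false positive rate to be negligible.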