Parameter optimized, vertical, nearest-neighbor-vote and boundary-based classification

Published: 01 December 2006

Abstract

In this paper, we describe a reliable, high-performance classification system that combines Nearest Neighbor Vote based classification and Local Decision Boundary based classification with an evolutionary algorithm for parameter optimization and a vertical data structure (the Predicate Tree, or P-tree) for processing efficiency.
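
As a rough illustration of the ideas named in the abstract, the sketch below pairs a nearest-neighbor vote with a small evolutionary loop that tunes per-feature weights. It is a minimal sketch under assumptions, not the authors' system: the function names (`knn_vote`, `accuracy`, `evolve_weights`), the weighted squared Euclidean distance, the Gaussian mutation, and the keep-the-better-half selection are all illustrative choices, and neither the P-tree data structure nor the Local Decision Boundary classifier is modeled.

```python
import random


def weighted_sq_dist(a, b, weights):
    # Squared Euclidean distance with one non-negative weight per feature.
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def knn_vote(train, query, weights, k=3):
    # train is a list of (features, label) pairs.
    neighbors = sorted(
        train, key=lambda row: weighted_sq_dist(row[0], query, weights)
    )[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    # Iterating labels in sorted order makes ties resolve to the smallest label.
    return max(sorted(votes), key=lambda label: votes[label])


def accuracy(train, held_out, weights, k=3):
    # Fraction of held-out samples whose voted label matches the true label.
    hits = sum(knn_vote(train, x, weights, k) == y for x, y in held_out)
    return hits / len(held_out)


def evolve_weights(train, held_out, n_features,
                   generations=30, pop_size=12, seed=0, k=3):
    # Hypothetical evolutionary loop: rank by held-out accuracy, keep the
    # better half unchanged, and refill with Gaussian-mutated copies.
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(
            key=lambda w: accuracy(train, held_out, w, k), reverse=True
        )
        parents = population[: pop_size // 2]
        children = [[max(0.0, g + rng.gauss(0.0, 0.1)) for g in p]
                    for p in parents]
        population = parents + children
    return max(population, key=lambda w: accuracy(train, held_out, w, k))
```

Because the better half of each generation is carried over unchanged, the best held-out accuracy in the population never decreases from one generation to the next. In practice the weights would be scored on data kept separate from the final test set.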

    Published In

    ACM SIGKDD Explorations Newsletter  Volume 8, Issue 2
    December 2006
    106 pages
    ISSN:1931-0145
    EISSN:1931-0153
    DOI:10.1145/1233321

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. boundary
    2. classification
    3. nearest neighbor
    4. predicate tree
    5. total variation
