DOI: 10.1145/3175684.3175723
research-article

Detecting and Adapting to Concept Drift in Continually Evolving Stochastic Processes

Published: 20 December 2017

ABSTRACT

Many real-world stochastic processes are non-stationary, which means that the probability distribution that generates data samples varies over time. In the context of machine learning, this phenomenon is known as concept drift. Machine learning models must be able to adapt to concept drift in order to prevent degradation in accuracy. In this paper, we present two algorithms for drift detection and adaptation.

Drift is measured by continuously tracking a difference metric between probability distributions estimated from two sample windows preceding a time point. High values of the difference metric indicate that concept drift has occurred and that the model must be adapted. Adaptation is done by training a new model for the drifted process and adding it to an ensemble of models. Previously trained models are retained, and their weights in the ensemble are adjusted to reflect their similarity to the current probability distribution of the process. Experiments on simulated drift scenarios as well as real-world datasets show that our algorithms detect drift with high accuracy, and that adaptation results in improved model accuracy.
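The two-window idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the difference metric, the density estimator, or the weighting rule, so everything specific here is an assumption made for illustration: fixed-bin histograms as the estimator, total variation distance as the metric, similarity-proportional ensemble weights, and the hypothetical names and defaults `detect_drift`, `ensemble_weights`, `window=200`, and `threshold=0.35`. It is not the paper's algorithm.

```python
import random


def histogram(samples, lo, hi, bins):
    """Estimate a discrete distribution from samples with a fixed-bin histogram."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = int((x - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
    return [c / len(samples) for c in counts]


def total_variation(p, q):
    """Total variation distance between two discrete distributions (assumed metric)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def detect_drift(stream, window=200, threshold=0.35, lo=-5.0, hi=10.0, bins=15):
    """Return every time index t at which the two windows preceding t differ
    by more than `threshold` (hypothetical default values)."""
    alarms = []
    for t in range(2 * window, len(stream) + 1):
        older = histogram(stream[t - 2 * window:t - window], lo, hi, bins)
        recent = histogram(stream[t - window:t], lo, hi, bins)
        if total_variation(older, recent) > threshold:
            alarms.append(t)
    return alarms


def ensemble_weights(concept_dists, current_dist):
    """Weight each stored model by the similarity (1 - distance) between the
    distribution it was trained on and the current one (assumed rule)."""
    sims = [1.0 - total_variation(p, current_dist) for p in concept_dists]
    total = sum(sims)
    if total == 0.0:
        return [1.0 / len(sims)] * len(sims)  # no model is similar: fall back to uniform
    return [s / total for s in sims]


# Simulated abrupt drift: the mean jumps from 0 to 4 at t = 1000.
rng = random.Random(0)
stream = [rng.gauss(0.0, 1.0) for _ in range(1000)]
stream += [rng.gauss(4.0, 1.0) for _ in range(1000)]

alarms = detect_drift(stream)
print("first alarm at t =", alarms[0] if alarms else None)
```

In this sketch the first alarm lags the true change point, because the recent window has to fill with enough post-drift samples before the distance crosses the threshold. A smaller window would shorten the lag at the cost of noisier distribution estimates.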

Published in

BDIOT '17: Proceedings of the International Conference on Big Data and Internet of Thing
December 2017
251 pages
ISBN: 9781450354301
DOI: 10.1145/3175684

      Copyright © 2017 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 75 of 136 submissions, 55%
