DOI: 10.1145/2517312.2517320
research-article

Approaches to adversarial drift

Published: 04 November 2013

ABSTRACT

In this position paper, we argue that to be of practical interest, a machine-learning-based security system must engage with its human operators beyond feature engineering and instance labeling to address the challenge of drift in adversarial environments. We propose that designers of such systems broaden the classification goal into an explanatory goal, which would deepen the interaction with the system's operators.

To provide guidance, we advocate for an approach based on maintaining one classifier for each class of unwanted activity to be filtered. We also emphasize the necessity for the system to be responsive to the operators' constant curation of the training set. We show how this paradigm provides a property we call isolation and how it relates to classical causative attacks.
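As a rough illustration of the one-classifier-per-class idea, the following Python sketch keeps a separate toy detector for each family of unwanted activity and retrains only the detector whose training set an operator has curated. Everything here is an assumption made for illustration: the abstract does not name a learning algorithm, so a nearest-centroid rule stands in for it, and the names `FamilyDetector`, `PerFamilyFilter`, `curate`, and `explain` are hypothetical. The sketch also reflects only one plausible reading of isolation, namely that changes to one family's training data leave the other detectors untouched.

```python
from dataclasses import dataclass, field
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


@dataclass
class FamilyDetector:
    """Toy nearest-centroid detector for ONE class of unwanted activity.

    The nearest-centroid rule is a stand-in, not the paper's classifier.
    """
    name: str
    positives: list = field(default_factory=list)
    pos_center: tuple = None
    neg_center: tuple = None

    def fit(self, benign):
        # Train on this family's positives and the shared benign examples only.
        self.pos_center = centroid(self.positives)
        self.neg_center = centroid(benign)

    def fires(self, x):
        # Flag x when it is closer to this family's centroid than to benign.
        return dist(x, self.pos_center) < dist(x, self.neg_center)


class PerFamilyFilter:
    """Holds one independent detector per family (hypothetical design)."""

    def __init__(self, benign):
        self.benign = list(benign)
        self.detectors = {}

    def curate(self, family, positives):
        """Replace one family's curated positives and retrain only that detector."""
        detector = FamilyDetector(family, list(positives))
        detector.fit(self.benign)
        self.detectors[family] = detector

    def explain(self, x):
        """Return the families whose detector fires; an empty list means 'pass'."""
        return sorted(name for name, d in self.detectors.items() if d.fires(x))


flt = PerFamilyFilter(benign=[(0.0, 0.0), (1.0, 0.0)])
flt.curate("family_a", [(10.0, 0.0), (12.0, 0.0)])
flt.curate("family_b", [(0.0, 10.0), (0.0, 12.0)])
verdict = flt.explain((9.0, 1.0))
```

Because `explain` returns family names rather than a single yes/no label, an operator sees which detector was responsible for a decision, and re-curating `family_a` never rebuilds the `family_b` detector.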

In order to demonstrate the effects of drift on a binary classification task, we also report on two experiments using a previously unpublished malware data set where each instance is timestamped according to when it was seen.
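The abstract does not describe the experimental protocol, so the following is only a sketch of one common way timestamped data can be used to expose drift: train on the earliest instances and test on later ones, then compare against a shuffled split that ignores time. The `Instance` type, its field names, and both split functions are hypothetical, and the data below is synthetic placeholder input, not the paper's data set.

```python
import random
from typing import NamedTuple


class Instance(NamedTuple):
    first_seen: int   # time the sample was observed (hypothetical field)
    features: tuple
    label: int        # 1 = malicious, 0 = benign


def temporal_split(instances, train_fraction=0.5):
    """Train on the oldest instances, test on the newest ones."""
    ordered = sorted(instances, key=lambda inst: inst.first_seen)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def random_split(instances, train_fraction=0.5, seed=0):
    """Baseline split that ignores time, so future samples leak into training."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Synthetic placeholder data, deliberately out of chronological order.
data = [Instance(t, (float(t),), t % 2) for t in (5, 1, 4, 2, 3, 6)]
train, test = temporal_split(data, train_fraction=0.5)
```

A classifier evaluated with `temporal_split` only ever sees test instances newer than its training data, which is the situation a deployed filter faces; any gap between its score and the `random_split` score is one possible indicator of drift.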


      • Published in

        AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
        November 2013
        116 pages
        ISBN:9781450324885
        DOI:10.1145/2517312

        Copyright © 2013 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        AISec '13 paper acceptance rate: 10 of 17 submissions, 59%. Overall acceptance rate: 94 of 231 submissions, 41%.
