research-article

Approaches to adversarial drift

Authors:
Alex Kantchelian, University of California at Berkeley, Berkeley, CA, USA
Sadia Afroz, Drexel University, Philadelphia, PA, USA
Ling Huang, Intel Labs, Berkeley, CA, USA
Aylin Caliskan Islam, Drexel University, Philadelphia, PA, USA
Brad Miller, University of California at Berkeley, Berkeley, CA, USA
Michael Carl Tschantz, University of California at Berkeley, Berkeley, CA, USA
Rachel Greenstadt, Drexel University, Philadelphia, PA, USA
Anthony D. Joseph, University of California at Berkeley, Berkeley, CA, USA
J. D. Tygar, University of California at Berkeley, Berkeley, CA, USA

AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security, November 2013, Pages 99–110
https://doi.org/10.1145/2517312.2517320

Published: 4 November 2013

ABSTRACT

In this position paper, we argue that to be of practical interest, a machine-learning-based security system must engage with its human operators beyond feature engineering and instance labeling to address the challenge of drift in adversarial environments. We propose that designers of such systems broaden the classification goal into an explanatory goal, which would deepen the interaction with the system's operators.

To provide guidance, we advocate an approach based on maintaining one classifier for each class of unwanted activity to be filtered. We also emphasize the need for the system to be responsive to the operators' constant curation of the training set. We show how this paradigm provides a property we call isolation and how it relates to classical causative attacks.
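As an illustration only, the one-classifier-per-class idea can be sketched in a few lines of Python. Everything here is an assumption made for the example rather than the authors' design: the token-weight scoring rule, the names `FamilyDetector` and `PerFamilyFilter`, and the toy samples are all hypothetical. The abstract does not define isolation, so the sketch assumes one plausible reading: re-curating the training set of one family rebuilds only that family's detector.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FamilyDetector:
    """Toy detector for a single family of unwanted activity (hypothetical)."""
    family: str
    threshold: float = 0.5
    weights: dict = field(default_factory=dict)

    def fit(self, positives, benign):
        # Weight each token by how much more often it occurs in this
        # family's samples than in benign samples.
        pos = Counter(t for s in positives for t in set(s))
        neg = Counter(t for s in benign for t in set(s))
        self.weights = {
            t: pos[t] / len(positives) - neg[t] / max(len(benign), 1)
            for t in pos
        }
        return self

    def score(self, sample):
        tokens = set(sample)
        if not tokens:
            return 0.0
        return sum(self.weights.get(t, 0.0) for t in tokens) / len(tokens)

    def predict(self, sample):
        return self.score(sample) >= self.threshold


class PerFamilyFilter:
    """Keeps one independent detector per family of unwanted activity."""

    def __init__(self):
        self.detectors = {}

    def retrain_family(self, family, positives, benign):
        # Only this family's detector is rebuilt; the others are untouched.
        self.detectors[family] = FamilyDetector(family).fit(positives, benign)

    def explain(self, sample):
        # Names of the families whose detectors fire on this sample.
        return sorted(f for f, d in self.detectors.items() if d.predict(sample))

    def is_unwanted(self, sample):
        return bool(self.explain(sample))


# Toy, made-up behavior traces (lists of tokens).
benign = [["open", "read", "close"], ["open", "write", "close"]]
ransom = [["open", "encrypt", "delete"], ["encrypt", "delete", "connect"]]
spam = [["connect", "send", "send_bulk"], ["send", "send_bulk", "close"]]

filt = PerFamilyFilter()
filt.retrain_family("ransom", ransom, benign)
filt.retrain_family("spam", spam, benign)
print(filt.explain(["encrypt", "delete", "read"]))
print(filt.explain(["open", "read", "close"]))
```

Because the filter reports which detector fired instead of a bare verdict, an operator sees a family name alongside each decision, which is one simple way to read the explanatory goal described above.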

To demonstrate the effects of drift on a binary classification task, we also report on two experiments using a previously unpublished malware data set in which each instance is timestamped according to when it was seen.
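The abstract does not describe the experimental protocol, so the following is only a hedged sketch of why per-instance timestamps matter when measuring drift. It uses synthetic data, not the paper's malware data set, and a deliberately simple token-memorizing model; the function names and the `seen`, `tokens`, and `label` fields are hypothetical. It compares a split that respects time (train on the past, test on the future) with one that mixes time periods.

```python
def temporal_split(instances, cutoff):
    """Train on everything seen before `cutoff`, test on the rest."""
    train = [x for x in instances if x["seen"] < cutoff]
    test = [x for x in instances if x["seen"] >= cutoff]
    return train, test


def mixed_split(instances):
    """Time-agnostic baseline: even days for training, odd days for testing."""
    train = [x for x in instances if x["seen"] % 2 == 0]
    test = [x for x in instances if x["seen"] % 2 == 1]
    return train, test


def train_token_model(train):
    """Toy model: tokens that occur only in malicious training instances."""
    malicious, benign = set(), set()
    for x in train:
        (malicious if x["label"] else benign).update(x["tokens"])
    return malicious - benign


def accuracy(model, test):
    # An instance is flagged when it contains any memorized malicious token.
    hits = sum(bool(model & set(x["tokens"])) == x["label"] for x in test)
    return hits / len(test)


# Synthetic stream: the malicious variant changes on day 5 (simulated drift).
instances = []
for day in range(10):
    variant = "packer_v1" if day < 5 else "packer_v2"
    instances.append({"seen": day, "tokens": [variant, "net"], "label": True})
    instances.append({"seen": day, "tokens": ["installer", "net"], "label": False})

past, future = temporal_split(instances, cutoff=5)
even, odd = mixed_split(instances)
print("time-respecting split:", accuracy(train_token_model(past), future))
print("time-mixed split:     ", accuracy(train_token_model(even), odd))
```

On this toy stream the time-mixed split lets the model see the later variant during training, so it overstates how well the model would have performed on instances that had not yet appeared at training time.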

Index Terms

1. Computing methodologies
  1. Machine learning
2. Security and privacy
  1. Systems security
    1. Operating systems security

Published in
AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
November 2013, 116 pages
ISBN: 9781450324885
DOI: 10.1145/2517312
General Chair: Ahmad-Reza Sadeghi, TU Darmstadt, CASED, Intel ICRI-SC, Germany
Program Chairs: Blaine Nelson, University of Potsdam, Germany; Christos Dimitrakakis, Chalmers University of Technology, Sweden; Elaine Shi, University of Maryland, College Park, MD, USA
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
adversarial machine learning
concept drift
malware classification
Acceptance Rates
AISec '13 paper acceptance rate: 10 of 17 submissions, 59%. Overall acceptance rate: 94 of 231 submissions, 41%.