ABSTRACT
Machine learning has become a valuable tool for detecting and preventing malicious activity. However, as more applications employ machine learning techniques in adversarial decision-making situations, increasingly powerful attacks become possible against machine learning systems. In this paper, we present three broad research directions towards the end of developing truly secure learning. First, we suggest that finding bounds on adversarial influence is important to understand the limits of what an attacker can and cannot do to a learning system. Second, we investigate the value of adversarial capabilities: the success of an attack depends largely on what types of information and influence the attacker has. Finally, we propose directions in technologies for secure learning and suggest lines of investigation into secure techniques for learning in adversarial environments. We intend this paper to foster discussion about the security of machine learning, and we believe that the research directions we propose represent the most important directions to pursue in the quest for secure learning.
Index Terms
- Open problems in the security of learning
Recommendations
Adversarial Machine Learning Attacks and Defense Methods in the Cyber Security Domain
In recent years, machine learning algorithms, and more specifically deep learning algorithms, have been widely used in many fields, including cyber security. However, machine learning systems are vulnerable to adversarial attacks, and this limits the ...
Adversarial machine learning
AISec '11: Proceedings of the 4th ACM workshop on Security and artificial intelligence
In this paper (expanded from an invited talk at AISec 2010), we discuss an emerging field of study: adversarial machine learning---the study of effective machine learning techniques against an adversarial opponent. In this paper, we: give a taxonomy for ...
Can machine learning be secure?
ASIACCS '06: Proceedings of the 2006 ACM Symposium on Information, computer and communications security
Machine learning systems offer unparalleled flexibility in dealing with evolving input in a variety of applications, such as intrusion detection systems and spam e-mail filtering. However, machine learning algorithms themselves can be a target of attack ...