TTPDrill: Automatic and Accurate Extraction of Threat Actions from Unstructured Text of CTI Sources

Authors:

Mohiuddin Ahmed,

Xi Niu

ACSAC '17: Proceedings of the 33rd Annual Computer Security Applications Conference

Pages 103 - 115

https://doi.org/10.1145/3134600.3134646

Published: 04 December 2017

Abstract

With the rapid growth of cyber attacks, sharing cyber threat intelligence (CTI) has become essential for identifying and responding to attacks in a timely and cost-effective manner. However, given the lack of standard languages and automated analytics for cyber threat information, analyzing the complex, unstructured text of CTI reports is extremely time- and labor-intensive. Unless this challenge is addressed, CTI sharing will remain highly impractical, and attack uncertainty and time-to-defend will continue to increase.

Considering the high volume and speed of CTI sharing, our aim in this paper is to develop automated, context-aware analytics of cyber threat intelligence that accurately learn attack patterns (TTPs) from commonly available CTI sources, so that cyber defense actions can be implemented in a timely manner. Our paper makes three key contributions. First, it presents a novel threat-action ontology that is sufficiently rich to capture the specifications and context of malicious actions. Second, it develops a novel text mining approach that combines enhanced Natural Language Processing (NLP) and Information Retrieval (IR) techniques to extract threat actions based on semantic (rather than syntactic) relationships. Third, our CTI analysis can construct a complete attack pattern by mapping each threat action to the appropriate techniques, tactics, and kill chain phases, and translating it into any threat-sharing standard, such as STIX 2.1. Our CTI analytic techniques were implemented in a tool called TTPDrill and evaluated on a randomly selected set of Symantec Threat Reports. Our evaluation shows that TTPDrill achieves more than 82% precision and recall across a variety of measures, which is very reasonable for this problem domain.
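The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the TTPDrill implementation. It assumes a threat action has already been extracted as a short phrase, and it ranks the entries of a toy ontology with Okapi BM25 (the Lucene-style IDF variant), chosen here as one common IR scoring function. The `ONTOLOGY` entries, their wording, the phase labels, and the function names `bm25_scores` and `map_threat_action` are all hypothetical. The sketch matches tokens lexically and therefore omits the NLP extraction and the semantic matching that the paper emphasizes.

```python
import math
from collections import Counter

# Hypothetical mini-ontology: names, wording, and phase labels are
# illustrative assumptions, not entries from the paper's ontology.
ONTOLOGY = [
    {"technique": "Registry Run Keys", "tactic": "Persistence",
     "kill_chain_phase": "Installation",
     "text": "add registry run key entry to start program at boot"},
    {"technique": "Data Encrypted", "tactic": "Exfiltration",
     "kill_chain_phase": "Actions on Objectives",
     "text": "encrypt collected data before sending to remote server"},
    {"technique": "Process Discovery", "tactic": "Discovery",
     "kill_chain_phase": "Actions on Objectives",
     "text": "list running processes on the system"},
]


def tokenize(text):
    """Lowercase whitespace tokenizer (no stemming or synonym expansion)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def map_threat_action(action):
    """Map a threat-action phrase to its best-scoring ontology entry."""
    scores = bm25_scores(action, [e["text"] for e in ONTOLOGY])
    best = max(range(len(ONTOLOGY)), key=scores.__getitem__)
    if scores[best] == 0.0:
        return None  # no lexical overlap with any ontology entry
    entry = ONTOLOGY[best]
    return {k: entry[k] for k in ("technique", "tactic", "kill_chain_phase")}
```

Calling `map_threat_action("modify registry key")` returns the hypothetical "Registry Run Keys" entry together with its tactic and kill chain phase, while a phrase with no overlapping tokens returns `None`. A dictionary of this shape is the kind of record that could then be serialized into a threat-sharing format.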

Index Terms

1. Social and professional topics
  1. Computing / technology policy

Published In

ACSAC '17: Proceedings of the 33rd Annual Computer Security Applications Conference

December 2017

618 pages

ISBN:9781450353458

DOI:10.1145/3134600

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ACSAC 2017

ACSAC 2017: 2017 Annual Computer Security Applications Conference

December 4 - 8, 2017

Orlando, FL, USA

Acceptance Rates

Overall Acceptance Rate 104 of 497 submissions, 21%
