Advanced Phishing Filter Using Autoencoder and Denoising Autoencoder

Authors:

Bouabid El Ouahidi

BDIOT '17: Proceedings of the International Conference on Big Data and Internet of Thing

Pages 125 - 129

https://doi.org/10.1145/3175684.3175690

Published: 20 December 2017

Abstract

Phishing is an attempt to obtain sensitive information, such as usernames, passwords, and credit card details (and, indirectly, money), for malicious reasons, by posing as a trustworthy entity in an electronic communication [1]. Attackers often use emails as phishing tools to obtain the personal data of legitimate users: they send emails with seemingly authentic identities and legitimate content, but also with malicious URLs that help them steal the recipient's data. High-dimensional phishing data contains a large number of redundant features that significantly increase the classification error. In addition, classification time grows with the number of features. Extracting complex features from phishing emails therefore requires determining which features are relevant and fundamental to phishing detection. The dominant phishing detection approaches are based on machine learning techniques that rely on manual feature engineering, which is time-consuming. Deep learning is a promising alternative to these traditional methods: its main idea is to learn complex features from the data with minimal external contribution [2]. In this paper, we propose a new phishing detection and prevention approach. First, it builds on our previous spam filter [3] to classify the textual content of an email. Second, it uses an Autoencoder and a Denoising Autoencoder (DAE) to extract a relevant and robust feature set from the URL (the one to which the website is actually directed). The feature space can thus be reduced considerably, which decreases the phishing detection time.
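The abstract gives no architecture or training details, so the following is only a minimal sketch of the general idea: an autoencoder compresses a redundant feature vector into a shorter code, and a denoising autoencoder does the same while being trained on corrupted inputs. Everything specific here is an assumption made for illustration and is not taken from the paper: the single hidden layer, the sigmoid units, the masking noise, the squared-error loss, the hyperparameters, the function names `train_autoencoder` and `encode`, and the synthetic binary data standing in for URL features.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, noise_level=0.0, epochs=500, lr=1.0, seed=0):
    """Fit a one-hidden-layer autoencoder by full-batch gradient descent.

    noise_level == 0 gives a plain autoencoder. noise_level > 0 gives a
    denoising autoencoder: roughly that fraction of input entries is zeroed
    at random each epoch, while the target stays the clean input.
    (Hypothetical helper, not the paper's implementation.)
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    params = {
        "W1": rng.normal(0.0, 0.1, (d, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, d)), "b2": np.zeros(d),
    }
    losses = []
    for _ in range(epochs):
        keep = rng.random(X.shape) >= noise_level          # masking noise
        X_in = X * keep                                     # corrupted input
        H = sigmoid(X_in @ params["W1"] + params["b1"])     # encoder
        X_hat = sigmoid(H @ params["W2"] + params["b2"])    # decoder
        err = X_hat - X                                     # scored against the clean input
        losses.append(float(np.mean(err ** 2)))

        # Backpropagate the squared reconstruction error through both layers.
        d_out = err * X_hat * (1.0 - X_hat) / n
        d_hid = (d_out @ params["W2"].T) * H * (1.0 - H)
        params["W2"] -= lr * (H.T @ d_out)
        params["b2"] -= lr * d_out.sum(axis=0)
        params["W1"] -= lr * (X_in.T @ d_hid)
        params["b1"] -= lr * d_hid.sum(axis=0)
    return params, losses


def encode(params, X):
    """Map feature vectors to the lower-dimensional hidden code."""
    return sigmoid(X @ params["W1"] + params["b1"])


# Synthetic stand-in for URL features (an assumption, not real data):
# 3 hidden binary factors, each copied into 4 columns, so the 12 observed
# features are highly redundant.
rng = np.random.default_rng(42)
factors = (rng.random((200, 3)) < 0.2).astype(float)
X = np.repeat(factors, 4, axis=1)

ae_params, ae_losses = train_autoencoder(X, n_hidden=3)
dae_params, dae_losses = train_autoencoder(X, n_hidden=3, noise_level=0.3)

codes = encode(dae_params, X)
print("feature dimension:", X.shape[1], "->", codes.shape[1])
print("AE  reconstruction MSE: %.3f -> %.3f" % (ae_losses[0], ae_losses[-1]))
print("DAE reconstruction MSE: %.3f -> %.3f" % (dae_losses[0], dae_losses[-1]))
```

In a pipeline of the kind the abstract describes, the reduced `codes` rather than the raw features would be passed to a downstream classifier; that classifier is not specified in the abstract and is not shown here.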

References

[1]

Ramzan, Zulfikar. Phishing attacks and countermeasures. Handbook of Information and Communication Security. Springer, 2010. ISBN 9783642041174.

[2]

Bengio, Yoshua. Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning, Vol. 2, No. 1, pp. 1--127, 2009.

[3]

Samira Douzi, Meryem Amar, Bouabid El Ouahidi, Hicham Laanaya. Towards a New Spam Filter Based on PV-DM (Paragraph Vector-Distributed Memory Approach). Procedia Computer Science, Volume 110, 2017, pp. 486--491.

[4]

E. El-Alfy, R. Abdel-Aal. Using GMDH-based networks for improved spam detection and e-mail feature analysis. Applied Soft Computing 11 (1) (2011) 477--488.

[5]

Ian Fette, Norman Sadeh, Anthony Tomasic. Learning to Detect Phishing Emails. International World Wide Web Conference, 2007, pp. 649--656.

[6]

APWG. Phishing Activity Trends Report, 4th Quarter 2016. http://www.apwg.org/resources/apwg-reports/.

[7]

Ramzan, Z., & Wüest, C. Phishing Attacks: Analyzing Trends in 2006. In Fourth Conference on Email and Anti-Spam, Mountain View: Citeseer, 2007.

[8]

Aaron, G. The state of phishing. Computer Fraud & Security. 2010 (6) (2010) 5--8.

[9]

S. Shivaji, E.J. Whitehead, R. Akella, K. Sunghun. Reducing features to improve bug prediction. IEEE/ACM International Conference on Automated Software Engineering (2009) 600--604.

[10]

El-Khatib, K. Impact of feature reduction on the efficiency of wireless intrusion detection systems. IEEE Transactions on Parallel and Distributed Systems 21 (8) (2010) 1143--1149.

[11]

Ram B. Basnet et al. Learning to Detect Phishing URLs. International Journal of Research in Engineering and Technology, Volume 03, Issue 06, Jun-2014.

[12]

Sheng, S., Wardman, B., Warner, G., Cranor, L., Hong, J. and Zhang, C. An empirical analysis of phishing blacklists. In Proceedings of the CEAS'09, 2009.

[13]

R. Basnet, S. Mukkamala, A. Sung. Detection of phishing attacks: a machine learning approach. Soft Computing Applications in Industry (2008) 373--383.

[14]

C.K. Olivo, et al. Obtaining the threat model for e-mail phishing. Appl. Soft Comput. J. (2011).

[15]

J. Chen, C. Guo. Online detection and prevention of phishing attacks. Communications and Networking in China (2006) 19--21.

[16]

J. Yearwood, M. Mammadov, A. Banerjee. Profiling phishing e-mails based on hyperlink information. International Conference on Advances in Social Networks Analysis and Mining (2010) 120--127.

[17]

Biju Issac, Raymond Chiong and Seibu Mary Jacob. Analysis of Phishing Attacks and Countermeasures. www.arxiv.org, 2006.

[18]

Dhanalakshmi Ranganayakulu, Chellappan C. Detecting Malicious URLs in E-mail - An Implementation. 2013 AASRI Conference on Intelligent Systems and Control, Procedia 4 (2013) 125--131.

[19]

Santhana Lakshmi V, Vijaya MS. Efficient prediction of phishing websites using supervised learning algorithms. International Conference on Communication Technology and System Design 2011, Procedia Engineering 30 (2012) 798--805.

[20]

Maher Aburrous, M.A. Hossain, Keshav Dahal and Fadi Thabtah. Experimental Case Studies for Investigating E-Banking Phishing Techniques and Attack Strategies. Cognitive Computing, Vol. 2, pp. 242--253, 2010.

[21]

D. Cook, V. Gurbani, M. Daniluk. Phishwish: a stateless phishing filter using minimal rules. Lecture Notes in Computer Science (2008) 182--186.

[22]

Y. Zhang, J. Hong, L. Cranor. CANTINA: a content-based approach to detecting phishing web sites. In Proc. 16th Int. Conf. World Wide Web, WWW '07, Banff, Alberta, Canada, 2007, pp. 639--648.

[23]

Fuzhen Zhuang et al. Representation Learning via Semi-supervised Autoencoder for Multi-task Learning. IEEE International Conference on Data Mining, 2015.

[24]

Ozan İrsoy, Ethem Alpaydın. Unsupervised Feature Extraction with Autoencoder Trees. Neurocomputing (2017).

[25]

Shao Haidong, Jiang Hongkai, Zhao Huiwei, Wang Fuan. A novel deep autoencoder feature learning method for rotating machinery fault diagnosis. Mechanical Systems and Signal Processing 95 (2017) 187--204.

[26]

Jun Deng, Zixing Zhang, Florian Eyben, and Björn Schuller. Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition. IEEE Signal Processing Letters, Vol. 21, No. 9, September 2014.

[27]

Pascal Vincent et al. Extracting and Composing Robust Features with Denoising Autoencoders. Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 2008.

Index Terms

Advanced Phishing Filter Using Autoencoder and Denoising Autoencoder
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
  2. Machine learning

Published In

BDIOT '17: Proceedings of the International Conference on Big Data and Internet of Thing

December 2017

251 pages

ISBN:9781450354301

DOI:10.1145/3175684

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

BDIOT2017

BDIOT2017: International Conference on Big Data and Internet of Thing

December 20 - 22, 2017

London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 75 of 136 submissions, 55%

Article Metrics

Total Citations: 9
Total Downloads: 324
Downloads (Last 12 months): 9
Downloads (Last 6 weeks): 0

Reflects downloads up to 02 Mar 2025

Cited By

Remmide M, Boumahdi F, Boustia N (2024). Toward a Hybrid Approach Combining Deep Learning and Case-Based Reasoning for Phishing Email Detection. International Journal on Artificial Intelligence Tools. Online publication date: 28-Jun-2024.
https://doi.org/10.1142/S0218213024500155
Sakazi I, Grolman E, Elovici Y, Shabtai A (2024). STFL: Utilizing a Semi-Supervised, Transfer-Learning, Federated-Learning Approach to Detect Phishing URL Attacks. 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-10. Online publication date: 30-Jun-2024.
https://doi.org/10.1109/IJCNN60899.2024.10650184
Pejić-Bach M, Jajić I, Kamenjarska T (2023). A Bibliometric Analysis of Phishing in the Big Data Era. Procedia Computer Science 219:C, pp. 91-98. Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1016/j.procs.2023.01.268
Gopal S, Poongodi C, Nanthiya D, Kirubakaran T, Kulavishnusaravanan B, Logeshwar D (2023). Autoencoder-Based Architecture for Identification and Mitigating Phishing URL Attack in IoT Using DNN. Journal of The Institution of Engineers (India): Series B 104:6, pp. 1227-1240. Online publication date: 31-Oct-2023.
https://doi.org/10.1007/s40031-023-00934-8
Remmide M, Boumahdi F, Boustia N (2022). Phishing Email Detection Using Bi-GRU-CNN Model. Proceedings of the International Conference on Applied CyberSecurity (ACS) 2021, pp. 71-77. Online publication date: 2-Feb-2022.
https://doi.org/10.1007/978-3-030-95918-0_8
Hernández Dominguez A, Baluja García W (2021). Updated Analysis of Detection Methods for Phishing Attacks. Futuristic Trends in Network and Communication Technologies, pp. 56-67. Online publication date: 31-Mar-2021.
https://doi.org/10.1007/978-981-16-1480-4_5
Feng J, Zou L, Ye O, Han J (2020). Web2Vec: Phishing Webpage Detection Method Based on Multidimensional Features Driven by Deep Learning. IEEE Access 8, pp. 221214-221224. Online publication date: 2020.
https://doi.org/10.1109/ACCESS.2020.3043188
Karim A, Azam S, Shanmugam B, Kannoorpatti K (2020). Efficient Clustering of Emails Into Spam and Ham: The Foundational Study of a Comprehensive Unsupervised Framework. IEEE Access 8, pp. 154759-154788. Online publication date: 2020.
https://doi.org/10.1109/ACCESS.2020.3017082
Amar M, EL Ouahidi B (2020). A Weighted LSTM Deep Learning for Intrusion Detection. Advanced Communication Systems and Information Security, pp. 170-179. Online publication date: 6-Nov-2020.
https://doi.org/10.1007/978-3-030-61143-9_14
Benchaji I, Douzi S, El Ouahidi B (2019). Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection. Smart Data and Computational Intelligence, pp. 220-229. Online publication date: 1-Mar-2019.
https://doi.org/10.1007/978-3-030-11914-0_24
