Detection of phishing emails using data mining algorithms

Sami Smadi, Nauman Aslam, Li Zhang, Rafe Alasem, Alamgir Hossain

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

25 Citations (Scopus)

Abstract

This paper proposes an intelligent model for detecting phishing emails. The model relies on a preprocessing phase that extracts a set of features from different parts of each email. The extracted features are classified using the J48 classification algorithm. We experimented with a total of 23 features that have been used in the literature. Ten-fold cross-validation was applied for training, testing and validation. The primary aim of this paper is to improve the overall metrics of email classification by concentrating on the preprocessing phase, and to determine which algorithm performs best in this field. The results show the benefits of using our preprocessing phase to extract features from the dataset. The model achieved 98.87% accuracy with the random forest algorithm, which is the highest reported so far for an approved dataset. A set of experiments comparing ten different classification algorithms demonstrates their relative merits and capabilities.
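The pipeline described in the abstract (features extracted from different email parts, then ten-fold cross-validation) can be illustrated with a minimal Python sketch that uses only the standard library. The abstract does not list the 23 features, so the four features below, along with the names `extract_features` and `k_fold_indices`, are hypothetical placeholders chosen for illustration and are not the authors' feature set or implementation. The classifier itself is omitted; the resulting feature vectors would be passed to a learner such as Weka's J48.

```python
import random
import re
from email import message_from_string
from email.message import Message

# Illustrative patterns only; the paper's actual 23 features are not given in the abstract.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IP_URL_RE = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}", re.IGNORECASE)


def body_text(msg: Message) -> str:
    """Concatenate the decoded text of every text/* part of the message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)


def extract_features(raw_email: str) -> dict:
    """Map one raw email to a small dict of hypothetical header/body/URL features."""
    msg = message_from_string(raw_email)
    urls = URL_RE.findall(body_text(msg))
    subject = str(msg.get("Subject", ""))
    return {
        # Body feature: does any part declare HTML content?
        "has_html_part": int(any(p.get_content_type() == "text/html" for p in msg.walk())),
        # URL features: how many links, and does any point at a raw IP address?
        "num_urls": len(urls),
        "has_ip_url": int(any(IP_URL_RE.match(u) for u in urls)),
        # Header feature: does the subject begin with a reply prefix?
        "subject_has_reply_prefix": int(subject.lower().startswith("re:")),
    }


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx
```

In use, each email in a labelled corpus would be passed through `extract_features`, and a classifier would be trained on `train_idx` and scored on `test_idx` for each of the ten splits, with the ten accuracy values averaged.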
Original language: English
Title of host publication: Proceedings of the 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2015)
Place of Publication: Piscataway, NJ
Publisher: IEEE
Pages: 1-8
ISBN (Print): 9781467367448
Publication status: Published - Dec 2015
Event: 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2015) - Kathmandu
Duration: 1 Dec 2015 → …

Conference

Conference: 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2015)
Period: 1/12/15 → …
