This paper proposes an intelligent model for detecting phishing emails. The model relies on a preprocessing phase that extracts a set of features from different parts of an email, and the extracted features are classified using the J48 classification algorithm. We experimented with a total of 23 features that have been used in the literature. Ten-fold cross-validation was applied for training, testing and validation. The primary aim of this paper is to improve the overall metric values of email classification by focusing on the preprocessing phase, and to determine the best algorithm for this task. The results show the benefit of using our preprocessing phase to extract features from the dataset. The model achieved 98.87% accuracy with the random forest algorithm, which is the highest registered so far for an approved dataset. A set of experiments comparing ten different classification algorithms demonstrates their merits and capabilities.
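The two mechanical steps named in the abstract, extracting features from different email parts and splitting the data for ten-fold cross-validation, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the four features below (`num_links`, `has_ip_url`, `has_html`, `subject_has_verify`) are hypothetical stand-ins chosen for illustration, not the 23 features used in the experiments, and the helper names `extract_features` and `ten_fold_indices` are assumptions. The fold split is a plain interleaved partition without shuffling or stratification, and the classifier itself is left out.

```python
import re
from email import message_from_string

# Hypothetical patterns: any http(s) URL, and URLs whose host is a raw IPv4 address.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IP_URL_RE = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}", re.IGNORECASE)


def extract_features(raw_email):
    """Return a small, illustrative feature dict for one raw email string."""
    msg = message_from_string(raw_email)
    body_parts = []
    has_html = False
    for part in msg.walk():
        # Multipart containers hold no text of their own; only leaf parts do.
        if part.get_content_maintype() == "multipart":
            continue
        if part.get_content_type() == "text/html":
            has_html = True
        payload = part.get_payload(decode=True)
        if payload:
            body_parts.append(payload.decode("utf-8", errors="replace"))
    body = "\n".join(body_parts)
    subject = (msg.get("Subject") or "").lower()
    return {
        "num_links": len(URL_RE.findall(body)),             # body feature
        "has_ip_url": int(bool(IP_URL_RE.search(body))),    # URL feature
        "has_html": int(has_html),                          # structure feature
        "subject_has_verify": int("verify" in subject),     # header feature
    }


def ten_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx
```

In use, each email would be mapped to a feature vector with `extract_features`, and a classifier would be trained on `train_idx` and scored on `test_idx` for each of the ten folds, with the reported metric averaged across folds.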
|Title of host publication||Proceedings of the 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2015)|
|Place of Publication||Piscataway, NJ|
|Publication status||Published - Dec 2015|
|Event||9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2015) - Kathmandu|
Duration: 1 Dec 2015 → …