Based on URL Feature Extraction Identify Malicious Website Using Machine Learning Techniques

Khushbu Digesh Vara; Vaibhav Sudhir Dimble; Mansi Mohan Yadav; Aarti Ashok Thorat

doi:https://doi.org/10.47001/IRJIET/2022.603019

Abstract

A phishing attack is one of the simplest ways to obtain sensitive information from users. Phishers aim to acquire critical information such as usernames, passwords, bank account details, and other personal data. As Internet technology develops, network security faces a growing range of threats; in particular, attackers can spread malicious uniform resource locators (URLs) to carry out attacks such as phishing and spam. Research on malicious URL detection is therefore important for defending against these attacks. Some existing detection methods are easy for attackers to evade, and cyber security professionals are looking for reliable and stable techniques for detecting phishing websites. To address these problems, we design a malicious URL detection model based on machine learning techniques. The proposed system detects phishing URLs by extracting and analysing various features of legitimate and phishing URLs. Decision tree, random forest, and support vector machine algorithms are used to detect phishing or otherwise unsafe websites. The aim of the paper is to detect phishing URLs and to identify the best machine learning algorithm by comparing the accuracy, false positive rate, and false negative rate of each algorithm. The paper analyses the structural features of phishing website URLs, extracts 12 kinds of features, trains four machine learning algorithms, and uses the best-performing algorithm as the model for identifying unknown URLs.
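The pipeline outlined in the abstract (derive features from the URL string, then compare classifiers by accuracy, false positive rate, and false negative rate) can be illustrated with a minimal Python sketch. The abstract does not list the paper's 12 features, so the features below are assumptions chosen only for illustration, and the names `extract_url_features` and `error_rates` are hypothetical helpers rather than the authors' code. Labels are assumed to be 1 for phishing and 0 for legitimate.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for a dotted-quad IPv4 host; it does not validate octet ranges.
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def extract_url_features(url):
    """Return a few lexical features computed from the URL string alone.

    These are illustrative features, not the 12 features used in the paper.
    """
    # Add a scheme when it is missing so urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "has_ip_host": bool(IPV4_PATTERN.fullmatch(host)),
        "uses_https": parsed.scheme == "https",
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


def error_rates(y_true, y_pred):
    """Return (accuracy, false positive rate, false negative rate).

    Label 1 means phishing and label 0 means legitimate (an assumption).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fpr, fnr
```

In such a setup, each candidate classifier would be trained on the feature dictionaries and scored with `error_rates` on held-out URLs, and the one with the best trade-off among the three metrics would be kept as the final model.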

Country : India

¹ Khushbu Digesh Vara, ² Vaibhav Sudhir Dimble, ³ Mansi Mohan Yadav, ⁴ Aarti Ashok Thorat

¹²³⁴ Navsahyadri Education Society’s Group of Institutions, Pune, India

IRJIET, Volume 6, Issue 3, March 2022 pp. 144-148



International Research Journal of Innovations in Engineering and Technology - IRJIET International Open Access, Monthly, Peer Reviewed, Reputed Journal ISSN (online): 2581-3048
