Identification of Potential Malicious Web Pages

Le, V. L., Welch, I., Gao, X. and Komisarczuk, P.

    Malicious web pages are an emerging security concern on the Internet due to their popularity and their potential serious impact. Detecting and analysing them are very costly because of their qualities and complexities. In this paper, we present a lightweight scoring mechanism that uses static features to identify potential malicious pages. This mechanism is intended as a filter that allows us to reduce the number suspicious web pages requiring more expensive analysis by other mechanisms that require loading and interpretation of the web pages to determine whether they are malicious or benign. Given its role as a filter, our main aim is to reduce false positives while minimising false negatives. The scoring mechanism has been developed by identifying candidate static features of malicious web pages that are evaluate using a feature selection algorithm. This identifies the most appropriate set of features that can be used to efficiently distinguish between benign and malicious web pages. These features are used to construct a scoring algorithm that allows us to calculate a score for a web page’s potential maliciousness. The main advantage of this scoring mechanism compared to a binary classifier is the ability to make a trade-off between accuracy and performance. This allows us to adjust the number of web pages passed to the more expensive analysis mechanism in order to tune overall performance.
Cite as: Le, V. L., Welch, I., Gao, X. and Komisarczuk, P. (2011). Identification of Potential Malicious Web Pages. In Proc. Australasian Information Security Conference (AISC 2011) Perth, Australia. CRPIT, 116. Colin Boyd and Josef Pieprzyk Eds., ACS. 33-40
pdf (from crpit.com) pdf (local if available) BibTeX EndNote GS