Identifying malicious web sites has become a major challenge in today’s Internet. Previous work focused on detecting whether a web site is malicious by dynamically executing JavaScript in instrumented environments or by rendering web sites in client honeypots. Both techniques bear a significant evaluation overhead, since the analysis can take up to tens of seconds or even minutes per sample. In this paper, we introduce a novel, purely static analysis approach, the Delta-system, that (i) extracts change-related features between two versions of the same web site, (ii) uses a machine-learning algorithm to derive a model of web site changes, (iii) detects whether a change was malicious or benign, (iv) identifies the underlying infection vector campaign based on clustering, and (v) generates an identifying signature. We demonstrate the effectiveness of the Delta-system by evaluating it on a dataset of over 26 million pairs of web sites, running it alongside a web crawler for a period of four months. Over this time span, the Delta-system successfully identified previously unknown infection campaigns, including a campaign that targeted installations of the Discuz!X Internet forum software by injecting infection vectors into these forums and redirecting forum readers to an installation of the Cool Exploit Kit.
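To make step (i) more concrete, the following is a minimal Python sketch of what extracting change-related features between two versions of a page could look like. It is not the Delta-system's actual feature extraction, which the abstract does not specify. The names `TagCollector`, `collect_elements`, and `change_features`, the choice to compare multisets of start tags, and the treatment of `script` and `iframe` as noteworthy tags are all assumptions made for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag together with its attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # Normalize attributes so that attribute order does not matter;
        # valueless attributes (value None) are stored as empty strings.
        normalized = tuple(sorted((name, value or "") for name, value in attrs))
        self.elements.append((tag, normalized))


def collect_elements(html):
    """Return a multiset of (tag, attributes) pairs found in the HTML text."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return Counter(parser.elements)


def change_features(old_html, new_html):
    """Compare two versions of a page and describe which elements changed.

    Assumption: elements that appear in only one version are the interesting
    "change" signal; a real system would likely use richer features.
    """
    old = collect_elements(old_html)
    new = collect_elements(new_html)
    added = new - old      # elements present only in the new version
    removed = old - new    # elements present only in the old version
    suspicious = sum(
        count for (tag, _), count in added.items() if tag in ("script", "iframe")
    )
    return {"added": added, "removed": removed, "suspicious_added": suspicious}
```

Calling `change_features(old_html, new_html)` on two snapshots of the same page returns the added and removed elements as counters, which could then be turned into a numeric feature vector for a classifier. Because the comparison only parses the markup and never executes it, the sketch stays within the purely static setting the abstract describes.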
@inproceedings{Borgolte2013Delta_Automatic,
  title     = {{Delta: Automatic Identification of Unknown Web-based Infection Campaigns}},
  author    = {Borgolte, Kevin and Kruegel, Christopher and Vigna, Giovanni},
  booktitle = {Proceedings of the 20th ACM Conference on Computer and Communications Security},
  series    = {CCS},
  month     = {November},
  year      = {2013},
  publisher = {ACM}
}