Unsolicited bulk email (spam) is used by cybercriminals to lure users into scams and to spread malware infections. Most of these unwanted messages are sent by spam botnets, which are networks of compromised machines under the control of a single (malicious) entity. Often, these botnets are rented out to particular groups to carry out spam campaigns, in which similar mail messages are sent to a large group of Internet users in a short amount of time. Tracking the bot-infected hosts that participate in spam campaigns, and attributing these hosts to spam botnets that are active on the Internet, are challenging but important tasks. In particular, this information can improve blacklist-based spam defenses and guide botnet mitigation efforts. In this paper, we present a novel technique to support the identification and tracking of bots that send spam. Our technique takes as input an initial set of IP addresses that are known to be associated with spam bots, and learns their spamming behavior. This initial set is then “magnified” by analyzing large-scale mail delivery logs to identify other hosts on the Internet whose behavior is similar to the behavior previously modeled. We implemented our technique in a tool, called BOTMAGNIFIER, and applied it to several data streams related to the delivery of email traffic. Our results show that it is possible to identify and track a substantial number of spam bots by using our magnification technique. We also perform attribution of the identified spam hosts and track the evolution and activity of well-known spamming botnets over time. Moreover, we show that our results can help to improve state-of-the-art spam blacklists.
@inproceedings{Stringhini2011BOTMAGNIFIER_Locating,
  title = {{BOTMAGNIFIER: Locating Spambots on the Internet}},
  author = {Stringhini, Gianluca and Holz, Thorsten and Stone-Gross, Brett and Kruegel, Christopher and Vigna, Giovanni},
  booktitle = {Proceedings of the 20th USENIX Conference on Security},
  series = {USENIX Security},
  year = {2011},
  address = {Berkeley, CA, USA},
  pages = {28--28},
  publisher = {USENIX Association},
  url = {http://dl.acm.org/citation.cfm?id=2028067.2028095}
}