Sinkholes and Internet Hygiene

Published on 13th September 2018, 12:52:49 UTC

Keeping the internet clean can be challenging. There is a lot of badness around, harming not only internet users, organisations and corporate networks but also services that rely on the internet, and sometimes even the integrity and stability of the internet itself. It is therefore essential to maintain a certain level of internet hygiene. Among other things, internet service providers (ISPs) and national computer emergency response teams (CERTs) try to achieve that by collecting information about infected computers (so-called "bots") in order to notify the associated broadband subscriber or network owner about compromised machines.

Questions, more Questions, and Walled Gardens

The first question that usually comes up when contacting a broadband subscriber or network owner about an infected machine is: "How would you know that I'm infected?". The question is then followed by concerns that you are spying on their network: "Are you spying on me?!". Instead of trying to locate and remediate the malware-infected machine, you will have to spend a lot of time explaining to the victim:

  • That you are not a fake or a gangster, and that you do not want to extort any money from her/him
  • How you have obtained the information about the compromise
  • That you are not spying on their network
  • That she/he really has a problem and should locate and clean up the infected machine

If you need to contact hundreds of broadband subscribers or network owners, this can be challenging and very time-consuming. Some ISPs therefore rely on so-called "walled gardens" (sometimes also referred to as a "sandbox"), into which a broadband subscriber is placed when abuse or an infection is reported for their internet line. The goal of a walled garden is to shield the affected (or better said: infected) subscriber from the rest of the internet so that other internet users do not get harmed. The ISP then usually provides the subscriber with certain tools, including anti-virus software, to remediate the infection. Once the infection has been cleaned up, internet access is restored. In short: walled gardening can be an easy and automated way to handle infected broadband subscribers at large scale.

However, the question that remains is: How would an ISP know which customers are infected?

Fighting Bots with Sinkholes

Sinkholes play an important role when it comes to fighting malware and botnets. Sinkholing is a technique used by internet service providers, security researchers, the IT security industry and law enforcement agencies to collect information about infected machines (e.g. the number of infected machines and their geographical location/hotspots). However, the technique can also be used to dismantle botnets, as some security researchers and law enforcement agencies have done in the past (see the section "Further reading").

In order to sinkhole a botnet, security researchers must first identify the botnet command and control infrastructure (C&Cs) used by the malware. These are usually domain names or IP addresses and include any fall-back / backup mechanisms the malware might have (such as domain generation algorithms (DGAs) or simple, hard-coded fall-back C&C IPs or domains). Once these are identified, security researchers have two possibilities:

  • Take down ("nuke") the botnet infrastructure and register any fall-back domain names that are not registered by the miscreants yet
  • Take over the domain names that have already been registered by the miscreants and register any fall-back domain names that are not registered by the miscreants yet
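
Both options involve registering fall-back domain names before the miscreants do. If the fall-back mechanism is a DGA, the domain names can be computed in advance once the algorithm is known. The following is a minimal sketch of that idea. The algorithm, the seed and the function names `toy_dga` and `upcoming_domains` are made up for illustration and do not correspond to any real malware family.

```python
import hashlib
from datetime import date, timedelta


def toy_dga(seed: str, day: date, count: int = 3, tld: str = ".com") -> list[str]:
    """Derive `count` pseudo-random domain names from a seed and a date.

    This is a made-up algorithm for illustration, not one used by real malware.
    """
    domains = []
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode()
        digest = hashlib.md5(material).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to build a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


def upcoming_domains(seed: str, start: date, days: int) -> dict[date, list[str]]:
    """List the domains the toy DGA yields for each of the next `days` days."""
    plan = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        plan[day] = toy_dga(seed, day)
    return plan
```

Calling `upcoming_domains("example-seed", date(2018, 9, 13), 7)` would give a one-week list of candidate domain names to check and register. Because the output depends only on the seed and the date, the researcher and the bot arrive at the same names independently.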

While the second option is the most effective one in terms of impact, it is also the most challenging one. Once a domain name has been registered by someone else, you cannot simply take it over. You would have to wait until the domain name expires and then register it yourself. However, a court could force a domain registrar or a domain registry to hand over certain domain names to law enforcement. The problem here is that the internet is global, and registrars as well as registries may be located abroad and hence fall under a different jurisdiction. As an example: you want to go after a botnet that uses domain names from the TLDs .com, .org, .ru and .de. For .com and .org you would have to involve law enforcement from the US, for .ru from Russia and for .de from Germany.
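
To get an overview of how many jurisdictions a takeover would involve, the C&C domain names can be grouped by TLD. The sketch below does this for the four TLDs from the example above. The mapping table and the function name `group_by_jurisdiction` are assumptions for illustration; a real mapping would have to cover registrars as well as registries.

```python
from collections import defaultdict

# Jurisdictions as named in the example above; any other TLD is left unmapped.
TLD_JURISDICTION = {"com": "US", "org": "US", "ru": "Russia", "de": "Germany"}


def group_by_jurisdiction(domains):
    """Group domain names by the jurisdiction of their TLD registry."""
    groups = defaultdict(list)
    for domain in domains:
        # The TLD is the last label; ignore a trailing dot and letter case.
        tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
        groups[TLD_JURISDICTION.get(tld, "unknown")].append(domain)
    return dict(groups)
```

Domains whose TLD is not in the table end up under "unknown", which makes gaps in the mapping visible instead of silently dropping them.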

Once security researchers have obtained a botnet C&C domain (either by registering it themselves or by taking it over), they point it to a special server: the so-called sinkhole. This sinkhole server runs a special piece of software that usually records:

  • Timestamp (including timezone)
  • The connecting IP address (aka remote IP address aka IP address of the victim's machine)
  • The remote port (aka client port)
  • The local port (aka server port)

Depending on the protocol used by the malware, it is useful to record additional information that would help victims that are behind a NAT (e.g. corporate networks) to identify the compromised machine. For HTTP, such additional information could be:

  • HTTP host header (e.g. domain name)
  • HTTP user-agent
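
The fields listed above could be collected into one record per connection, roughly as in the following sketch. It is not the software of any particular sinkhole. The function names, the field names and the example values are assumptions, and the request body is deliberately not stored.

```python
import json
from datetime import datetime, timezone


def parse_http_headers(raw_request: bytes) -> dict[str, str]:
    """Return request headers with lower-cased names; malformed lines are skipped."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def build_record(remote_addr, local_addr, raw_request, now=None):
    """Build one sinkhole log record from (ip, port) tuples and the raw request."""
    now = now or datetime.now(timezone.utc)
    headers = parse_http_headers(raw_request)
    return {
        "timestamp": now.isoformat(),  # includes the UTC offset
        "remote_ip": remote_addr[0],
        "remote_port": remote_addr[1],
        "local_port": local_addr[1],
        "http_host": headers.get("host"),
        "http_user_agent": headers.get("user-agent"),
    }


def to_log_line(record) -> str:
    """Serialise a record as a single JSON line."""
    return json.dumps(record, sort_keys=True)
```

With a plain TCP socket, `remote_addr` and `local_addr` would typically come from `conn.getpeername()` and `conn.getsockname()`. Writing one JSON line per connection keeps the log easy to filter by network when reports are prepared.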

The information above should be sufficient for the victim to locate and identify the infected machine even in large environments such as corporate networks.
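
On the victim's side, locating the machine usually means matching the reported connection against the logs of the NAT gateway or proxy. A minimal sketch of such a lookup follows. The log format, the field names and the function name `find_internal_hosts` are hypothetical, since every NAT device logs translations differently.

```python
from datetime import datetime, timedelta


def find_internal_hosts(report, nat_log, tolerance=timedelta(seconds=5)):
    """Return internal IPs whose NAT translation matches a sinkhole report.

    `report` carries the timestamp, remote IP and remote port seen by the
    sinkhole; `nat_log` is a list of translation entries (hypothetical format).
    All timestamps are ISO 8601 strings that include a UTC offset.
    """
    seen = datetime.fromisoformat(report["timestamp"])
    matches = []
    for entry in nat_log:
        same_ip = entry["public_ip"] == report["remote_ip"]
        same_port = entry["public_port"] == report["remote_port"]
        logged = datetime.fromisoformat(entry["timestamp"])
        # Allow a small clock difference between sinkhole and gateway.
        if same_ip and same_port and abs(logged - seen) <= tolerance:
            matches.append(entry["internal_ip"])
    return matches
```

This is why the timezone and the remote port matter: behind a NAT, many machines share one public IP address, and only the combination of time and source port points to a single internal host. Where a proxy is in use, the HTTP host header and user-agent can be matched against the proxy log in the same way.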

Once a bot connects to the sinkhole, the corresponding connection information gets stored.

Victim Data & GDPR

Operating a sinkhole comes not only with a handful of technical difficulties but also with some legal aspects that need to be considered. Infected computers often post data stolen from the victim's computer to the botnet C&C servers or, in this case, to the sinkhole. Such data can be personally identifiable information (PII), screenshots, keystrokes, credentials such as usernames or passwords, email addresses or even credit card information. While most malware families encrypt the botnet C&C communication these days, there are still some malware families that send stolen data unencrypted over the wire (and hence towards your sinkhole). Also, some malware families use weak encryption, so sinkhole operators may even be able to decrypt the botnet traffic. While the data posted by the malware to the sinkhole may help security researchers to identify and notify the victim, it might be PII.

Another problem that recently came up is the EU data protection law called GDPR, which makes operating a sinkhole much more difficult. There is also a lot of discussion going on about whether an IP address is PII or not. In the worst case, operating a sinkhole may not be compliant with (local) regulations and hence become illegal.


Many sinkholes are operated by security researchers or law enforcement agencies on a non-profit basis. However, there are also some IT security vendors that operate sinkholes and sell the data to their customers (there are various business models, e.g. you pay a certain amount of money per year to get notified when an infected machine appears in your network).

In the past 6 years, has sinkholed more than 50'000 domain names and helped to identify and remediate millions of infected computers worldwide. At the moment, the sinkhole handles over 20'000 domain names and up to 1'000 HTTP requests per second. To deliver information about infected machines to network owners and national CERTs, partners with Shadowserver and Spamhaus.

If you are a network owner or a national CERT, you can get reports about infected machines in your constituency for free from Shadowserver and Spamhaus:

Further reading
