Who’s Scanning the IPv6 Space? And, Frankly, Why Do We Even Care?
Executive summary
Akamai researchers in collaboration with the Max Planck Institute for Informatics have completed the first empirical study of large-scale vulnerability scanning in the IPv6 space and have compiled their findings.
The findings confirm that scanners have started sweeping the IPv6 space. Scanning the IPv6 space is vastly more complicated than scanning IPv4, and thus scanning activity in IPv6 is still rare when compared with the scanning activity in IPv4.
Scan traffic is heavily concentrated among a small number of active sources; geographically, the two most active sources are scanning from China. Looking at the top networks sourcing scan traffic, there are also scans conducted by cybersecurity companies and scans executed from cloud provider address space.
Scanners often do not send scan traffic from a single 128-bit IPv6 address. Instead, they spread their probes across myriad source addresses drawn from prefixes as large as entire /32s, likely to evade detection. This makes detecting and blocking scanners exceedingly difficult.
Reliable IPv6 scan detection and blocking will present a looming challenge if and when vulnerability scanning of the IPv6 space becomes more commonplace.
Introduction
Attackers, researchers, and defenders alike have long used scanning to find and expose vulnerabilities in internet-connected devices. Their goals differ widely, but with tens of thousands of continuously active scan sources in IPv4, scanning is a cornerstone for anyone looking for, or defending against, cyberthreats. When new vulnerabilities in software are found, attackers race to scan the address space to find vulnerable hosts for exploitation or infection. At the same time, botnets continuously scan the address space to find new exploitable targets for lateral spreading. Security companies and researchers scan the address space to study its properties and identify weak spots. In short, the IPv4 space is busy.
But what about IPv6? Is anybody scanning the IPv6 space? Should we even be focusing on it at all?
The answer is simple: We can no longer ignore IPv6 from a security perspective. For our new paper, which we’ll present at the ACM Internet Measurement Conference 2022, we used the firewall logs captured at Akamai’s edge servers to illuminate scanning activity in the IPv6 internet. Over 15 months we studied who scans, what they scan for, and what the key challenges are when it comes to identifying and blocking such scans.
Why is scanning the IPv6 space difficult?
The IPv4 space consists of some 4 billion addresses, and the entire space can nowadays be scanned in less than one hour from a high-bandwidth machine. Similarly, low-bandwidth IoT bots just generate and scan random IPv4 addresses for lateral spreading. Randomly generating target addresses is typically sufficient to find a responsive, and exploitable, host, given enough time.
IPv6, on the other hand, with its 128-bit address length, yields roughly 3.4 × 10^38 (2^128) addresses, which is many orders of magnitude more than the number of addresses in the IPv4 space. Scanning the entirety of IPv6 would require at least trillions of years with today’s technology. This makes exhaustive scanning impossible, so an attacker who wants to scan needs other ways of finding target addresses, for example by using hitlists of IPv6 addresses or by using DNS to resolve domain names to IPv6 addresses. Although the vastness of IPv6 makes it difficult for attackers to find targets through random scanning, this does not mean IPv6 can be ignored. Doing so would be akin to the “security by obscurity” ideology, with its recommendations to “secure” IPv6 by simply “blocking it.”
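As a rough, back-of-the-envelope illustration of this scale gap, the following Python sketch assumes a scanner that sweeps all of IPv4 in exactly one hour (the upper bound mentioned above) and applies the same probe rate to IPv6. The probe rate and the helper name years_to_scan are assumptions made for illustration, not measurements from our study.

```python
# Back-of-the-envelope comparison of IPv4 and IPv6 scan times.
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128

# Assumption: one complete IPv4 sweep takes one hour.
IPV4_SWEEP_SECONDS = 3600
probes_per_second = IPV4_ADDRESSES / IPV4_SWEEP_SECONDS

SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_scan(address_count: int, rate: float) -> float:
    """Years needed to probe address_count addresses at rate probes/second."""
    return address_count / rate / SECONDS_PER_YEAR


# Time to sweep the whole IPv6 space, and a single /64 subnet, at that rate.
ipv6_years = years_to_scan(IPV6_ADDRESSES, probes_per_second)
single_64_years = years_to_scan(2 ** 64, probes_per_second)

print(f"probe rate:      {probes_per_second:,.0f} packets/s")
print(f"full IPv6 space: {ipv6_years:.2e} years")
print(f"one /64 subnet:  {single_64_years:,.0f} years")
```

Under this assumption, a full IPv6 sweep works out to roughly 9 × 10^24 years, and even a single /64 would take on the order of 490,000 years.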
Why is detecting IPv6 scans difficult?
In the IPv4 space, every routed address receives thousands of scanning packets every day – a result of actors scanning the entire IPv4 space, or randomly generating target addresses to find vulnerable hosts. Thus, monitoring arriving scanning traffic in unused prefixes (“darknets”) presents us with a straightforward way to study overall scanning trends in the IPv4 space.
When it comes to IPv6, however, the first challenge is to find vantage points that can see sufficient levels of scan traffic. As mentioned above, randomly targeting the full IPv6 space is not practical. As a result, monitoring traffic on unused IPv6 prefixes (darknets, which are common for IPv4) will hardly reveal any scanning traffic. This is where the Akamai Intelligent Edge Platform comes into play. We operate hundreds of thousands of machines in more than 1,300 networks. Each server hosts the content of our customers, and since we return their addresses as responses to DNS requests, the IPv6 addresses of our machines are exposed to the internet. Therefore, they can be found by scanners and are part of publicly available IPv6 hitlists.
The second challenge is to pinpoint and isolate IPv6 scan sources. Since both attackers and legitimate scanning sources are actively scanning, isolation is important to understand the scanning landscape as it stands today. Moreover, isolation will be key in production environments if a network wants to block traffic from a malicious actor. Studying scan traffic, we noticed that some scanners do not use a single 128-bit source IPv6 address to send their scan probes, but instead source their traffic from myriad source addresses in less-specific prefixes such as /64s, /48s, or even /32s. Again, the size of IPv6 makes it possible; a scanner can easily obtain a large IPv6 allocation (e.g., by obtaining a /32 directly from the Regional Internet Registries (RIRs), or even by simply subscribing to internet service providers, some of which allocate an entire /48 prefix per subscriber). We find that spreading traffic across prefixes may be used to avoid detection, since every source address may only emit a single packet. Other scanners seem to use the available bits in IPv6 addresses to encode scan information.
How do we detect large-scale IPv6 scans?
Akamai’s firewall logs unsolicited traffic sent to our edge machines, including scan traffic. For scan detection, we require that a given source targets at least 100 destination IP addresses of our machines, similar to how we detected large-scale IPv4 scans in earlier work.
As mentioned above, some scanners source their traffic from larger prefixes. We therefore report detected scanners at three source aggregation levels: without aggregation (/128 sources), after aggregating all packets from the same source /64, and after aggregating all packets from the same source /48.
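The detection rule can be sketched in a few lines of Python. This is a minimal illustration, not our production pipeline: the record format (source, destination pairs), the function names, and the synthetic traffic are assumptions, it ignores the time window over which destinations are counted, and all addresses come from the 2001:db8::/32 documentation range.

```python
import ipaddress
from collections import defaultdict

MIN_DESTINATIONS = 100  # threshold from the scan definition in the text


def aggregate_source(addr: str, prefix_len: int) -> str:
    """Map a source address to its covering prefix of the given length."""
    return str(ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False))


def detect_scanners(records, prefix_len, min_destinations=MIN_DESTINATIONS):
    """Return aggregated sources that target at least min_destinations
    distinct destination addresses. records: iterable of (src, dst) strings."""
    destinations = defaultdict(set)
    for src, dst in records:
        key = aggregate_source(src, prefix_len)
        destinations[key].add(ipaddress.ip_address(dst))
    return {src for src, dsts in destinations.items()
            if len(dsts) >= min_destinations}


# Synthetic traffic from three hypothetical actors.
records = []
# Actor 1: 150 source addresses in one /64, one probe each.
for i in range(150):
    records.append((f"2001:db8:1:1::{i + 1:x}", f"2001:db8:ffff::{i + 1:x}"))
# Actor 2: a single source address probing 120 destinations.
for i in range(120):
    records.append(("2001:db8:2:2::1", f"2001:db8:ffff::{i + 1:x}"))
# Actor 3: four /64s in one /48, 30 distinct destinations per /64.
for j in range(4):
    for i in range(30):
        records.append((f"2001:db8:3:{j:x}::1",
                        f"2001:db8:ffff::{j * 30 + i + 1:x}"))

detected = {plen: detect_scanners(records, plen) for plen in (128, 64, 48)}
for plen, sources in detected.items():
    print(f"/{plen}: {len(sources)} scan source(s): {sorted(sources)}")
```

In this synthetic example, only the second actor is visible without aggregation, the first actor appears once its /64 is aggregated, and the third actor only crosses the threshold at the /48 level.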
IPv6 scanning trends in 2021 and 2022
Fig. 1: Weekly detected IPv6 scan sources for different source aggregation levels
Figure 1 shows how many scanners we detected on a weekly basis from January 2021 through March 2022. The level at which we aggregate packet sources has a major effect on the number of detected scan sources. Overall, we find a relatively stable number of scanners throughout our measurement period, mostly ranging from 10 to 100 weekly active scan sources, which is very small compared with the hundreds of thousands of scanners targeting the IPv4 space. There is a notable uptick in /128 scan sources starting in November 2021, but all of these /128 sources aggregate into a single /64 and are used by a single scanning actor, a US-based cybersecurity company.
IPv6 scan traffic concentration
Fig. 2: Weekly scan packets (/64 source aggregation)
Figure 2 shows the number of scan packets captured per week, with the most active and second most active scanner of each week shown separately from all other scan sources. Strikingly, at essentially all times a single most active scanner sends the overwhelming majority of scan traffic (up to 92% per week), while the other, less active scanners contribute comparatively little. This raises the question: Who are the top scanners?
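A per-week concentration figure of this kind can be computed as sketched below. The input format (one (week, source /64) tuple per logged scan packet), the function name top_source_share, and the packet counts are all made up for illustration and are not taken from our data.

```python
from collections import Counter, defaultdict


def top_source_share(packets):
    """For each week, return (most active source, its share of packets).

    packets: iterable of (week, source_prefix) tuples, one per scan packet.
    """
    per_week = defaultdict(Counter)
    for week, src in packets:
        per_week[week][src] += 1

    shares = {}
    for week, counts in per_week.items():
        top_src, top_count = counts.most_common(1)[0]
        shares[week] = (top_src, top_count / sum(counts.values()))
    return shares


# Synthetic example: one dominant source and two minor ones in a single week.
packets = (
    [("2021-W01", "2001:db8:a::/64")] * 80
    + [("2021-W01", "2001:db8:b::/64")] * 15
    + [("2021-W01", "2001:db8:c::/64")] * 5
)

shares = top_source_share(packets)
for week, (src, share) in shares.items():
    print(f"{week}: top source {src} sent {share:.0%} of scan packets")
```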
Scan source networks
The Table shows the top 20 networks (Autonomous Systems, ASes) that source scan traffic, sorted by scan packets. For each network, we also show the number of /48, /64, and /128 prefixes detected as scanners, along with the network type and the origin country.
Table: Top 20 scan source networks (ASes) by volume. We note that the number of /48 scan sources can exceed /64s or /128s if the combined traffic from the /48 satisfies the scan definition, but traffic from contained /64s does not (e.g., AS #18).
We find that scanning is heavily concentrated: The top five source networks account for almost 93% of all scan traffic. Notably, the two most active source networks are data centers located in China, followed by a number of cybersecurity companies and major cloud providers. Overall, we find that most scanning traffic originates from data centers/cloud providers, and in our top 20 we do not find a single end-user ISP.
Looking at the scan sources inside the ASes, we find striking differences across networks. While the most active scanners source all traffic from single 128-bit source addresses, we find some networks (e.g., AS #18) in which a single scanning actor emits scan traffic from more than 1,000 different /48 prefixes. At the same time, AS #18 accounts for less than 0.1% of all logged scan traffic, but may skew analysis when ordering networks by active source prefixes.
Challenges and conclusion
We raise a key question when dealing with IPv6 scans: What is the correct prefix size to identify an individual scan actor? As shown in the Table, the answer varies: While the scanner in AS #1 can, and should, be identified using its 128-bit IPv6 address, the scanning actor in AS #18 needs to be aggregated to an entire /32 prefix. In general, aggregating up to such a prefix length could conflate different individual scanning sources, notably in cloud providers where individual users are often allocated /64, /96, or even more specific prefixes. In operational settings, if scan detection leads to blocklisting, overly coarse aggregation will also block traffic from legitimate sources.
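This trade-off can be made concrete with a small Python sketch. The helper names covering_prefix and is_blocked are hypothetical, and the addresses are invented examples from the 2001:db8::/32 documentation range. The sketch derives the longest prefix that contains all observed source addresses of one suspected actor and then checks whether blocking that prefix would also catch an unrelated address.

```python
import ipaddress


def covering_prefix(addresses):
    """Longest IPv6 prefix that contains every address in addresses."""
    ints = [int(ipaddress.IPv6Address(a)) for a in addresses]
    lo, hi = min(ints), max(ints)
    # The bits shared by the smallest and largest address are shared by all.
    prefix_len = 128 - (lo ^ hi).bit_length()
    return ipaddress.IPv6Network((lo, prefix_len), strict=False)


def is_blocked(addr, blocklist):
    """True if addr falls inside any prefix on the blocklist."""
    ip = ipaddress.IPv6Address(addr)
    return any(ip in net for net in blocklist)


# Source addresses observed from one suspected scanning actor (synthetic).
scanner_sources = [
    "2001:db8:12:1::1",
    "2001:db8:12:ff00::2",
    "2001:db8:13:5::9",
]
# An unrelated host that happens to sit in the same address range.
bystander = "2001:db8:13:abcd::1"

coarse = covering_prefix(scanner_sources)
fine = [ipaddress.IPv6Network(f"{a}/128") for a in scanner_sources]

print(f"covering prefix: {coarse}")
print(f"bystander blocked by covering prefix: {is_blocked(bystander, [coarse])}")
print(f"bystander blocked by /128 entries:    {is_blocked(bystander, fine)}")
```

Blocking the covering prefix stops every address the actor might use within it, but it also blocks the bystander. Blocking only the observed /128s spares the bystander, yet misses any source address the scanner has not used before.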
Our initial analysis shows that there is still a relatively small number of scanners targeting the IPv6 space, especially when compared with the better-known scanners that are targeting IPv4. IPv6 scan traffic is heavily concentrated, and a small number of sources dominate the picture. We argue that this situation may quickly change if and when vulnerability scanning in the IPv6 space becomes more commonplace. Identifying, and correctly pinpointing, IPv6 scan sources presents a looming challenge.
For more details, please read our paper.