Watch Your Step: The Prevalence of IDN Homograph Attacks
The internationalized domain name (IDN) homograph attack is used to form domain names that visually resemble legitimate domain names but use a different set of characters [1]. For example, the IDN "xn--akmai-yqa.com", which appears in Unicode as "akámai.com", visually resembles the legitimate domain name "akamai.com". Attackers often use IDN homograph attacks to form domain names that serve malicious purposes, such as malware distribution [2] or phishing [3], while appearing trustworthy to victims.
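The relationship between the two forms of the example above can be reproduced with Python's standard library. This is a minimal sketch that relies on the built-in `idna` codec, which implements the older IDNA 2003 rules; it illustrates the encoding only and is not a description of how browsers or the analysis in this post perform the conversion. The helper names `to_unicode` and `to_ascii` are ours.

```python
def to_unicode(domain: str) -> str:
    """Convert an ASCII ("xn--") domain name to its Unicode form."""
    return domain.encode("ascii").decode("idna")


def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII ("xn--") form."""
    return domain.encode("idna").decode("ascii")


print(to_unicode("xn--akmai-yqa.com"))   # akámai.com
print(to_ascii("ak\u00e1mai.com"))       # xn--akmai-yqa.com
```

Names that contain only ASCII characters, such as "akamai.com", pass through both helpers unchanged.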
Prevention of IDN homograph attacks relies on two complementary approaches, one applied at access time and one at registration time. On the client side, major web browsers implement algorithms that attempt to identify IDN homograph attacks and display them in their encoded ("xn--") IDN form instead of Unicode, to reduce their resemblance to legitimate domain names [4][5]. On the server side, the Internet Corporation for Assigned Names and Numbers (ICANN) has issued several policies over the past few years to limit the efficacy of IDN homograph attacks for ccTLD registry operators [6]. Neither approach eliminates the threat posed by IDN homograph attacks; both strike a tradeoff between user experience (supporting internationalized domain names) and security (limiting attacks).
In this blog post, we explore the prevalence of user visits to domain names that were formed using IDN homograph attacks (henceforth referred to as homograph IDNs) and learn about the effectiveness of preventive approaches against the potential underlying threat. We do so by applying, to Akamai's large-scale DNS traffic, an algorithm that classifies whether an IDN was formed by a homograph attack. Then, we track and analyze the access patterns of users to every IDN that was classified as a homograph attack.
The homograph IDNs shown below are examples only, and we caution readers against seeking them out or browsing to them directly. While homograph IDNs are potentially malicious, not all of them are; even so, those that are not malicious are almost certainly not benign.
Detecting IDN homograph attacks
The algorithm classifies an IDN as a homograph attack if two conditions are met. First, the Unicode form of the IDN must resemble a legitimate and popular domain name (without the TLD). The algorithm maintains a fixed list of such domain names that are likely to be spoofed by attackers, and the resemblance is measured using character replacement maps. Second, the IDN and the legitimate, popular domain name it matches must be registered by different owners. Examples of IDNs that were identified as homograph attacks by the algorithm are displayed in Table 1.
| IDN | Unicode | Legitimate match |
|---|---|---|
| xn--alixpress-d4a.com | aliéxpress.com | aliexpress.com |
| xn--go0gl-3we.fm | go0glе.fm | google.com |
| xn--mazon-wqa.com | ámazon.com | amazon.com |
Table 1: Examples of IDN homograph attacks that were detected by the algorithm
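The two conditions can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm used in this analysis: the replacement map `REPLACEMENTS`, the list `POPULAR_NAMES`, and the `same_owner` callback are all hypothetical stand-ins, and the real character replacement maps and ownership checks are not described in this post. The sketch strips diacritics, applies the replacement map, and compares the result against the list of popular names.

```python
import unicodedata

# Hypothetical, deliberately tiny replacement map (lookalike -> ASCII letter).
REPLACEMENTS = {
    "\u0430": "a",  # Cyrillic small letter a
    "\u0435": "e",  # Cyrillic small letter ie
    "\u043e": "o",  # Cyrillic small letter o
    "0": "o",       # digit zero
}

# Hypothetical stand-in for the fixed list of popular names (without TLD).
POPULAR_NAMES = {"akamai", "aliexpress", "amazon", "google"}


def normalize(label):
    """Reduce a Unicode label to a plain-ASCII lookalike where possible."""
    decomposed = unicodedata.normalize("NFKD", label.lower())
    # Drop combining marks, so that "á" becomes "a".
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(REPLACEMENTS.get(ch, ch) for ch in no_marks)


def resembled_name(idn):
    """Condition 1: return the popular name the IDN resembles, or None."""
    unicode_form = idn.encode("ascii").decode("idna")
    label = unicode_form.split(".")[0]  # naive: first label only, TLD ignored
    if label.isascii():
        return None  # not an internationalized label
    candidate = normalize(label)
    return candidate if candidate in POPULAR_NAMES else None


def is_homograph(idn, same_owner):
    """Apply both conditions; same_owner is a hypothetical ownership check."""
    match = resembled_name(idn)
    return match is not None and not same_owner(idn, match)


for idn in ("xn--alixpress-d4a.com", "xn--go0gl-3we.fm", "xn--mazon-wqa.com"):
    print(idn, "->", resembled_name(idn))
```

With this toy map, the three IDNs from Table 1 resolve to "aliexpress", "google", and "amazon" respectively. A production-grade map would need far broader coverage of confusable characters than the four entries shown here.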
The Akamai DNS traffic that the algorithm processed consists of DNS queries made by millions of devices belonging to home users and small businesses, on more than 30 ISP networks worldwide. The number of devices and users is estimated based on source IP addresses. This method can overestimate the count when devices use multiple IP addresses over time. We try to limit this effect by looking at a short time frame of data and sampling frequently (daily), but readers should keep the limitation in mind.
The prevalence of IDN homograph attacks
The IDN homograph detection algorithm was applied on a daily basis over the course of 32 days. Throughout this period, we identified 6,670 homograph IDNs in total. We stress that every identified homograph IDN was queried in DNS traffic by at least one device, in contrast to registered IDNs that are never accessed in traffic. Over the last 7 of the 32 days, we identified an average of 67 newly detected domains per day that had never been seen before (as portrayed in Figure 1). Therefore, we conclude that every day at least several dozen registered homograph IDNs are accessed for the first time, despite the existence of preventive measures.
Moreover, we identified a total of 29,071 devices that accessed at least one homograph IDN within the examined 32-day period. Over the last 7 of the 32 days, an average of more than 850 devices per day accessed homograph IDNs for the first time (see Figure 2). Therefore, we conclude that every day hundreds of Internet devices unintentionally access a homograph IDN for the first time.
On average, a device that accessed at least one homograph IDN on a given day made between 2 and 5 access attempts (measured by DNS queries) to homograph IDNs in total (see Figure 3). Multiple access attempts might result from automatic behavior such as link prefetching [7] or user behavior such as hitting a refresh button. Nevertheless, the fact that the daily average never exceeds 5 access attempts per device strengthens our claim that visits to homograph IDNs are unintentional and not recurring; otherwise, the average would be significantly higher.
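The "first seen" counts behind these averages can be illustrated with a short sketch. It assumes each observation is a (day, key) pair, where the key is either a homograph IDN or a device identifier, and that the reported averages are taken over a trailing window of days; the function names and the toy observations are hypothetical and are not data from this analysis.

```python
from collections import defaultdict


def first_seen_per_day(observations):
    """Map each day to the number of keys observed for the first time that day."""
    first_day = {}
    for day, key in observations:
        if key not in first_day or day < first_day[key]:
            first_day[key] = day
    counts = defaultdict(int)
    for day in first_day.values():
        counts[day] += 1
    return dict(counts)


def mean_over_last_days(counts, total_days, window):
    """Average the daily first-seen counts over the last `window` days."""
    days = range(total_days - window + 1, total_days + 1)
    return sum(counts.get(day, 0) for day in days) / window


# Toy observations: (day index, key). Not real measurements.
toy = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "c"), (3, "d"), (3, "e")]
counts = first_seen_per_day(toy)
print(counts)                              # {1: 2, 2: 1, 3: 2}
print(mean_over_last_days(counts, 3, 2))   # 1.5
```

Averaging only over the trailing days avoids the start of the measurement period, when every key is trivially "new" because no history exists yet.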
Conclusions and takeaways
IDN homograph attacks are used by attackers to form domain names that look trustworthy to victims in order to serve phishing pages and malware. Despite the emergence of preventive measures against IDN homograph attacks, the results of our analysis demonstrate that every day dozens of homograph IDNs are accessed for the first time, and hundreds of new users unintentionally access homograph IDNs for the first time. To best protect your device and network against phishing and malware, we advise using web browsers that implement IDN homograph protections (e.g., Chrome or Firefox) and carefully inspecting domain names for suspicious Unicode characters (i.e., watch your step). Moreover, network security solutions that scan traffic to identify and block IDN homograph attacks provide another layer of defense that reduces the risk of accidentally accessing these potentially malicious domains.
Akamai's Secure Internet Access Enterprise is a cloud-based Secure Web Gateway that uses real-time threat intelligence to identify IDN homograph attacks, as well as other phishing and malware delivery techniques, to protect your network and users.
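As a small aid to manual inspection, the following sketch lists every non-ASCII character in a domain name together with its official Unicode name. It uses only Python's standard `unicodedata` module; the function name `non_ascii_report` is ours, and the check is a convenience for readers rather than a detection method.

```python
import unicodedata


def non_ascii_report(domain):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (index, ch, unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(domain)
        if ord(ch) > 127
    ]


# The last letter of the first label is a Cyrillic character, not a Latin "e".
for position, ch, name in non_ascii_report("go0gl\u0435.fm"):
    print(position, name)   # 5 CYRILLIC SMALL LETTER IE
```

An empty result means the name is plain ASCII, which rules out Unicode lookalikes but not ASCII tricks such as substituting the digit "0" for the letter "o".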
References
[1] https://en.m.wikipedia.org/wiki/IDN_homograph_attack
[2] https://threatpost.com/idn-homograph-attack-spreading-betabot-backdoor/127839/
[4] https://www.chromium.org/developers/design-documents/idn-in-google-chrome
[5] https://wiki.mozilla.org/IDN_Display_Algorithm
[6] https://www.icann.org/resources/pages/announcements-2014-02-19-en