Critical 9.9 Vulnerability in Hyper-V Allowed Attackers to Exploit Azure
Executive Summary
Guardicore Labs, in collaboration with SafeBreach Labs, found a critical vulnerability in Hyper-V’s virtual network switch driver (vmswitch.sys).
Hyper-V serves as the underlying virtualization technology for Azure — Microsoft’s public cloud.
The vulnerability allows for both remote code execution and denial of service. Exploiting it allowed an attacker with an Azure virtual machine to take down whole regions of the cloud, as well as run arbitrary code on the Hyper-V host.
The vulnerability first appeared in a vmswitch build from August 2019, suggesting this bug may have been in production for more than a year and a half.
In May 2021, Microsoft assigned the vulnerability CVE-2021-28476 with a CVSS score of 9.9 and released a patch for it.
The vulnerability was found by Guardicore’s Ophir Harpaz and SafeBreach’s Peleg Hadar using an in-house developed fuzzer named hAFL1. The research process is discussed in their Black Hat USA 2021 talk.
Impact and consequences
The vulnerability lies in vmswitch.sys — Hyper-V’s network switch driver. It is triggered by sending a specially crafted packet from a guest virtual machine to the Hyper-V host and can be exploited to obtain both remote code execution (RCE) and denial of service (DoS).
The security flaw first appeared in a build from August 2019, suggesting that the bug was in production for more than a year and a half. It affected Windows 7, 8.1, and 10, and Windows Server 2008, 2012, 2016, and 2019.
Hyper-V is Azure’s hypervisor; for this reason, a vulnerability in Hyper-V entails a vulnerability in Azure, and can affect whole regions of the public cloud. Triggering DoS from an Azure VM would crash major parts of Azure’s infrastructure and take down all virtual machines that share the same host.
With a more complex exploitation chain, the vulnerability can grant the attacker RCE capabilities. These, in turn, render the attacker omnipotent; with control over the host and all VMs running on top of it, the attacker can access personal information stored on these machines, run malicious payloads, and so forth.
Understanding the bug
vmswitch — a paravirtualized device
In Hyper-V terminology, the host operating system runs in the “root partition” and any guest operating system runs inside a “child partition.” To provide the child partitions with interfaces to hardware devices, Hyper-V makes extensive use of paravirtualized devices. With paravirtualization, a VM knows it is virtual; both the VM and the host use modified hardware interfaces, resulting in much better performance. One such paravirtualized device is the networking switch, which was our research target.
Each paravirtualized device consists of two components:
A virtual service consumer (VSC) that runs in the child partition. netvsc.sys is the networking VSC.
A virtual service provider (VSP) that runs in the root partition. vmswitch.sys is the networking VSP.
The two components talk to each other over VMBus, an inter-partition communication protocol based on hypercalls (Figure 1).
Communication protocols
netvsc (the networking consumer) communicates with vmswitch (the provider) over VMBus, using packets of type NVSP. These packets serve various purposes: initializing and establishing a VMBus channel between the two components, configuring various parameters of the communication, and sending data to the Hyper-V host or other VMs. NVSP comprises many different packet types; one of them, NVSP_MSG1_TYPE_SEND_RNDIS_PKT, is used to send RNDIS packets.
RNDIS and OIDs
The Remote Network Driver Interface Specification (RNDIS) defines a message protocol between a host computer and an RNDIS device over abstract control and data channels. In a Hyper-V setup, the host computer is (quite confusingly) a guest VM, the RNDIS device is vmswitch or the external network adapter, and the abstract communication channel is VMBus.
RNDIS has various message types as well — init, set, query, reset, halt, etc. When a VM wishes to set (or query) certain parameters of its network adapter, it sends vmswitch an OID request — a message with the relevant object identifier (OID) and its parameters (Figure 2). Two examples of such OIDs are OID_GEN_MAC_ADDRESS, which is used to set the adapter’s MAC address, and OID_802_3_MULTICAST_LIST, which is used to set the adapter’s current multicast address list.
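To make the OID request format concrete, here is a minimal sketch of how an RNDIS set message carrying an OID could be serialized and parsed. The field order and the numeric constants follow the public RNDIS specification and should be checked against it; the helper names build_rndis_set and parse_rndis_set are made up for this illustration and are not part of netvsc or vmswitch.

```python
import struct
from typing import Tuple

# Constants as given in the public RNDIS / NDIS headers; treat them as
# assumptions to verify against the specification.
RNDIS_MSG_SET = 0x00000005
OID_802_3_MULTICAST_LIST = 0x01010103

# MessageType, MessageLength, RequestId, Oid,
# InformationBufferLength, InformationBufferOffset, reserved handle
SET_HEADER = struct.Struct("<7I")


def build_rndis_set(request_id: int, oid: int, info: bytes) -> bytes:
    """Serialize an RNDIS set message carrying one OID and its parameters."""
    # The offset is counted from the RequestId field, which sits 8 bytes in.
    info_offset = SET_HEADER.size - 8
    header = SET_HEADER.pack(
        RNDIS_MSG_SET,
        SET_HEADER.size + len(info),
        request_id,
        oid,
        len(info),
        info_offset,
        0,
    )
    return header + info


def parse_rndis_set(message: bytes) -> Tuple[int, bytes]:
    """Return (oid, information buffer) after basic bounds checks."""
    msg_type, msg_len, _req, oid, info_len, info_off, _rsvd = (
        SET_HEADER.unpack_from(message)
    )
    if msg_type != RNDIS_MSG_SET or msg_len != len(message):
        raise ValueError("not a well-formed RNDIS set message")
    start = 8 + info_off
    if start < SET_HEADER.size or start + info_len > len(message):
        raise ValueError("information buffer out of bounds")
    return oid, message[start:start + info_len]


# One multicast MAC address as the OID's parameter.
multicast_list = bytes.fromhex("01005e000001")
message = build_rndis_set(1, OID_802_3_MULTICAST_LIST, multicast_list)
oid, info = parse_rndis_set(message)
```

The parser only checks lengths and offsets. It says nothing about whether the receiver should trust the contents of the information buffer, which is the question the vulnerability turns on.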
Virtual switch extensions
vmswitch, Hyper-V’s virtual switch, is also called the Hyper-V Extensible Switch. Its extensions are NDIS filter drivers or Windows Filtering Platform drivers that run inside the switch itself and can capture, filter, or forward the packets that they process. The Hyper-V Extensible Switch has a control path for OID requests, as shown in Figure 3.
The vulnerability
A notorious OID
Some OID requests are destined for the external network adapter, or other network adapters connected to vmswitch. Such OID requests include, for example, hardware offloading, Internet Protocol security, and single root I/O virtualization requests.
When these requests arrive at the vmswitch interface, they are encapsulated and forwarded down the extensible switch control path using a special OID of type OID_SWITCH_NIC_REQUEST. A new OID request is formed as an NDIS_SWITCH_NIC_OID_REQUEST structure, whose member OidRequest points to the original OID request. The resulting message goes through the vmswitch control path until it reaches its destination driver. The flow can be seen in Figures 4 and 5.
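The encapsulation described above can be modeled in a few lines. This is a simplified sketch, not the real NDIS definitions: the two dataclasses are stand-ins that keep only the fields needed here, the numeric OID values are assumptions to verify against ntddndis.h, and encapsulate and unwrap are hypothetical helper names.

```python
from dataclasses import dataclass

# Assumed numeric values, for illustration only; verify against ntddndis.h.
OID_SWITCH_NIC_REQUEST = 0x00010270
EXAMPLE_OFFLOAD_OID = 0xFC01020C


@dataclass
class OidRequest:
    """Simplified stand-in for NDIS_OID_REQUEST."""
    oid: int
    info: object


@dataclass
class SwitchNicOidRequest:
    """Simplified stand-in for NDIS_SWITCH_NIC_OID_REQUEST."""
    source_port_id: int
    destination_port_id: int
    # Plays the role of the OidRequest pointer member: a reference to the
    # original request rather than a copy of it.
    oid_request: OidRequest


def encapsulate(original: OidRequest, source_port: int,
                destination_port: int) -> OidRequest:
    """Wrap a request so it can travel down the switch control path."""
    wrapper = SwitchNicOidRequest(source_port, destination_port, original)
    return OidRequest(OID_SWITCH_NIC_REQUEST, wrapper)


def unwrap(request: OidRequest) -> OidRequest:
    """Return the inner request if this is a wrapper, else the request itself."""
    if request.oid != OID_SWITCH_NIC_REQUEST:
        return request
    return request.info.oid_request


inner = OidRequest(EXAMPLE_OFFLOAD_OID, b"\x01")
outer = encapsulate(inner, source_port=1, destination_port=2)
```

In Python the inner request is a safe object reference. In the driver it is a raw pointer, so anything that unwraps the request depends on that pointer being valid.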
The buggy code
While processing OID requests, vmswitch traces their content for logging and debugging purposes; this also applies to OID_SWITCH_NIC_REQUEST. Because this request encapsulates another one, vmswitch must handle it specially and dereference OidRequest to trace the inner request as well. The bug is that vmswitch never validates the value of OidRequest and can thus dereference an invalid pointer.
The following steps lead to the vulnerable function in vmswitch:
The message is first processed by RndisDevHostControlMessageWorkerRoutine, a generic RNDIS message handler.
vmswitch identifies a set request and passes the message to a more specific handler — RndisDevHostHandleSetMessage.
Later, the message is passed to VmsIfrInfoParamsNdisOidRequestBuffer. This function is responsible for tracing the message parameters using the Inflight Trace Recorder, a Windows tracing feature that enables logging binary messages in real time.
Finally, the packet reaches VmsIfrInfoParams_OID_SWITCH_NIC_REQUEST, which is specialized to trace requests of type OID_SWITCH_NIC_REQUEST and their respective structure NDIS_SWITCH_NIC_OID_REQUEST (Figure 6).
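The flaw at the end of this call chain can be illustrated with a toy model. The sketch below is not vmswitch code: host memory is a small dictionary, HostCrash stands in for a bug check, and both tracing functions are hypothetical. The checked variant shows one possible validation strategy and is not a description of Microsoft's patch.

```python
class HostCrash(Exception):
    """Stands in for a bug check on the Hyper-V host."""


# Toy host memory: address -> OID of the request stored at that address.
# The address and OID value are placeholders chosen for illustration.
HOST_MEMORY = {0x1000: 0xFC01020C}


def dereference(address: int) -> int:
    """Read the inner OID at an address, crashing on unmapped memory."""
    try:
        return HOST_MEMORY[address]
    except KeyError:
        raise HostCrash(f"invalid read at {address:#x}") from None


def trace_unchecked(oid_request_ptr: int) -> str:
    """Mirror the bug: follow the pointer without validating it."""
    inner_oid = dereference(oid_request_ptr)
    return f"OID_SWITCH_NIC_REQUEST inner={inner_oid:#x}"


def trace_checked(oid_request_ptr: int, trusted_pointers: set) -> str:
    """Hypothetical fix: only follow pointers the host created itself."""
    if oid_request_ptr not in trusted_pointers:
        return "OID_SWITCH_NIC_REQUEST inner=<untrusted pointer, not traced>"
    inner_oid = dereference(oid_request_ptr)
    return f"OID_SWITCH_NIC_REQUEST inner={inner_oid:#x}"


# A host-built request traces normally.
line = trace_unchecked(0x1000)

# An attacker-chosen pointer takes the host down in the unchecked version.
try:
    trace_unchecked(0x4141414141414141)
except HostCrash as crash:
    outcome = str(crash)
```

The point of the model is that the crash happens in code whose only job is logging, before the request is ever acted on.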
Guest-to-host exploitation
netvsc — the networking virtual service consumer — does not send OID requests with OID_SWITCH_NIC_REQUEST. Nonetheless, a design flaw causes vmswitch to accept and process such a request even if it comes from a guest VM. This allowed us to trigger the arbitrary pointer dereference bug in the tracing mechanism by sending an RNDIS set message with OID_SWITCH_NIC_REQUEST directly from a guest VM.
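The design flaw can be pictured as a missing origin check. The sketch below is a hypothetical filter, assuming the host can tell whether a set request arrived from a guest over VMBus or was built by the host itself. The Origin enum, the HOST_ONLY_OIDS set, and the numeric OID values are all illustrative assumptions, and the code does not describe how Microsoft fixed the issue.

```python
from enum import Enum

# Assumed numeric values, for illustration only; verify against ntddndis.h.
OID_SWITCH_NIC_REQUEST = 0x00010270
OID_802_3_MULTICAST_LIST = 0x01010103


class Origin(Enum):
    GUEST = "guest"  # arrived from a child partition over VMBus
    HOST = "host"    # built by the host while forwarding a request


# OIDs that only the host is expected to generate (hypothetical list).
HOST_ONLY_OIDS = frozenset({OID_SWITCH_NIC_REQUEST})


def accept_set_request(oid: int, origin: Origin) -> bool:
    """Reject host-internal OIDs when they come from a guest."""
    return not (origin is Origin.GUEST and oid in HOST_ONLY_OIDS)


# A guest may update its multicast list, but may not inject the
# encapsulation OID that is meant to be created by the host.
guest_multicast_ok = accept_set_request(OID_802_3_MULTICAST_LIST, Origin.GUEST)
guest_wrapper_ok = accept_set_request(OID_SWITCH_NIC_REQUEST, Origin.GUEST)
```

With a check of this kind in place, the unvalidated pointer in the tracing code would no longer be reachable from a guest, although validating the pointer itself would still be advisable.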
This ability can be the basis for two exploitation scenarios. If the OidRequest member contains an invalid pointer, the Hyper-V host will simply crash. A more potent option is to make the host’s kernel read from a memory-mapped device register. Reads from such registers can have device-specific side effects, which an attacker could leverage to achieve code execution. RCE on a Hyper-V host would enable attackers to do as they wish: read sensitive information, run malicious payloads with high privileges, and so forth.
Conclusion
What made this vulnerability so lethal was the combination of a hypervisor bug (an arbitrary pointer dereference) with a design flaw that allowed an overly permissive communication channel between the guest and the host.