What’s a 10? Pwning vCenter with CVE-2020-3952
Last Thursday, VMware published a security advisory for CVE-2020-3952, describing a “sensitive information disclosure vulnerability in the VMware Directory Service (vmdir)”. It’s a pretty terse advisory, and it doesn’t go into much more detail than that, besides stating that any vCenter Server v6.7 that has been upgraded from a previous version is vulnerable.
What’s striking about this advisory is that the vulnerability got a CVSS score of 10.0 — as high as this score can go. Despite the amount of press the advisory got, though, we couldn’t find anything written about the technical details of the vulnerability. We wanted to get a better understanding of its risks and to see how an attacker could exploit them, so we started investigating the changes in VMware’s recommended patch — vCenter Appliance 6.7 Update 3f.
By combing through the changes made to the vCenter Directory service, we reconstructed the faulty code flow that led to this vulnerability. Our analysis showed that with three simple unauthenticated LDAP commands, an attacker with nothing more than network access to the vCenter Directory Service can add an administrator account to the vCenter Directory. We were able to implement a proof of concept for this exploit that enacts a remote takeover of the entire vSphere deployment.
TL;DR
The vulnerability is enabled by two critical issues in vmdir’s legacy LDAP handling code:
- A bug in a function named VmDirLegacyAccessCheck which causes it to return “access granted” when permissions checks fail.
- A security design flaw which grants root privileges to an LDAP session with no token, under the assumption that it is an internal operation.
Looking into the patch
Since VMware releases its new versions as whole disk images rather than incremental patches, we had to diff between the previous version — Update 3e — and the new one. Mounting the disk images revealed that these releases are made up of a long list of RPMs, for the most part. Once we had extracted the contents of all of these packages, we could see which files had actually changed by comparing hashes side by side.
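The hash comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming both releases have already been extracted into two directory trees; the function names and the optional substring filter are hypothetical and not necessarily the tooling used for this research.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            # Reads the whole file into memory; fine for a sketch.
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(old_root: Path, new_root: Path, name_filter: str = "") -> list[str]:
    """Return relative paths that differ in content or exist on only one side."""
    old, new = hash_tree(old_root), hash_tree(new_root)
    changed = [p for p in sorted(old.keys() | new.keys()) if old.get(p) != new.get(p)]
    # An empty filter matches every path.
    return [p for p in changed if name_filter in p]
```

Calling `changed_files(Path("old_release"), Path("new_release"), "vmdir")` on two extracted trees (the directory names here are placeholders) would list only the changed files with "vmdir" in their path.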
Unfortunately, it turned out that nearly 1,500 files had changed since the last release, far more than we could check by hand. We guessed that our culprit would probably have “vmdir” somewhere in its name. Sure enough, this cut down the results to a much more manageable list:
So what we have is a list of statically linked libraries that are (presumably) built into a single compiled binary: vmdird. In other words, the vmdir server has changed since Update 3e. Looks promising!
Before doing a proper binary diff, we figured we’d see if there were any obvious changes made to exported symbols in vmdird. The results of this comparison were striking:
Nothing like a function named VmDirLegacyAccessCheck for vulnerability research! This seems like an especially good place to start, since VMware writes that “affected deployments will create a log entry when the vmdir service starts stating that legacy ACL mode is enabled.”
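A symbol comparison like the one mentioned above can be sketched as follows. This assumes the symbol tables were dumped to text in the three-column format printed by `nm` (address, type, name); the article does not say which tool was used, and the function names below are hypothetical.

```python
def exported_symbols(nm_output: str) -> set[str]:
    """Collect defined symbol names from nm-style text output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined ones ("U name") have only two and are skipped.
        if len(fields) == 3:
            _, sym_type, name = fields
            if sym_type.upper() != "U":
                # Drop any version suffix such as "@@GLIBC_2.2.5".
                symbols.add(name.split("@")[0])
    return symbols


def diff_symbols(old_nm: str, new_nm: str) -> tuple[list[str], list[str]]:
    """Return (symbols only in the old binary, symbols only in the new binary)."""
    old, new = exported_symbols(old_nm), exported_symbols(new_nm)
    return sorted(old - new), sorted(new - old)
```

Feeding it the output of `nm` for the Update 3e and Update 3f builds of vmdird would give the two lists of removed and added symbols. A name-level diff only catches functions that appear or disappear, so it is a quick first pass rather than a substitute for a binary diff.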
We laid out the disassembly of these functions in IDA. Here’s the unpatched version. We’ve highlighted anything that can change the function’s return value.
And this is the patched one:
In the patched version, VmDirLegacyAccessCheck returns 9207 (VMDIR_ERROR_INSUFFICIENT_ACCESS) if none of the conditions are met. Looking for this return value, which didn’t exist in the previous version of the function, led us to a GitHub project under the name Lightwave. As it turns out, vmdir’s code has been made available by VMware on their GitHub repository.
To the source code we go
We were happy to discover the source code of VmDirLegacyAccessCheck in VMware’s repository. Not only that, but the code there matches the newly patched version of the function. Looking for when this fix was introduced led us to a commit dated August 2017 (!) with the following message:
So at least one developer at VMware was aware that there’s something wrong here — before the fix, a legacy-mode access “has more permission than desired.”
Before the fix, the return value of VmDirLegacyAccessCheck held a success value by default. Failing the permissions check by _VmDirAllowOperationBasedOnGroupMembership left the return value unchanged at 0 (VMDIR_SUCCESS), eventually granting access to the operation.
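To make the difference concrete, here is a simplified Python model of the two behaviours described above. It is not a translation of the C source: the single boolean parameter stands in for the outcome of the real permission checks, and only the two return codes named in this article are taken from vmdir.

```python
VMDIR_SUCCESS = 0
VMDIR_ERROR_INSUFFICIENT_ACCESS = 9207


def legacy_access_check_unpatched(group_check_passed: bool) -> int:
    """Model of the old behaviour: the result defaults to success."""
    result = VMDIR_SUCCESS
    if group_check_passed:
        result = VMDIR_SUCCESS
    # A failed check never overwrites result, so it is still VMDIR_SUCCESS.
    return result


def legacy_access_check_patched(group_check_passed: bool) -> int:
    """Model of the fixed behaviour: deny unless a check explicitly passes."""
    result = VMDIR_ERROR_INSUFFICIENT_ACCESS
    if group_check_passed:
        result = VMDIR_SUCCESS
    return result
```

In this model the unpatched function returns 0 for every input, while the patched one returns 9207 whenever the check fails. The fix amounts to flipping the default from allow to deny.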
We now have a function that seems vulnerable. Let’s find out when it’s called, and how we can take advantage of it.
Obtaining a vulnerable vCenter
We only had a clean installation of vCenter Server 6.7, not one upgraded from a previous release line (6.5 or 6.0). According to VMware, on vulnerable systems, you can find a certain log line under /var/log/vmware/vmdird/vmdird-syslog.log (or %ALLUSERSPROFILE%\VMWare\vCenterServer\logs\vmdird\vmdir.log on Windows):
Since our vCenter Server was not vulnerable, the log file was missing this line. Looking for the code that prints this log line led us to a function named _VmDirIsLegacyACLMode:
The code implies that there’s a key-value store somewhere that should hold the key “acl-mode” with the value “enabled” (for non-legacy mode) or “disabled” (for legacy mode). Sure enough, “acl-modeenabled” showed up a number of times in our (patched) vmdir database file, /storage/db/vmware-vmdir/data.mdb. Changing the “enabled” part of this string to anything else (“disabled” would change the string’s size, so we didn’t go with that) and restarting vmdir made the desired log line show up in vmdird-syslog.log.
This explains why only upgraded vCenter Server 6.7 machines are vulnerable to this attack, and not clean installs of this version. The vmdird binary contains the same vulnerable code in both cases; what differs is the ACL mode configuration. Clean installations default to non-legacy mode (acl-mode is enabled), but upgrades preserve the previous configuration, where legacy mode is enabled by default.
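The mode decision we observed can be summed up in a tiny Python model. It reflects only what our experiment showed (any value other than “enabled” put vmdir into legacy mode); the function name is hypothetical, and details such as case sensitivity are assumptions rather than facts about the C code.

```python
def is_legacy_acl_mode(acl_mode: str) -> bool:
    """Model: only the exact value "enabled" selects non-legacy ACL handling."""
    return acl_mode != "enabled"


# Values as discussed in the text; "enablex" is a made-up same-length edit.
for value in ("enabled", "disabled", "enablex"):
    print(f"acl-mode={value!r} -> legacy mode: {is_legacy_acl_mode(value)}")
```

The point of the model is that the server does not need to see “disabled” to fall back to the legacy checks, which is why a same-length edit of the stored value was enough to reproduce an upgraded system.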
We now have a vulnerable machine. But what is it vulnerable to?
Exploitation
At this point, we need to find out how to trigger the code flow that ends up in the vulnerable function VmDirLegacyAccessCheck.
As we can see in the call graph, add, modify, and search requests can all go through VmDirLegacyAccessCheck.
First attempts
We installed ldap-utils and tried to add a user to the vCenter machine using incorrect credentials:
That didn’t get very far. Let’s see what the vmdird log has to say:
It seems like we never actually reached the “add” part of this request. ldapadd first needs to bind to the server before it can run any commands against it, but the binding fails with error 9234 — VMDIR_ERROR_USER_INVALID_CREDENTIAL. Is there a way to skip the bind stage?
We installed python-ldap and tried doing it ourselves:
Another no-go. Here’s the matching log from the vCenter server:
Bind/authenticate time
Looking for the error message “Not bind/authenticate yet” inside the code leads us to the function VmDirMLAdd.
As the code shows, two conditions must hold in order for the client to be able to add an entry:
- The LDAP session must not be anonymous, namely, it has to specify a domain;
- The session should not have “failed access info”.
Let’s start with passing the first condition. For that, we need bIsAnonymousBind to be FALSE. The only code that sets this variable to FALSE is in VmDirMLBind:
Notice that bIsAnonymousBind is assigned FALSE whether or not VmDirInternalBindEntry succeeds. In other words, even if we fail our bind authentication, we’ll pass the first part of the condition.
Now for the second part of that condition. What does VmDirIsFailedAccessInfo do? Surprisingly, not much:
In order to reach the user addition flow, we need to make it return FALSE somehow. Let’s take a look at the first way out — checking for a NULL access token.
It seems strange that a function that checks whether to grant access would specifically allow a user without an access token. From the brief comment below the check, it looks like this case was intended for “internal operations”. Presumably an LDAP operation launched internally by vmdird would leave pAccessToken empty to mark that it should be allowed through, and any other access would fail at the bind stage earlier. This is a strange way to do this; it would be much clearer to make a designated pAccessInfo->bIsInternalOperation field for this purpose.
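The design point is easier to see side by side. Below is a rough Python model, not vmdir’s code: `token_checks_ok` is a stand-in for whatever the function verifies once a token exists, and `is_internal_operation` is the hypothetical explicit flag suggested above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessInfo:
    access_token: Optional[object] = None
    token_checks_ok: bool = False        # stand-in for the remaining checks
    is_internal_operation: bool = False  # hypothetical explicit marker


def is_failed_access_info_implicit(info: AccessInfo) -> bool:
    """Model of the described behaviour: no token means 'internal', so not failed."""
    if info.access_token is None:
        return False
    return not info.token_checks_ok


def is_failed_access_info_explicit(info: AccessInfo) -> bool:
    """Alternative: internal operations must say so; a missing token is a failure."""
    if info.is_internal_operation:
        return False
    if info.access_token is None:
        return True
    return not info.token_checks_ok
```

With a default `AccessInfo()` (no token, not flagged as internal), the implicit version reports “not failed” while the explicit version reports a failure. An explicit flag keeps “internal” and “never authenticated” from looking identical.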
When binding fails, pAccessInfo->pAccessToken is left empty. Here’s VmDirInternalBindEntry, which is called by VmDirMLBind from vmdird’s message loop.
Our incorrect credentials fail all the way up at VmDirNormalizeDN. This takes us to the error flow, which cleans out pOperation->conn->AccessInfo->pAccessToken.
Let’s go back to our double condition:
Both parts of the condition now hold.
So we can’t just skip binding and expect things to work, but it does seem like even a failed bind attempt will take us through this check.
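The state a failed bind leaves behind can be modeled in Python as follows. This is an illustrative simulation under our reading of the flow, not vmdir’s implementation: the class, the function names, the in-memory credential dictionary and the token format are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    is_anonymous_bind: bool = True
    access_token: Optional[str] = None


def handle_bind(session: Session, dn: str, password: str, directory: dict) -> bool:
    """Model of the bind handler: the flag flips before the outcome is known."""
    session.is_anonymous_bind = False
    if directory.get(dn) != password:
        session.access_token = None  # the error flow clears the token
        return False
    session.access_token = "token:" + dn
    return True


def is_failed_access_info(session: Session) -> bool:
    if session.access_token is None:
        return False  # read as an internal operation
    # Stand-in for the real token checks, which are not modeled here.
    return not session.access_token.startswith("token:")


def add_request_passes_gate(session: Session) -> bool:
    """Model of the double condition guarding the add operation."""
    return not session.is_anonymous_bind and not is_failed_access_info(session)
```

In this model a session that never binds is stopped at the gate, while a session whose bind was rejected passes it, exactly like one that authenticated correctly.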
The part where everything comes together
Where does all of this get us, though? We’re finally reaching our buggy VmDirLegacyAccessCheck. Before performing the add operation, VmDirInternalAddEntry calls VmDirSrvAccessCheck which in turn calls VmDirLegacyAccessCheck.
In theory we should have failed to reach this flow long ago; VmDirLegacyAccessCheck is the last line of defense. Its job is to check whether this particular type of access (adding or modifying an LDAP entry) should be allowed for this particular user. The authentication check shouldn’t have allowed us to get here in the first place, but you would still expect this check to prevent us from moving onwards.
Remember this, though?
That looks like the last link we need in our chain. If VmDirLegacyAccessCheck always lets us through, the access check should succeed, and our user should be added.
What happens, then, if we ignore the result from bind?
Huh. No output for this in /var/log/vmware/vmdird/vmdird-syslog.log. Can we see this user with a search request?
No kidding. What happens if we try to connect to vSphere with this new user?
Where does one get “permissions on any vCenter Server system connected to this client”? Let’s add our Hacker user to the Administrator group with the same unauthenticated connection:
Let’s try the login again:
We’re in!
Implementation
We put together an exploitation script that runs all of these stages so you can try it yourself. Check out our GitHub repository over here.
Mitigation – patch and segment
The most effective measure for mitigating the above-demonstrated risk is to install the latest patch for the vulnerable version of vCenter Server. Alternatively, installing the latest version (7.0) will also result in a secure vSphere deployment.
We highly recommend limiting access to vCenter’s LDAP interface. In practice, this means blocking any access over the LDAP port (389) except for administrative use.
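One quick way to verify a segmentation rule is to test, from a machine that should not have access, whether the LDAP port still accepts TCP connections. The helper below is a minimal sketch using only Python’s standard library; the function name is ours, and it checks reachability only, not whether the server is vulnerable.

```python
import socket


def is_tcp_port_open(host: str, port: int = 389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running `is_tcp_port_open("vcenter.example.internal")` (a placeholder hostname) from a non-administrative network segment should return False once the block is in place.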
If you have any questions regarding how to segment your network to avoid this attack and others, don’t hesitate to contact us.
Some thoughts
This was some rabbit hole. Despite the relative clarity of VMware’s code, it looks like there were quite a few missteps that went into the vulnerability. The developers were at least partially aware of them, too, as we saw in the code comments and commit messages. The fix to VmDirLegacyAccessCheck is no more than a band-aid. Had VMware looked into this bug in depth, they would have found a series of issues that need to be addressed: the strange semantics of bIsAnonymousBind, the disastrous handling of pAccessToken, and, of course, the bug we started from, in VmDirLegacyAccessCheck.
Perhaps the most distressing thing, though, is the fact that the bugfix to VmDirLegacyAccessCheck was written nearly three years ago, and is only being released now. Three years is a long time for something as critical as an LDAP privilege escalation not to make it into the release schedule — especially when it turns out to be much more than a privilege escalation.
We hope this was an enjoyable read. We believe there are still quite a few research leads left open here — check out Project Lightwave. Happy hunting!