Today the Xen Project publicly released four Xen Security Advisories.

  • CVE-2016-7154/XSA-188 affects only Xen 4.4. We do not have any customer systems running this version.
  • CVE-2016-7094/XSA-187 affects HVM virtualization, which no customers were using at the time, though we have applied the patch anyway. It also did not apply because we use hardware-assisted paging (HAP) rather than shadow page tables.
  • CVE-2016-7093/XSA-186 also affects HVM virtualization, which no customers were using at the time, and in any case it does not apply to the Xen version we are running.
  • CVE-2016-7092/XSA-185 was the most critical for us. It is a privilege escalation bug exposed only to 32-bit paravirtualized (PV) guests. For each affected guest, we either patched the host server or moved the guest from an unpatched host to a patched one. 32-bit bootloaders have also been removed from host servers that have not yet been patched.

We had already prevented guests on existing systems from switching to 32-bit mode back in July. Under Xen, 32-bit PV guests can only boot from the first 168GB of RAM, so on host servers with more RAM than that, a guest switching from 64-bit to 32-bit mode could leave RAM free but unusable.

Unfortunately, we missed our goal of having no vulnerable guests by the time the embargo lifted. Two guests booting 32-bit grub2 were left running on unpatched hosts until about 8 hours after the embargo lifted. I noticed while reviewing the output of xenstore-ls, which lists the live boot parameters of guests. They were moved to patched hosts, and the affected customers were given a full month's credit for the downtime. We have no reason to believe either of these guests exercised the exploit during those 8 hours.
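
For the curious, the check amounts to scanning xenstore for guests whose recorded boot parameters indicate a 32-bit loader. Here is a minimal sketch in Python; the `/vm/<uuid>/image` path and the "32" substring match are assumptions about our provisioning layout, not a standard Xen interface:

```python
#!/usr/bin/env python3
# Flag live guests whose recorded boot parameters look 32-bit.
# Sketch only: the xenstore path and the "32" marker are assumptions.
import subprocess

# xenstore-ls -f prints every node as one full "path = value" line.
tree = subprocess.run(["xenstore-ls", "-f"],
                      capture_output=True, text=True, check=True).stdout

for line in tree.splitlines():
    # /vm/<uuid>/image/* records boot parameters at guest creation
    # (assumed layout); anything mentioning a 32-bit loader is flagged.
    if "/image/" in line and "32" in line:
        print("possible 32-bit guest:", line.strip())
```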

There were at least two reasons this happened. First, the query for 32- vs 64-bit guests was copied from our old initial provisioning code (which defaulted to pv-grub) and wasn't adjusted to account for grub2; this was a mistake of inattention. Second, guests not explicitly identified as 32-bit were assumed to be 64-bit, when there should have been explicit checks for both; explicitly calling out both would have left some guests unclassified and exposed the error (a sketch follows below). I considered doing that, but didn't, because it would have meant extra delay and we were already very late sending out downtime notices compared to previous reboots.
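
A sketch of what the explicit check should have looked like (the bootloader names are hypothetical; the point is that anything not positively identified fails loudly instead of being assumed 64-bit):

```python
# Hypothetical bootloader names; the real values live in our
# provisioning database.
KNOWN_32 = {"pv-grub-x86_32", "grub2-x86_32"}
KNOWN_64 = {"pv-grub-x86_64", "grub2-x86_64"}

def classify_bitness(bootloader: str) -> str:
    """Return "32" or "64", refusing to guess for unknown values."""
    if bootloader in KNOWN_32:
        return "32"
    if bootloader in KNOWN_64:
        return "64"
    # An unrecognized bootloader surfaces immediately rather than
    # being silently treated as 64-bit and left on an unpatched host.
    raise ValueError(f"unclassified bootloader: {bootloader!r}")
```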

We sent downtime notices two days after first learning of the vulnerability. Because we were doing so many moves this time, some notices went out only 3 days in advance; everyone had the option to reschedule to a different time. The whole-server reboots all finished within the originally scheduled time frame, but a few of the moves ran over.

We also brought up our customer cluster running an internal Ganeti fork, and most of the guests we moved were imported into it. We are still finding bugs and a couple of items remain unfinished, but in general it seems to be working.

We also started defaulting Linux VMs to HVM mode, which is typically faster for 64-bit operating systems. HVM guests are segregated from PV guests so that PV guests do not have to be rebooted for HVM-only XSAs and vice versa.

For a long time we were reluctant to run HVM mode because we did not want to run QEMU in the context of the dom0 (the host server). Instead, we are using device model stub domains, which run QEMU inside a PV guest rather than in the dom0; a sketch of the relevant configuration follows. Most Linux guests use PV-on-HVM drivers and stop using QEMU's emulated devices after boot. NetBSD, however, does not have PV-on-HVM support, and its performance in HVM mode with device model stub domains was abysmal because the virtual hard drive fell back to ATA PIO mode, which is very slow. Because of this we are still defaulting NetBSD guests to PV mode.
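
For reference, the relevant knobs in an xl guest configuration look roughly like this (a sketch using Xen 4.x xl.cfg options, not our exact configuration; note that stub domains require the traditional QEMU device model):

```
builder = "hvm"
# Run QEMU in its own PV stub domain rather than in the dom0.
device_model_stubdomain_override = 1
# Stub domains only work with the traditional device model.
device_model_version = "qemu-xen-traditional"
```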