May 2014 Archives

Survey of Xen and ARM Servers

Like many other people, we at prgmr.com have been watching the development of ARM servers with interest.  Unfortunately, everything generally available so far has either lacked ECC support or been geared towards NAS use, and none of it has been price competitive with multi-core Xeon servers.

In 2014 this is likely to change. The two main competitors are Applied Micro's x-gene and AMD's a1100 ARM opteron, codenamed Seattle. The first-generation x-gene will have 8 cores at 2.5GHz, followed later by a version with 16 cores at 3GHz, with a maximum of 256GiB of ECC ram. The a1100 will come with 4 or 8 cores at 2GHz and a maximum of 128GiB of ECC ram.  Both claim good compute/watt ratios, though this needs to be measured.  We also care about price competitiveness - while the MSRP for the processors is reasonable, the price may be driven up by limited availability.


However, without going into too many more hardware details, both appear suitable for running xen.  I decided to research how far along xen support is for each of these processors.


One major difference between the two platforms is the software used to boot them.  Applied Micro has opted for u-boot.  U-boot is the standard bootloader for ARM platforms right now, but is not exactly common in the server world.  AMD has opted to use UEFI, which is the standard replacement for BIOS these days.  It has generally not been used for ARM platforms.  


Another difference is run-time capabilities after boot.  U-boot disappears after Linux has been loaded, while UEFI provides runtime services.  Historically, ARM platforms have used a lot of non-discoverable hardware, meaning that the Linux kernel had to be hand-tailored to each platform it was going to run on.  More recently, ARM Linux has moved to using device tree definitions of the hardware, which are supposed to be defined independently of whichever OS is going to use them.  Like the initial ram disk, the device tree is typically supplied as an additional parameter to the bootloader.


But even with device tree, drivers are still highly customized between different SoCs.  UEFI should greatly reduce the number of drivers if it is implemented properly, but UEFI support is not upstreamed yet and to me it's not clear if the proposed patches are going to make it in or not.


While official support for arm64 (aarch64) is present in version 4.4.0, xen support for these servers is still under heavy development.  For example, while live migration has been demo'ed for ARM, it is not slated for the 4.4 release according to the Xen roadmap.


Perhaps because of bootloader choices, x-gene appears to be closer to a shipping product.  There are already instructions on how to boot xen on an x-gene based server and I did not pursue this further.


I decided to see how hard it might be to boot xen on the a1100 by looking at booting xen under UEFI.  What I did was very roundabout, and this should be used as a starting place rather than as step-by-step instructions. I only got as far as booting a dom0 kernel and did not take the time to compile the xen toolchain or boot a domU, though both of these have been done according to this xen.org blog post.


The a1100 development kit will be shipping with Fedora.  Fedora has documented their efforts porting to aarch64 and written a quick-start guide here that describes how to boot an emulated arm64 board with UEFI as the bootloader:


https://fedoraproject.org/wiki/Architectures/ARM/AArch64/QuickStart


I used this as a base for trying to boot xen under UEFI, though for reasons I'll go into later it might be better to try openSUSE (use aarch64-rootfs) or a debian variant. I did not try either.


UEFI can be used to either boot xen directly or to load grub2 which then loads xen.  The fedora image is using grub2, so I figured that this would probably be easiest to try.  


I mostly followed these instructions:


https://wiki.linaro.org/LEG/Engineering/Grub2/Xen_booting_on_FVP_Base_AEMv8A


Thank you Wei Fu!


The toolchain I actually used is 4.8-2013.0701, but the latest toolchain can be found at  http://releases.linaro.org/latest/components/toolchain/binaries/ - use "aarch64-linux-gnu".


For xen, I used the stable-4.4 branch and compiled using


make dist-xen XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CONFIG_EARLY_PRINT=fastmodel


The binary which goes onto the target can be found under dist/install/boot/ in the xen build directory.
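
Before starting the build it can save time to confirm that the cross compiler is actually reachable. The following is only a minimal sketch: it assumes the Linaro toolchain has been unpacked and its bin directory added to PATH, and the helper name check_toolchain is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: check that the cross compiler for a given prefix
# (for example "aarch64-linux-gnu-") can be found on PATH.
check_toolchain() {
    prefix="$1"
    if command -v "${prefix}gcc" >/dev/null 2>&1; then
        echo "found ${prefix}gcc"
    else
        echo "missing ${prefix}gcc: add the toolchain's bin directory to PATH" >&2
        return 1
    fi
}

# Usage, from the top of the xen source tree:
#   check_toolchain aarch64-linux-gnu- && \
#       make dist-xen XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CONFIG_EARLY_PRINT=fastmodel
```

The check only looks for the gcc binary; it does not verify the toolchain version.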


To put xen on the target, it's easiest to attach the image to a loop device (losetup /dev/loop0 img) and then use kpartx to break out the partitions (kpartx -av /dev/loop0).  For the fedora image, /dev/mapper/loop0p1 is the EFI partition, /dev/mapper/loop0p2 is the boot partition, and /dev/mapper/loop0p4 is the root file system.  The xen binary goes in the root directory of loop0p2, alongside the kernels already there.
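
The attach steps can be sketched as a small script. This is an illustration rather than what I actually ran: the image name disk.img and the mount point /mnt/boot are placeholders, and the script defaults to a dry run that only prints the commands, since the real ones need root.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to really attach the image.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Attach the image to a loop device, map its partitions, and mount the
# boot partition (p2 on the Fedora image). The mount point is an assumption.
attach_image() {
    img="$1"
    loopdev="${2:-/dev/loop0}"
    mnt="${3:-/mnt/boot}"
    run losetup "$loopdev" "$img"
    run kpartx -av "$loopdev"
    run mkdir -p "$mnt"
    run mount "/dev/mapper/$(basename "$loopdev")p2" "$mnt"
}

# Usage:
#   attach_image disk.img              # prints the four commands
#   DRY_RUN=0 attach_image disk.img    # as root: really attach and mount
```

Once the boot partition is mounted, the xen binary from dist/install/boot/ can be copied into the mount point.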


We also need to add a device tree.  I found that the device tree definition which was already there wouldn't boot xen because the timer definition was incomplete, so I downloaded the most up-to-date one at


https://raw.githubusercontent.com/torvalds/linux/master/arch/arm64/boot/dts/foundation-v8.dts


With xen and grub as they are right now I was not able to bring up the secondary CPUs, so in this file remove the nodes cpu@1 through cpu@3, leaving only cpu@0.  To compile it, install device-tree-compiler and run "dtc -O dtb -o foundation-v8.dtb foundation-v8.dts".  Copy the resulting file into the boot partition.
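
Removing the extra CPU nodes by hand is easy enough, but it can also be scripted. The sketch below is one possible way and not what I used: the function name strip_secondary_cpus is made up, and it assumes each node opens with "cpu@N {" on a single line, counting braces so that the whole node including its closing "};" is dropped.

```shell
#!/bin/sh
# Print a copy of a .dts file with the cpu@1, cpu@2 and cpu@3 nodes removed.
# Assumption: each node starts with "cpu@N {" on one line.
strip_secondary_cpus() {
    awk '
        # Start skipping at the opening line of a secondary CPU node.
        !skipping && /cpu@[1-3][ \t]*[{]/ { skipping = 1; depth = 0 }
        skipping {
            # Track brace depth so nested blocks are skipped as well.
            depth += gsub(/[{]/, "{")
            depth -= gsub(/[}]/, "}")
            if (depth <= 0) skipping = 0
            next
        }
        { print }
    ' "$1"
}

# Usage (the output file name is a placeholder):
#   strip_secondary_cpus foundation-v8.dts > foundation-v8-1cpu.dts
#   dtc -O dtb -o foundation-v8.dtb foundation-v8-1cpu.dts
```

It is worth diffing the result against the original file before compiling, since the script knows nothing about device tree syntax beyond braces.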


Then Linux needs to be compiled. I again used the instructions from the Linaro wiki article.  Copy the resulting Image to the boot partition.


Incidentally, this kernel is not the same as the Fedora kernel. Besides being a different version, the Fedora kernel has EFI support.  The EFI patches in the Fedora kernel appear to still be outside of mainline, and the one development branch I tried, uefi-for-3.16, didn't boot.  I did not try any other branches, nor did I try applying the patches to a kernel that does boot.


The version of grub2 which is installed on the fedora image does not include multiboot support, so it needs to be replaced.  Multiboot support is not upstreamed, so (mostly) follow the instructions at the same wiki page.  The one exception is I found the default.cfg there did not work; this is what I used:


set root=(hd0,gpt1)

set prefix=($root)/EFI/fedora/


Mount the EFI partition, loop0p1, instead of the boot partition. I copied the resulting grub_v8.efi over EFI/fedora/grubaa64.efi.  Then I changed grub.cfg to be the following:


set pager=1

set timeout=5

menuentry 'ARM64 xen' {

   search --no-floppy --fs-uuid --set=root  4aa7fe0f-1bdc-4f41-8193-9562d2e5363e

   multiboot /xen no-bootscrub console=dtuart conswitch=x dtuart=serial0 dom0_mem=512M dom0_max_vcpus=1 debug=y

   module /Image root=/dev/vda4 ro console=hvc0

   devicetree /foundation-v8.dtb

}
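
The UUID in the search line is specific to my image. A hedged sketch of generating the same grub.cfg for another image follows; the function name render_grub_cfg is made up, and I am assuming the UUID to use is that of the boot partition (loop0p2), since that is where xen, the kernel Image and the device tree were copied. Note that multiboot and all of its arguments form one logical line.

```shell
#!/bin/sh
# Print a grub.cfg like the one above for a given boot-partition UUID.
render_grub_cfg() {
    uuid="$1"
    cat <<EOF
set pager=1
set timeout=5
menuentry 'ARM64 xen' {
    search --no-floppy --fs-uuid --set=root $uuid
    multiboot /xen no-bootscrub console=dtuart conswitch=x dtuart=serial0 dom0_mem=512M dom0_max_vcpus=1 debug=y
    module /Image root=/dev/vda4 ro console=hvc0
    devicetree /foundation-v8.dtb
}
EOF
}

# Usage (as root, with the image's partitions mapped):
#   render_grub_cfg "$(blkid -s UUID -o value /dev/mapper/loop0p2)" > grub.cfg
```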


After this, undo the loop device setup by unmounting all the partitions, running "kpartx -dv /dev/loop0" and then "losetup -d /dev/loop0".  Use efi-aarch64.sh to boot the Foundation model and you should get as far as systemd crashing and burning.  I didn't try the same kernel without xen, so the crash could just be related to the kernel, but trying something other than Fedora would also be a good idea.
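
The teardown can be scripted so that no mounted partition is forgotten. This is a sketch under assumptions, not what I ran: the function names are made up, mount points are assumed to contain no spaces, and the mounts table is read from /proc/mounts unless another file is given.

```shell
#!/bin/sh
# Print the mount point of every mounted partition mapped from a loop
# device, reading a table in /proc/mounts format.
mapped_mounts() {
    loopname="$1"                    # e.g. loop0
    mounts="${2:-/proc/mounts}"
    awk -v pfx="/dev/mapper/${loopname}p" \
        'index($1, pfx) == 1 { print $2 }' "$mounts"
}

# Unmount everything found above, then remove the partition mappings and
# detach the loop device. Needs root.
teardown_image() {
    loopname="${1:-loop0}"
    for mnt in $(mapped_mounts "$loopname"); do
        umount "$mnt" || return 1
    done
    kpartx -dv "/dev/$loopname" && losetup -d "/dev/$loopname"
}

# Usage (as root):
#   teardown_image loop0
```

Stopping at the first failed umount avoids detaching a loop device that still has a file system in use.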


Assuming AMD honors the GPL and the a1100 ships with a device tree that defines as much as the foundation-v8 model does, it appears that a minimal boot of a dom0 is likely doable with a couple of days of effort.
