
Re: [Xen-users] Questions on hvmloader, direct kernel boot and simulated BIOS



On 2016-09-14 11:42, Brett Stahlman wrote:
On Wed, Sep 14, 2016 at 6:03 AM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
On Fri, Sep 9, 2016 at 10:54 PM, Brett Stahlman <brettstahlman@xxxxxxxxx> wrote:
Hello,
I have several questions on how xen handles booting HVM guests. Any
answers, insights or links to pertinent documentation would be greatly
appreciated...

IIUC, leaving the "kernel" and "ramdisk" options unset in your xl.cfg file
means you do *not* want "direct kernel boot": rather, you want the
boot to proceed using "simulated firmware", i.e., firmware code that
xen loads into guest memory as a single, contiguous binary blob. As I
understand it, xen actually loads the selected BIOS binary blob from
disk to a specific memory location in the guest (usually 0xF0000) and
then, after transitioning to real mode, jumps to the reset vector
0xFFFF0 (presumably in non-root "guest" mode) to initiate the boot
process. I haven't been able to find much documentation on this, but
I'm assuming that the simulated firmware entry point of 0xFFFF0
ensures that the HVM guest will perform POST and do the other things
the BIOS would normally do after reset, and that it will do all this
in "guest" (non-root) mode. In addition to POST, I'm assuming this
simulated BIOS code will attempt to load an MBR from one of the
virtual disks specified with the "disk" option in xl.cfg (unless the
BIOS selected is ovmf, in which case, I'm assuming the simulated ovmf
firmware will be looking for an EFI System Partition and a suitable
.efi file). Am I on the right track so far, or have I misunderstood
something fundamental?
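
For concreteness, here is roughly what I picture the load amounting to;
everything here is an illustrative assumption on my part, not Xen's
actual loader code:

    /* Illustrative sketch, not Xen's actual loader code: "loading the
     * BIOS" is conceptually just copying a blob into guest RAM so that
     * the reset vector (0xFFFF0) lands inside it. */
    #include <stdint.h>
    #include <string.h>

    #define BIOS_BASE    0xF0000u   /* legacy BIOS region (guest-physical) */
    #define RESET_VECTOR 0xFFFF0u   /* where CS:IP = F000:FFF0 points */

    static void load_bios_blob(uint8_t *guest_ram,
                               const uint8_t *blob, size_t len)
    {
        memcpy(guest_ram + BIOS_BASE, blob, len);
        /* The 16 bytes at RESET_VECTOR now typically hold the blob's
         * far jump into its real init code, just as on real flash. */
    }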

One of the things that's confusing me is "hvmloader". I see that this
is built as a standalone executable under "tools" in the xen source.
Looking at the inline assembly at the top of hvmloader.c, I see that
there's a call to hvmloader's main(), which contains a call to
bios_load(), presumably to perform the aforementioned load of a
firmware blob to 0xF0000. Following the return from main(), the
inline assembly transitions back to 16-bit real mode, and ultimately
jumps to the reset vector (0xFFFF0), presumably to execute the
BIOS/UEFI firmware blob that main() loaded. This makes sense,
but I'm missing some of the overall context: for instance, on an Intel
processor, will we be in non-root (guest) mode when hvmloader runs?
Who starts hvmloader and how? I've seen code in
tools/libxl/libxl_dom.c (specifically, the call to xc_dom_kernel_file()
in the function libxl__domain_firmware()), which appears to load it
into memory, but from there it gets a bit fuzzy... Are hypercalls used
to start up the guest DomU? Is hvmloader's _start label the
entry-point for each non-direct kernel boot HVM guest?
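
For reference, here is the shape I'm seeing at the top of hvmloader.c,
paraphrased from memory rather than quoted; the real file does
considerably more (GDT and stack setup, a relocated real-mode
trampoline, and so on):

    /* Paraphrased sketch; the literal source is the asm block at the
     * top of tools/firmware/hvmloader/hvmloader.c. */
    asm (
        "    .text                   \n"
        "    .globl _start           \n"
        "_start:                     \n" /* entered in 32-bit protected mode */
        "    cli                     \n"
        "    call main               \n" /* C code; among other things it
                                          * loads the BIOS blob */
        /* ...drop back to 16-bit real mode (clear CR0.PE, reload segments)... */
        "    ljmp $0xf000,$0xfff0    \n" /* i.e., jump to 0xFFFF0 */
        );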

Also, how does the VMX entry/exit logic in
xen/arch/x86/hvm/vmx/entry.S fit into the picture here? I'm assuming
that code is running only on the VMM in root mode, and that it somehow
includes a mechanism for switching between the various guests.

You're mostly on the right track.  A couple of points:

* When doing a direct boot, you don't start at a fixed location (as
you would on reset in real hardware); the domain builder running in
dom0 has to read the ELF data structures and find the appropriate
entry point, and I believe it also starts the guest in the right
paging mode (i.e., not real mode). (A sketch of the entry-point
lookup follows after these points.)

* hvmloader runs in guest (non-root) mode.  In current releases of Xen
it has SeaBIOS or OVMF baked into it (i.e., a single binary contains
both hvmloader and the BIOS).  We're working on changing this so that
the domain builder will load up both hvmloader and the appropriate
BIOS.
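
To illustrate the first point: at its core, "find the entry point"
means reading e_entry from the ELF header. A minimal sketch follows;
the real domain builder (the xc_dom_* code) does far more, e.g.
PT_LOAD mapping and parsing the Xen ELF notes:

    /* Minimal sketch of "find the entry point", not the real domain
     * builder. Assumes a 64-bit little-endian ELF image already in
     * memory. */
    #include <elf.h>
    #include <stdint.h>

    static uint64_t elf_entry_point(const void *image)
    {
        const Elf64_Ehdr *eh = image;
        /* e_entry is the virtual address at which to begin execution. */
        return eh->e_entry;
    }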
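And on the "baked into it" point: embedding a firmware blob into a
binary at build time is typically done with an .incbin section, along
these lines (a generic sketch, not the actual Xen build recipe; the
symbol names and "bios.bin" are made up):

    /* Generic blob-embedding trick; illustrative, not Xen's build.
     * "bios.bin" and the symbol names are hypothetical. */
    asm (
        "    .section .bios_blob, \"a\"  \n"
        "    .globl bios_blob_start      \n"
        "bios_blob_start:                \n"
        "    .incbin \"bios.bin\"        \n"
        "    .globl bios_blob_end        \n"
        "bios_blob_end:                  \n"
        );
    extern const unsigned char bios_blob_start[], bios_blob_end[];
    /* bios_load() can then copy [bios_blob_start, bios_blob_end) to
     * the blob's load address in guest memory. */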

Understood. Just a couple points of clarification...

1. When the bios_load() function copies the binary firmware blob
(e.g., seabios, ovmf, or whatever) to 0xF0000 and the inline asm at
the top of hvmloader.c jumps to the reset vector (0xFFFF0), it's not
actually overwriting real firmware (which would presumably be stored
in flash ROM), or jumping to the real reset vector, because the
guest's physical address space is itself remapped: the second-level
(p2m/EPT) tables send the range 0xF0000-0x100000 to a completely
arbitrary location in host memory. Is this correct?
Yes.
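
To make that concrete, here is a toy model of the mapping; the names
are illustrative only, not Xen's p2m implementation:

    /* Toy model, not Xen's p2m code: each guest page, including the
     * "BIOS" pages, maps through a gfn -> mfn table like any other
     * RAM page. */
    #include <stdint.h>

    #define PAGE_SHIFT 12
    static uint64_t p2m[256];   /* toy table covering the first 1 MiB */

    static uint64_t guest_to_host(uint64_t gpa)
    {
        return (p2m[gpa >> PAGE_SHIFT] << PAGE_SHIFT) | (gpa & 0xFFFu);
    }
    /* guest_to_host(0xF0000) resolves to whatever host frame the
     * hypervisor assigned; it has no relation to the host's physical
     * 0xF0000, so the host's real flash is never read or written. */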

2. I understand the need for ovmf for a UEFI-booted guest on a
BIOS-booted host, but is the seabios firmware blob strictly necessary
in the case of a BIOS-booted host running a BIOS-booted guest? I mean,
what would happen if, in lieu of the firmware copy, xen set up a "flat
mapping" in the guest (i.e., virtual=physical) for the firmware range
0xF0000-0x100000? That is, could it not simply use the actual host
firmware? Or are there some writable locations in that range that
would preclude this possibility?
There are three reasons; writable segments in the BIOS area are only one of them, and actually the least important. The other two stem from the fact that you're booting a VM, not whatever system your host is running on.

First, the BIOS contains a significant amount of data describing the platform (ACPI, DMI, and SMBIOS tables, plus, on servers, structures for IPMI and other out-of-band management, and quite a few other things), and most if not all of it would be blatantly wrong for a VM.

Second, the BIOS is responsible for basic initialization of most of the system, and most of what it does is highly platform-specific, and thus also generally invalid in a VM.

So in essence, it boils down to the host system's BIOS being built for the host system, not for a VM. In theory, if you're booting a VM inside a VM with the _exact_ same hardware configuration, this could probably be done without issue (almost nothing touches the BIOS area anyway), but even that is unlikely to work, because of how the mapping would need to be set up.
