Re: [Xen-devel] RFC: very initial PVH design document
On Tue, Aug 26, 2014 at 05:33:21PM -0700, Mukesh Rathor wrote:
> On Fri, 22 Aug 2014 16:55:08 +0200
> Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> > Hello,
> >
> > I've started writing a document in order to describe the interface
> > exposed by Xen to PVH guests, and how it should be used (by guest
> > OSes). The document is far from complete (see the amount of TODOs
> > scattered around), but given the lack of documentation regarding PVH
> > I think it's a good starting point. The aim of this is that it should
> > be committed to the Xen repository once it's ready. Given that this
> > is still a *very* early version I'm not even posting it as a patch.
> >
> > Please comment, and try to fill the holes if possible ;).
> >
> > Roger.
> >
> > ---
> > # PVH Specification #
> >
> > ## Rationale ##
> >
> > PVH is a new kind of guest that has been introduced on Xen 4.4 as a
> > DomU, and on Xen 4.5 as a Dom0. The aim of PVH is to make use of the
> > hardware virtualization extensions present in modern x86 CPUs in
> > order to improve performance.
> >
> > PVH is considered a mix between PV and HVM, and can be seen as a PV
> > guest that runs inside of an HVM container, or as a PVHVM guest
> > without any emulated devices. The design goal of PVH is to provide
> > the best performance possible and to reduce the amount of
> > modifications needed for a guest OS to run in this mode (compared to
> > pure PV).
> >
> > This document tries to describe the interfaces used by PVH guests,
> > focusing on how an OS should make use of them in order to support PVH.
> >
> > ## Early boot ##
> >
> > PVH guests use the PV boot mechanism, that means that the kernel is
> > loaded and directly launched by Xen (by jumping into the entry
> > point). In order to do this Xen ELF Notes need to be added to the
> > guest kernel, so that they contain the information needed by Xen.
> > Here is an example of the ELF Notes added to the FreeBSD amd64 kernel
> > in order to boot as PVH:
> >
> >     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz, "FreeBSD")
> >     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz, __XSTRING(__FreeBSD_version))
> >     ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz, "xen-3.0")
> >     ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE,      .quad,  KERNBASE)
> >     ELFNOTE(Xen, XEN_ELFNOTE_PADDR_OFFSET,   .quad,  KERNBASE)
> >     ELFNOTE(Xen, XEN_ELFNOTE_ENTRY,          .quad,  xen_start)
> >     ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .quad,  hypercall_page)
> >     ELFNOTE(Xen, XEN_ELFNOTE_HV_START_LOW,   .quad,  HYPERVISOR_VIRT_START)
> >     ELFNOTE(Xen, XEN_ELFNOTE_FEATURES,       .asciz,
> >         "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel|hvm_callback_vector")
> >     ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE,       .asciz, "yes")
> >     ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID,   .long,  PG_V, PG_V)
> >     ELFNOTE(Xen, XEN_ELFNOTE_LOADER,         .asciz, "generic")
> >     ELFNOTE(Xen, XEN_ELFNOTE_SUSPEND_CANCEL, .long,  0)
> >     ELFNOTE(Xen, XEN_ELFNOTE_BSD_SYMTAB,     .asciz, "yes")
>
> It will be helpful to add:
>
> On the linux side, the above can be found in arch/x86/xen/xen-head.S.
>
> > It is important to highlight the following notes:
> >
> >   * XEN_ELFNOTE_ENTRY: contains the memory address of the kernel
> >     entry point.
> >   * XEN_ELFNOTE_HYPERCALL_PAGE: contains the memory address of the
> >     hypercall page inside of the guest kernel (this memory region will be
> >     filled by Xen prior to booting).
> >   * XEN_ELFNOTE_FEATURES: contains the list of features supported by
> >     the kernel. In this case the kernel is only able to boot as a PVH
> >     guest, but those options can be mixed with the ones used by pure PV
> >     guests in order to have a kernel that supports both PV and PVH (like
> >     Linux). The list of options available can be found in the
> >     `features.h` public header.
>
> Hmm... for linux I'd word that as follows:
>
> A PVH guest is started by specifying pvh=1 in the config file.
> However, for the guest to be launched as a PVH guest, it must minimally
> advertise certain features which are: auto_translated_physmap,
> hvm_callback_vector, writable_descriptor_tables, and
> supervisor_mode_kernel. This is done via XEN_ELFNOTE_FEATURES and
> XEN_ELFNOTE_SUPPORTED_FEATURES. See linux:arch/x86/xen/xen-head.S for
> more info. A list of all xen features can be found in
> xen:include/public/features.h. However, at present the absence of these
> features does not make it automatically boot in PV mode, but that may
> change in future. The ultimate goal is, if a guest supports these
> features, then boot it automatically in PVH mode, otherwise boot it in
> PV mode.
>
> [You can leave out the last part if you want, or just take whatever from
> above].
>
> > Xen will jump into the kernel entry point defined in
> > `XEN_ELFNOTE_ENTRY` with paging enabled (either long or protected
> > mode depending on the kernel bitness) and some basic page tables
> > setup.
>
> If I may rephrase:
>
> Guest is launched at the entry point specified in XEN_ELFNOTE_ENTRY
> with paging, PAE, and long mode enabled. At present only 64bit mode
> is supported, however, in future compat mode support will be added.
> An important distinction for a 64bit PVH is that it is launched at
> privilege level 0 as opposed to a 64bit PV guest which is launched at
> privilege level 3.
>
> > Also, the `rsi` (`esi` on 32bits) register is going to contain the
> > virtual memory address where Xen has placed the start_info structure.
> > The `rsp` (`esp` on 32bits) will contain a stack, that can be used by
> > the guest kernel. The start_info structure contains all the info the
> > guest needs in order to initialize. More information about the
> > contents can be found on the `xen.h` public header.
>
> Since the above is all true for PV guest, you could begin it with:
>
> Just like a PV guest, the rsi ....
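
As a rough illustration of the minimal feature set discussed above, the following C sketch checks whether a `|`-separated feature string (the form used in the `XEN_ELFNOTE_FEATURES` example earlier in the thread) names all four features listed as required for PVH. The helper names `has_feature` and `advertises_pvh` are hypothetical and are not taken from Xen, Linux or FreeBSD; the sketch also assumes plain feature names and does not handle any prefix characters a real feature string might carry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Features the thread lists as the minimum a kernel must advertise for PVH. */
static const char *const pvh_required[] = {
    "auto_translated_physmap",
    "hvm_callback_vector",
    "writable_descriptor_tables",
    "supervisor_mode_kernel",
};

/* Hypothetical helper: true if `name` is a whole '|'-separated token
 * of `features` (substring matches do not count). */
static bool has_feature(const char *features, const char *name)
{
    size_t len = strlen(name);
    const char *p = features;

    while (*p != '\0') {
        size_t tok = strcspn(p, "|");   /* length of the current token */

        if (tok == len && strncmp(p, name, len) == 0)
            return true;
        p += tok;
        if (*p == '|')
            p++;                        /* skip the separator */
    }
    return false;
}

/* Hypothetical helper: true only if every required PVH feature is present. */
static bool advertises_pvh(const char *features)
{
    size_t n = sizeof(pvh_required) / sizeof(pvh_required[0]);

    for (size_t i = 0; i < n; i++)
        if (!has_feature(features, pvh_required[i]))
            return false;
    return true;
}
```

Passing the FreeBSD feature string quoted earlier to `advertises_pvh` should return true, while a string missing any one of the four names should return false.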
> >
> > ### Initial amd64 control registers values ###
> >
> > Initial values for the control registers are set up by Xen before
> > booting the guest kernel. The guest kernel can expect to find the
> > following features enabled by Xen.
> >
> > On `CR0` the following bits are set by Xen:
> >
> >   * PE (bit 0): protected mode enable.
> >   * ET (bit 4): 80387 external math coprocessor.
> >   * PG (bit 31): paging enabled.
> >
> > On `CR4` the following bits are set by Xen:
> >
> >   * PAE (bit 5): PAE enabled.
> >
> > And finally on `EFER` the following features are enabled:
> >
> >   * LME (bit 8): Long mode enable.
> >   * LMA (bit 10): Long mode active.
> >
> > *TODO*: do we expect these flags to change? Are there other flags that
> > might be enabled depending on the hardware we are running on?
>
> Can't think of anything...

What about the initial segments (ES, DS, FS, GS)? We boot with Xen
provided ones and need to swap over from them - so that means the DS
and CS are initially set to Xen ones.

And we should probably mention that when the OS switches from Xen ones
it MUST jump to a CS with CS.L = 1 set, otherwise bad things happen.

We should probably mention that MSR_FS_BASE, MSR_KERNEL_GS_BASE and
MSR_FS_BASE are zeroed out. Not sure about any other MSR?

Should we have a blurb about IDT and GDT and that the PV hypercalls
for that will be ignored?

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
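
To make the control register list from the draft concrete, here is a minimal C sketch that checks a set of register values against the bits the draft says Xen enables (PE, ET and PG in `CR0`, PAE in `CR4`, LME and LMA in `EFER`). The bit positions are the ones given in the draft; the `struct pvh_regs` type and the `pvh_initial_state_ok` function are hypothetical names used only for illustration, and the sketch works on plain integers rather than reading the real registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the draft's control register section. */
#define CR0_PE   (UINT64_C(1) << 0)   /* protected mode enable */
#define CR0_ET   (UINT64_C(1) << 4)   /* 80387 external math coprocessor */
#define CR0_PG   (UINT64_C(1) << 31)  /* paging enabled */
#define CR4_PAE  (UINT64_C(1) << 5)   /* PAE enabled */
#define EFER_LME (UINT64_C(1) << 8)   /* long mode enable */
#define EFER_LMA (UINT64_C(1) << 10)  /* long mode active */

/* Hypothetical container for values a guest might have captured at entry. */
struct pvh_regs {
    uint64_t cr0;
    uint64_t cr4;
    uint64_t efer;
};

/* True if every bit the draft lists as set by Xen is present.
 * Other bits are ignored, since the draft leaves them as a TODO. */
static bool pvh_initial_state_ok(const struct pvh_regs *r)
{
    const uint64_t cr0_want  = CR0_PE | CR0_ET | CR0_PG;
    const uint64_t cr4_want  = CR4_PAE;
    const uint64_t efer_want = EFER_LME | EFER_LMA;

    return (r->cr0 & cr0_want) == cr0_want &&
           (r->cr4 & cr4_want) == cr4_want &&
           (r->efer & efer_want) == efer_want;
}
```

Only the listed bits are tested, so the check keeps passing if Xen later enables additional flags.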