Re: [Xen-devel] RFC: very initial PVH design document
On Wed, 27 Aug 2014 16:45:37 -0400
Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:

> On Tue, Aug 26, 2014 at 05:33:21PM -0700, Mukesh Rathor wrote:
> > On Fri, 22 Aug 2014 16:55:08 +0200
> > Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> >
> > > Hello,
> > >
> > > I've started writing a document in order to describe the
> > > interface exposed by Xen to PVH guests, and how it should be used
> > > (by guest OSes). The document is far from complete (see the
> > > amount of TODOs scattered around), but given the lack of
> > > documentation regarding PVH I think it's a good starting point.
> > > The aim of this is that it should be committed to the Xen
> > > repository once it's ready. Given that this is still a *very*
> > > early version I'm not even posting it as a patch.
> > >
> > > Please comment, and try to fill the holes if possible ;).
> > >
> > > Roger.
> > >
> > > ---
> > > # PVH Specification #
> > >
> > > ## Rationale ##
> > >
> > > PVH is a new kind of guest that has been introduced on Xen 4.4 as
> > > a DomU, and on Xen 4.5 as a Dom0. The aim of PVH is to make use
> > > of the hardware virtualization extensions present in modern x86
> > > CPUs in order to improve performance.
> > >
> > > PVH is considered a mix between PV and HVM, and can be seen as a
> > > PV guest that runs inside of an HVM container, or as a PVHVM guest
> > > without any emulated devices. The design goal of PVH is to provide
> > > the best performance possible and to reduce the amount of
> > > modifications needed for a guest OS to run in this mode (compared
> > > to pure PV).
> > >
> > > This document tries to describe the interfaces used by PVH guests,
> > > focusing on how an OS should make use of them in order to support
> > > PVH.
> > >
> > > ## Early boot ##
> > >
> > > PVH guests use the PV boot mechanism, that means that the kernel
> > > is loaded and directly launched by Xen (by jumping into the entry
> > > point).
> > > In order to do this Xen ELF Notes need to be added to the
> > > guest kernel, so that they contain the information needed by Xen.
> > > Here is an example of the ELF Notes added to the FreeBSD amd64
> > > kernel in order to boot as PVH:
> > >
> > >     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz, "FreeBSD")
> > >     ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz, __XSTRING(__FreeBSD_version))
> > >     ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz, "xen-3.0")
> > >     ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE, .quad, KERNBASE)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_PADDR_OFFSET, .quad, KERNBASE)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_ENTRY, .quad, xen_start)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .quad, hypercall_page)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_HV_START_LOW, .quad, HYPERVISOR_VIRT_START)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel|hvm_callback_vector")
> > >     ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE, .asciz, "yes")
> > >     ELFNOTE(Xen, XEN_ELFNOTE_L1_MFN_VALID, .long, PG_V, PG_V)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz, "generic")
> > >     ELFNOTE(Xen, XEN_ELFNOTE_SUSPEND_CANCEL, .long, 0)
> > >     ELFNOTE(Xen, XEN_ELFNOTE_BSD_SYMTAB, .asciz, "yes")
> >
> > It will be helpful to add:
> >
> > On the linux side, the above can be found in
> > arch/x86/xen/xen-head.S.
> >
> >
> > > It is important to highlight the following notes:
> > >
> > > * XEN_ELFNOTE_ENTRY: contains the memory address of the kernel
> > >   entry point.
> > > * XEN_ELFNOTE_HYPERCALL_PAGE: contains the memory address of the
> > >   hypercall page inside of the guest kernel (this memory region
> > >   will be filled by Xen prior to booting).
> > > * XEN_ELFNOTE_FEATURES: contains the list of features supported
> > >   by the kernel. In this case the kernel is only able to boot as a
> > >   PVH guest, but those options can be mixed with the ones used by
> > >   pure PV guests in order to have a kernel that supports both PV
> > >   and PVH (like Linux).
> > >   The list of options available can be found
> > >   in the `features.h` public header.
> >
> > Hmm... for linux I'd word that as follows:
> >
> > A PVH guest is started by specifying pvh=1 in the config file.
> > However, for the guest to be launched as a PVH guest, it must
> > minimally advertise certain features which are:
> > auto_translated_physmap, hvm_callback_vector,
> > writable_descriptor_tables, and supervisor_mode_kernel. This is
> > done via XEN_ELFNOTE_FEATURES and XEN_ELFNOTE_SUPPORTED_FEATURES.
> > See linux:arch/x86/xen/xen-head.S for more info. A list of all xen
> > features can be found in xen:include/public/features.h. However, at
> > present the absence of these features does not make it
> > automatically boot in PV mode, but that may change in future. The
> > ultimate goal is, if a guest supports these features, then boot it
> > automatically in PVH mode, otherwise boot it in PV mode.
> >
> > [You can leave out the last part if you want, or just take whatever
> > from above].
> >
> > > Xen will jump into the kernel entry point defined in
> > > `XEN_ELFNOTE_ENTRY` with paging enabled (either long or protected
> > > mode depending on the kernel bitness) and some basic page tables
> > > setup.
> >
> > If I may rephrase:
> >
> > Guest is launched at the entry point specified in XEN_ELFNOTE_ENTRY
> > with paging, PAE, and long mode enabled. At present only 64bit mode
> > is supported, however, in future compat mode support will be added.
> > An important distinction for a 64bit PVH is that it is launched at
> > privilege level 0 as opposed to a 64bit PV guest which is launched
> > at privilege level 3.
> >
> > > Also, the `rsi` (`esi` on 32bits) register is going to contain the
> > > virtual memory address where Xen has placed the start_info
> > > structure. The `rsp` (`esp` on 32bits) will contain a stack, that
> > > can be used by the guest kernel. The start_info structure
> > > contains all the info the guest needs in order to initialize.
> > > More information about the contents can be found on the `xen.h`
> > > public header.
> >
> > Since the above is all true for PV guest, you could begin it with:
> >
> > Just like a PV guest, the rsi ....
> >
> > >
> > > ### Initial amd64 control registers values ###
> > >
> > > Initial values for the control registers are set up by Xen before
> > > booting the guest kernel. The guest kernel can expect to find the
> > > following features enabled by Xen.
> > >
> > > On `CR0` the following bits are set by Xen:
> > >
> > > * PE (bit 0): protected mode enable.
> > > * ET (bit 4): 80387 external math coprocessor.
> > > * PG (bit 31): paging enabled.
> > >
> > > On `CR4` the following bits are set by Xen:
> > >
> > > * PAE (bit 5): PAE enabled.
> > >
> > > And finally on `EFER` the following features are enabled:
> > >
> > > * LME (bit 8): Long mode enable.
> > > * LMA (bit 10): Long mode active.
> > >
> > > *TODO*: do we expect these flags to change? Are there other flags
> > > that might be enabled depending on the hardware we are running on?
> >
> > Can't think of anything...
>
> What about the initial segments (ES, DS, FS, GS)? We boot with Xen
> provided ones and need to swap over from them - so that means
> the DS and CS are initially set to Xen ones. And we should probably
> mention that when the OS switches from Xen ones it MUST jump to a
> CS with CS.L = 1 set, otherwise bad things happen.

CS.L is already covered above:

    with paging, PAE, and long mode enabled. At present only 64bit mode
    is supported, however, in future compat mode support will be added.

That is the CS.L bit. CS.L==1 ==> 64bit mode, CS.L==0 ==> compat mode.

> We should probably mention that MSR_FS_BASE, MSR_KERNEL_GS_BASE
> and MSR_FS_BASE are zeroed out. Not sure about any other MSR?

Could.

> Should we have a blurb about IDT and GDT and that the PV hypercalls
> for that will be ignored.

and that they are native and guest managed.
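
To make the control register discussion above concrete, here is a minimal C sketch of a sanity check for the initial state listed in the draft. It is not taken from Xen or from any guest kernel: the macro and function names are hypothetical, and only the bit positions come from the draft quoted above. Reading CR0, CR4 and EFER requires privilege level 0, so the checker just takes the raw register values as arguments.

```c
#include <stdint.h>

/* Bit positions as listed in the draft; the PVH_* names are illustrative only. */
#define PVH_CR0_PE    (UINT64_C(1) << 0)   /* protected mode enable */
#define PVH_CR0_ET    (UINT64_C(1) << 4)   /* 80387 external math coprocessor */
#define PVH_CR0_PG    (UINT64_C(1) << 31)  /* paging enabled */
#define PVH_CR4_PAE   (UINT64_C(1) << 5)   /* PAE enabled */
#define PVH_EFER_LME  (UINT64_C(1) << 8)   /* long mode enable */
#define PVH_EFER_LMA  (UINT64_C(1) << 10)  /* long mode active */

/* Returns 1 if every bit in mask is set in value, 0 otherwise. */
static int pvh_has_bits(uint64_t value, uint64_t mask)
{
    return (value & mask) == mask;
}

/*
 * Returns 1 if the given register values contain all the bits the draft
 * says Xen sets before entering a 64bit PVH guest, 0 otherwise.
 * Additional bits being set is not treated as an error.
 */
int pvh_initial_state_ok(uint64_t cr0, uint64_t cr4, uint64_t efer)
{
    return pvh_has_bits(cr0, PVH_CR0_PE | PVH_CR0_ET | PVH_CR0_PG) &&
           pvh_has_bits(cr4, PVH_CR4_PAE) &&
           pvh_has_bits(efer, PVH_EFER_LME | PVH_EFER_LMA);
}
```

A guest would pass in values read early in its entry path (EFER is MSR 0xC0000080). The check deliberately tolerates extra bits, since the TODO above leaves open whether more flags may be enabled depending on the hardware.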
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel