Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry



On Wed, Apr 13, 2016 at 10:40:55PM +0200, Luis R. Rodriguez wrote:
> On Wed, Apr 13, 2016 at 02:56:29PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R. Rodriguez wrote:
> > > On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> > > 
> > > > What would be gained by using the same entry but having two different
> > > > boot paths after it?
> > > 
> > > Its a good question. In summary for me it would be the push for sharing
> > > more code and the push for semantics on early boot to address differences
> > > proactively, and ultimately it may enable us to help bring closer the old
> > > PV boot path closer.
> > 
> > But why? We want to kill PV (eventually).
> 
> Yeah yeah, but its still there, and we'll have to live with it for
> at least minimum 5 years I hear. Part of my interest is to see to it
> that this path gets less disruption and issues, and we also address
> dead code issues which pvops simply folded under the rug. The dead code
> concerns may exist still for hvmlite, so unless someone is willing
> to make a bold claim there is none, its something to consider.

What is this dead code you speak of? Is it MTRR? Is it early path code
that PV misses (like KASLR or other)?


The entry point in Linux "proper" is startup_32 or startup_64 - the same
path that EFI uses.

If you were to draw this (very simplified):

a) GRUB2 ----------------------\ (creates a bootparam structure)
                                \
                                 +---- startup_32 or startup_64
b) EFI -> Linux EFI stub -------/
       (creates bootparam)     /
c) GRUB2-EFI  -> Linux EFI----/
               stub         /
d) HVMLite ----------------/
      (creates bootparam)

(I am not sure about c) - I would have to look in the source to
be sure.) There is also LILO in this, but I am not even sure if
it works anymore.


What you have is that every entry point creates the bootparams
and ends up calling startup_X. startup_64 then hits the rest
of the kernel. The startup_X code is what sets up the basic
pagetables, segments, etc.
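
To make the picture concrete, here is a toy sketch in C. It is not
kernel code: struct toy_boot_params, toy_startup() and toy_enter_from()
are names made up for illustration, and the real struct boot_params is
far larger. The only point is that each entry path fills in the same
structure and then funnels into one common startup routine, which never
needs to know who called it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the real boot params. */
enum boot_source { SRC_GRUB2, SRC_EFI_STUB, SRC_HVMLITE };

struct toy_boot_params {
    enum boot_source source;  /* illustration only, not a real field */
    uint64_t cmdline_ptr;     /* where the command line lives */
    uint32_t e820_entries;    /* number of memory map entries */
};

/* Stands in for startup_32/startup_64: it only ever sees boot params. */
static int toy_startup(const struct toy_boot_params *bp)
{
    assert(bp != NULL);
    /* Page tables, segments, etc. would be set up here, the same way
     * for every caller. Without a memory map we cannot continue. */
    return bp->e820_entries > 0 ? 0 : -1;
}

/* Every entry path does the same two things: fill in the params,
 * then call the common startup routine. */
static int toy_enter_from(enum boot_source src, uint64_t cmdline,
                          uint32_t nr_e820)
{
    struct toy_boot_params bp;

    memset(&bp, 0, sizeof(bp));
    bp.source = src;
    bp.cmdline_ptr = cmdline;
    bp.e820_entries = nr_e820;
    return toy_startup(&bp);
}
```

Calling toy_enter_from(SRC_GRUB2, ...) or toy_enter_from(SRC_HVMLITE, ...)
goes through exactly the same toy_startup() body; only the code that
builds the structure differs.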

> 
> How we address semantics then is *very* important to me.

Which semantics? How the CPU is going to be at startup_X ? Or
how the CPU is going to be when EFI firmware invokes the EFI stub?
Or when GRUB2 loads Linux?

That (those bootloaders) is clearly defined. The URL I provided
describes the HVMLite one. Documentation/x86/boot.txt describes
what semantics to expect when providing a bootstrap
(which is what the HVMLite stub code in Linux would be written against -
and what the EFI stub code was written against too).
> 
> > > I'll elaborate on this but first let's clarify why a new entry is used for
> > > HVMlite to start of with:
> > > 
> > >   1) Xen ABI has historically not wanted to set up the boot params for
> > >      Linux guests, instead it insists on letting the Linux kernel Xen boot
> > >      stubs fill that out for it. This sticking point means it has
> > >      implicated a boot stub.
> > 
> > 
> > Which is b/c it has to be OS agnostic. It has nothing to do 'not wanting'.
> 
> It can still be OS agnostic and pass on type and custom data pointer.

Sure. It has that (it MUST, otherwise how else would you pass data?).
It is documented as well:
http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,xen.h.html#incontents_startofday
(see "Start of day structure passed to PVH guests in %ebx.")
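
As a rough sketch of what a stub written against that structure does
(again toy C, not the real ABI): struct toy_start_info below is only
loosely modelled on the start of day structure, TOY_START_MAGIC is a
placeholder and NOT the real magic value, and the module list is passed
as a plain pointer so the example can run in user space. Check the
header linked above for the actual layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_START_MAGIC 0x12345678u  /* placeholder, not the Xen value */

/* Hypothetical, simplified view of what the hypervisor hands over. */
struct toy_start_info {
    uint32_t magic;          /* lets the stub sanity check the pointer */
    uint32_t flags;
    uint32_t nr_modules;     /* e.g. an initrd passed as module 0 */
    uint32_t cmdline_paddr;  /* physical address of the command line */
};

struct toy_module {
    uint32_t paddr;
    uint32_t size;
};

/* Hypothetical subset of the fields the common entry expects. */
struct toy_boot_params {
    uint32_t cmd_line_ptr;
    uint32_t ramdisk_image;
    uint32_t ramdisk_size;
};

/* Translate the OS agnostic start info into Linux style boot params.
 * Returns 0 on success, -1 if the input does not look valid. */
static int toy_hvmlite_to_bootparams(const struct toy_start_info *si,
                                     const struct toy_module *mods,
                                     struct toy_boot_params *bp)
{
    assert(bp != NULL);
    if (si == NULL || si->magic != TOY_START_MAGIC)
        return -1;

    memset(bp, 0, sizeof(*bp));
    bp->cmd_line_ptr = si->cmdline_paddr;
    if (si->nr_modules > 0 && mods != NULL) {
        bp->ramdisk_image = mods[0].paddr;
        bp->ramdisk_size = mods[0].size;
    }
    return 0;
}
```

The hypervisor side stays OS agnostic; everything Linux specific lives
in the translation function inside the guest.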

> 
> Would that be reasonable ?
> 
> > >      The HVMLite boot entry tries to bring the boot entries paths closer
> > >      as it leverages more of the HVM boot path philosophy to mimic the
> > >      regular PC boot path.
> > > 
> > >      Is HVMLite supposed to support legacy PV guests as well BTW ?
> > 
> > Gosh no.
> 
> Interesting.. and *everyone* is happy about this?

The Xen Linux _and_ x86 maintainers are.
And the Xen community developers as well (I haven't heard anybody screaming
NOOO, so I am presuming so).

> 
> > >      Reason I'm highlighting Xen ABI as a *reason* alone is that even with
> > >      today's large discrepancy on the old PV boot path I believe we can
> > >      bring together the boot paths closer together if the Xen ABI was
> > >      slightly flexible about this, I've highlighted how I believe that is
> > >      possible before,
> > 
> > <runs away screaming>
> 
> Everyone has. If you need to support old PV guests for more than 5 years the
> work I'm doing should help with that. I'm trying to leverage gains of the
> work I'm doing for HVMLite, and part of this is trying to address semantics
> proactively.

What do you mean by 'support'? Support an old kernel or support upstream Linux?

> 
> > >      *iff* the Xen ABI would at the very least set 2 things only:
> > > 
> > >      a) Hypervisor type
> > >      b) A custom data pointer
> > > 
> > >      This would enable a single boot entry on the guest to handle then:
> > > 
> > >   Pseudo code:
> > > 
> > >   startup_32()                         startup_64()
> > >          |                                  |
> > >          |                                  |
> > >          V                                  V
> > >   pre_hypervisor_stub_32()        pre_hypervisor_stub_64()
> > >          |                                  |
> > >          |                                  |
> > >          V                                  V
> > >    [existing startup_32()]       [existing startup_64()]
> > >          |                                  |
> > >          |                                  |
> > >          V                                  V
> > >   post_hypervisor_stub_32()       post_hypervisor_stub_64()
> > > 
> > >      
> > >      If the Xen ABI was flexible about setting a hypervisor type and
> > >      custom data pointer then we would haven handlers for it, and in it,
> > >      it can do whatever it thinks is needed for its own guest types. It
> > >      could also continue to set the zero page on its own as it sees fit.
> > > 
> > >      Again, note that if this is done it could also mean even bringing
> > >      together the old PV boot path closer together... so this is not just
> > >      a prospect for HVMLite but also for old PV guests.
> > > 
> > >   2) Because of 1) it has meant we have no formal semantics for early boot
> > >      code is available and so severe differences can best be addressed
> > >      also by yet another boot entry. This has meant often times not
> > >      addressing
> > 
> > There are semantics written for this new code: 
> > http://xenbits.xen.org/docs/unstable/misc/hvmlite.html
> 
> That only addressed semantics for early boot code implicitly through a new 
> entry...

And there is the Documentation/x86/boot.txt.

You have two semantics from either side clearly defined. Now it is just
a matter of connecting the dots.

> 
> > All other ones related to low-level operations are described in Intel SDM.
> > 
> > 
> > >      or not knowing if we've addressed real differences between the
> > >      different entries. Case in point, dead code [0]. How do we know we
> > >      will not run certain code that should not run for the different
> > >      entries ? Without *any* semantics later in boot code to distinguish
> > >      where we came from and because we strive to build single kernels with
> > >      different possible run time environments it means we have tons of code
> > >      available to execute / run that we may not need.
> > 
> > I am not following that. PVH aka HVMLite will pretty much erase the need
> > for the pvops.
> 
> It does not mean there are no dead code concerns with HVMlite.

I am pretty sure there are none. But I need to make sure I understand
what you mean by 'dead code'.

> 
> > > 
> > >      Because of the lack of semantics we may still have dead code
> > >      prospects with the new HVMLite entry. How are we sure there is no
> > >      differences ?
> > > 
> > > [0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
> > > 
> > >   3) Unikernel / other OS requirements: this is really tied to 2) but
> > >      even if we tried to evolve the Xen ABI it would mean considering
> > >      existing solutions out there. Things to consider as an example:
> > >      FreeBSD doesn't have an EFI entry, unikernels want a simple boot
> > >      entry.
> > > 
> > > With this in mind then, that I can think of:
> > > 
> > > Cons of using the same entry but having two different boot paths:
> > > 
> > >   * Pushes the Xen ABI, needs to make everyone happy, this is hard
> > >   * Perhaps harder to implement
> > > 
> > > Gains of striving to use the same entry but having two different boot:
> > > 
> > >  * Helps to share more code easily
> > >  * Reduce attack surface
> > >  * Requires us to have semantics for early boot; this has a series of
> > >    side benefits:
> > >    - Means you should try to address differences explicitly rather than
> > >      implicitly -- case in point Dead Code
> > > 
> > > > You still need a way to distinguish between bare metal
> > > > EFI and HVMlite.
> > > 
> > > Great point! This is the semantics aspect. The new entry for HVMlite
> > > approach deals with this by making the differences implicit by the new
> > > entry point.
> > > My call for addressing this through a hypervisor type was to see if we can
> > > get those semantics added explicitly so we can also later address dead
> > > code concerns for the new HVMLite guest type.
> > 
> > Right, they are..
> 
> There is huge merit to address a huge chunks of dead code concerns by sticking
> more closer to the native booth paths, it doesn't mean you still have no

Right, which we do. Keep in mind that Linux does not boot by itself. It needs
a bootloader which sets the stage for it. We set the same exact stage.

> dead code concerns with HVMlite, nor that HVMLite has no platform quirks,
> it does and part of some recent work is to pave a *clean* path for setting
> these differences apart.

/me scratches his head.

There will always be platform quirks.

I guess I am not understanding your concerns. The work that Boris is doing is
to code against the bootparams - which has a spec.

> 
> > > Part of my own interest in an EFI entry here is that EFI could be used to
> > > help expand on the semantics in an OS/agnostic form rather than pushing
> > > the x86 boot protocol further. That seems to have its own set of drawbacks
> > > though.
> > > 
> > > 
> > > > And Xen needs a way to find out whether a kernel is
> > > > supporting HVMlite to boot it in the correct mode.
> > > 
> > > How was Xen going to find out if new kernels had HVMlite support with the
> > > new entry ? An ELFNOTE() ? If an entry is shared could we note use an
> > 
> > Yeah.
> > > ELFNOTE() also for this though too ?
> > 
> > Not sure what you mean by 'shared'. But you can add multiple Elf PT_NOTEs.
> > See the ELF document.
> 
> OK so even if we used a common/shared entry point we can address letting
> Xen find out whether or not a kernel supports HVMlite.

Yes. Xen parses the Linux ELF NOTEs and can figure out if the kernel
can do HVMLite or not.
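
For illustration, the note lookup boils down to something like the
sketch below. It walks a buffer of ELF note records (three 32-bit words
namesz/descsz/type, then the name and the descriptor, each padded to a
4-byte boundary) and reports whether a note with a given name and type
is present. toy_find_note() and TOY_NOTE_ENTRY are made-up names; the
value 18 is my assumption for the note type Xen uses to advertise the
new entry point, so check Xen's elfnote.h before relying on it. The
sketch also assumes the notes are in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NOTE_ENTRY 18u  /* assumed note type, verify in elfnote.h */

/* Round n up to the next multiple of 4, as the ELF note format does. */
static size_t toy_align4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Returns 1 if a note with this name and type exists in buf, else 0. */
static int toy_find_note(const uint8_t *buf, size_t len,
                         const char *name, uint32_t type)
{
    size_t off = 0;
    size_t want = strlen(name) + 1;  /* namesz counts the trailing NUL */

    assert(buf != NULL && name != NULL);
    while (len - off >= 12) {
        uint32_t namesz, descsz, ntype;
        size_t name_len, desc_len;

        /* Fixed-size note header: namesz, descsz, type. */
        memcpy(&namesz, buf + off, 4);
        memcpy(&descsz, buf + off + 4, 4);
        memcpy(&ntype, buf + off + 8, 4);
        off += 12;

        name_len = toy_align4(namesz);
        desc_len = toy_align4(descsz);
        if (name_len > len - off || desc_len > len - off - name_len)
            return 0;  /* truncated or corrupt note, stop looking */

        if (ntype == type && namesz == want &&
            memcmp(buf + off, name, want) == 0)
            return 1;

        off += name_len + desc_len;  /* skip to the next note */
    }
    return 0;
}
```

A toolstack would run something like toy_find_note(notes, size, "Xen",
TOY_NOTE_ENTRY) over the kernel's note segment and pick the boot mode
based on the result.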

> 
>   Luis
