Re: [Xen-devel] RFC: making the PVH 64bit ABI as stable



On Sat, Jun 06, 2015 at 11:57:31AM +0200, Roger Pau Monné wrote:
> On 05/06/15 at 23:52, Tim Deegan wrote:
> > At 18:21 +0100 on 05 Jun (1433528517), Andrew Cooper wrote:
> >> On 05/06/15 18:16, Stefano Stabellini wrote:
> >>> On Fri, 5 Jun 2015, Andrew Cooper wrote:
> >>>> On 05/06/15 17:43, Boris Ostrovsky wrote:
> >>>>> On 06/05/2015 12:16 PM, Roger Pau Monné wrote:
> >>>>>> On 03/06/15 at 14:08, Jan Beulich wrote:
> >>>>>>>>>> On 03.06.15 at 12:02, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >>>>>>>> On Tue, 2 Jun 2015, Andrew Cooper wrote:
> >>>>>>>>> With my x86 maintainer hat on, the following is an absolute minimum
> >>>>>>>>> set of prerequisites for PVH.
> >>>>>>>>>
> >>>>>>>>> * 32bit support
> >>>>>>>> Could you please explain why 32bit is important to get PVH out of tech
> >>>>>>>> preview? I don't see 32 bit OSes as an important use case. Maybe there
> >>>>>>>> is more behind it that I cannot see.
> >>>>>>> The primary reason was named before: 32-bit support will likely
> >>>>>>> end up changing the way 64-bit guests get launched.
> >>>>>> I can work on the new boot ABI, even if it's just a design document now,
> >>>>>> but the actual implementation needs to be done on top of the 32-bit
> >>>>>> support series.
> >>>>>>
> >>>>>> Boris, do you think you could send an early RFC of your 32-bit support
> >>>>>> series in a couple of weeks at most?
> >>>>> That's highly unlikely. For one, I am still unable to boot MP guests.
> >>>>> In addition, it is all held together by rubber bands and matchsticks
> >>>>> so calling it an RFC would be an insult to RFCs. (for example, I
> >>>>> apparently broke HVM somewhere along the way).
> >>>> How about working it the other way around.
> >>>>
> >>>> Start with an HVM guest and start with a sane method of booting.  I
> >>>> highly suggest multiboot1 as it is very easy and we have most of the
> >>>> code already.  Whoever actually gets around to doing this gets leeway,
> >>>> subject to it being sane (which the current method very certainly is not).
> 
> I agree that using a boot ABI similar to multiboot1 is going to solve
> some of the issues that we currently have, while probably simplifying
> the code to build a domain. There are also several multiboot1
> implementations around which can be used as a basis for this for guest
> OSes that don't have native multiboot support.

Multiboot1 requires the header to lie within the first 8K of the kernel image.
Linux already has a PE header and bootparams there, so the multiboot1 header
would have to fit in after them.
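
To make the 8K constraint concrete, here is a rough sketch of how a loader
could look for a multiboot1 header. It is only an illustration: the function
and helper names are made up for this example, and the layout it assumes
(a 4-byte aligned header starting with the magic 0x1BADB002, followed by
flags and a checksum such that the three words sum to zero, all within the
first 8192 bytes) is my reading of the Multiboot 1 specification, not code
taken from Xen or Linux.

```python
import struct

MULTIBOOT1_MAGIC = 0x1BADB002      # magic value, per the Multiboot 1 specification
MULTIBOOT1_SEARCH_LIMIT = 8192     # header must lie entirely within the first 8 KiB
MULTIBOOT1_MIN_HEADER = 12         # magic, flags, checksum: three little-endian u32s


def find_multiboot1_header(image: bytes):
    """Return the offset of a valid multiboot1 header, or None if absent.

    Hypothetical helper for illustration only.
    """
    limit = min(len(image), MULTIBOOT1_SEARCH_LIMIT)
    # The header must be 4-byte aligned, so step through the image in words.
    for offset in range(0, limit - MULTIBOOT1_MIN_HEADER + 1, 4):
        magic, flags, checksum = struct.unpack_from("<III", image, offset)
        if magic != MULTIBOOT1_MAGIC:
            continue
        # The three fields must sum to zero modulo 2**32.
        if (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return offset
    return None


def make_header(flags: int = 0) -> bytes:
    """Build a minimal 12-byte header (hypothetical helper for the demo)."""
    checksum = (-(MULTIBOOT1_MAGIC + flags)) & 0xFFFFFFFF
    return struct.pack("<III", MULTIBOOT1_MAGIC, flags, checksum)


# A header placed 4 KiB into the image is found ...
early = bytes(4096) + make_header() + bytes(64)
# ... but one pushed past the 8 KiB limit by whatever precedes it is not.
late = bytes(8192) + make_header() + bytes(64)

print(find_multiboot1_header(early))
print(find_multiboot1_header(late))
```

The second case is the one that matters for Linux: if the PE header and
bootparams push the multiboot1 header past the limit, a conforming loader
will simply not see it.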

Now from an implementation and political side:
 - In Linux you would need the multiboot1 code to copy all the parameters into
   the bootparams structure. And then if we were to use the generic bootup
   path we would need to track any changes in the early bootup code - which
   means we MUST at startup look like a bootloader (whatever that means).
 - Adding extra bootloader support could be blocked by the Linux x86
   maintainers. As in, they would prefer to have all the booting code
   related to this live in arch/x86/xen and just call Linux code the same
   way as it is doing now (set up the pvops, x86_* function tables, etc.,
   and then call x86_64_start_reservations (or i386_start_kernel)).
   I can see them asking: "Why two entry points for Xen?!" And eliminating
   the old one is not yet an option, unless we are ready to make upstream
   Linux not boot on Amazon or other clouds that provide PV guest support.

Either way the code reached from multiboot1 (or the XEN_ELFNOTE_ENTRY mechanism)
needs to do the same thing - fix up function tables such that the platform
is OK booting without much hardware. And then also set up the Linux-specific
pieces (pass in EFI data, bootparams, x86_init, stack protector, etc.).

I think the issue folks see with PVH bootup code is that a lot of 
the setup (GDT, CR registers, etc) is done on the hypervisor side.

And Andrew - I think you would like as much of it as possible to be done on
the guest side - and by having the state at the OS entry point look like what
a bootloader provides - it can be "done".

I think if you want to go that route it is going to delay PVH by
another year. It will also require the nod/approval from the Linux and
FreeBSD kernel folks.

The quotes around "done" are me being skeptical. I think the Xen hypervisor
part would still need to set up the GDT, CR registers, etc. - so you would not
change much on the hypervisor side anyhow - except add more code
to deal with multiboot1 headers.

> 
> >>>> Start the domain without qemu, and expose some of the PV hypercalls to
> >>>> HVM guests, and see how far it gets.  One will find suddenly that all
> >>>> 32bit and AMD problems have already been solved.  All the PV(h) kernel
> >>>> needs to know is that there is no real hardware, and not to touch it.
> >>> This seems like a clean and nice way forward, but it is actually something
> >>> other than PVH.  Am I the only one to think that making this
> >>> drastic change in the design at this stage (3 years in) is too late?
> 
> I don't think the ABI is going to change much, most of this plumbing is
> going to be in Xen internals, so I wouldn't call it a drastic change.
> 
> >> There was no design in the slightest, which is why we have got 3 years
> >> in and are in this position.
> > 
> > Please try to keep things friendly and constructive on this list.  Yes,
> > there was design; it was discussed on this list and at the Xen summit.
> > With hindsight, it turned out that "PV guest that uses a lightweight
> > HVM container" took a lot more work/code than was originally expected.
> > 
> > I suspect that an implementation of "HVM without qemu and some
> > hypercalls" will also turn out more complex than it sounds.  I believe
> > I've made my opinion clear that that's where PVH ought to end up, but
> > I'm unconvinced that starting from scratch will be the fastest way.
> 
> I believe the right way to move forward is to start implementing this
> new boot ABI on top of HVM, without axing out the PVH code. I think most
> of the current PVH code would still be needed for the HVM-without-dm
> kind of guest, and that at some point both will meet.
> 
> I will send a design document for this boot ABI next week, but the plan
> is as follows:
> 
>  - Start the guest in protected mode without paging.
>  - Fill the hypercall page using wrmsr (HVM).
>  - Map the shared info page using XENMEM_add_to_physmap (HVM).
> 
> That means we can get rid of some of the ELFNOTES, the ones that come to
> mind right now are:
> 
>  - XEN_ELFNOTE_VIRT_BASE
>  - XEN_ELFNOTE_HYPERCALL_PAGE
>  - XEN_ELFNOTE_HV_START_LOW
>  - XEN_ELFNOTE_PAE_MODE
>  - XEN_ELFNOTE_L1_MFN_VALID
> 
> And probably some more.
> 
> Roger.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
