Re: [Xen-devel] [RFC] Extend the number of event channels available to guests
>>> On 20.09.12 at 01:49, Attilio Rao <attilio.rao@xxxxxxxxxx> wrote:
> Proposal
> The proposal is pretty simple: the event-channel search will become a
> three-level lookup table, with the leaf level composed of shared pages
> registered at boot time by the guests.
> The bitmap working now as the leaf (then called "second level") will
> alternatively keep working as the leaf level (for older kernels) or as
> an intermediate level addressing into a new array of shared pages (for
> newer kernels).
> This leaves the possibility to reuse the existing mechanisms without
> modifying their internals.

While adding one level would seem to leave ample room, so did the
original 4096. Therefore, even if unimplemented right now, I'd like the
interface to allow the guest to specify more levels.

> More specifically, what needs to happen:
> - Add new members to struct domain to handle an array of pages (to
>   contain the actual evtchn bitmaps), a further array of pages (to
>   contain the evtchn masks), and a control bit saying whether the
>   domain is subject to the new mode or not. Initially the arrays will
>   be empty and the control bit will be OFF.
> - At init_platform() time, the guest must allocate the pages composing
>   the 2 arrays and invoke a new hypercall which, broadly, does the
>   following:
>   * Creates some pages to populate the new arrays in struct domain via
>     alloc_xenheap_pages()

Why? The guest allocated the pages already. Just have the hypervisor map
them (similar, but without the per-vCPU needs, to registering an
alternative per-vCPU shared page). Whether it turns out more practical
to require the guest to enforce certain restrictions (like the pages
being contiguous and/or address restricted) is a secondary aspect.

>   * Recreates the mapping with the gpfn passed from userland, using
>     basically guest_physmap_add_page()

This would then be superfluous.
>   * Sets the control bit to ON
> - Places that need to access the actual leaf bit (like, for example,
>   xen_evtchn_do_upcall()) will need to double-check the control bit. If
>   it is OFF they consider the second level as the leaf one; otherwise
>   they will do a further lookup to get the bit from the new array of
>   pages.

Just like for variable-depth page tables - if at all possible, just make
the accesses variable depth, so that all you need to track on a
per-domain basis is the depth of the tree.

> Of course there are some nits to be decided yet, like, for example:
> * How many pages should the new level have? We can start by populating
>   just one, for example

Just let the guest specify this (and return an error if the number is
too large).

> * Who should really have the knowledge of how many pages to allocate?
>   Likely the hypervisor should have a threshold, but in general we may
>   want a posting mechanism to have the guest ask the hypervisor
>   beforehand and satisfy its actual request

Same here (this is really the same as the previous item, if you follow
the earlier suggestions).

> * How many bits should be indirected in the third level by every single
>   bit in the second level? (That is a really minor factor, but still.)

The tree should clearly be uniform (i.e. having a fan-out of
BITS_PER_LONG per level), just like it is now. For 64-bit guests, this
would mean 256k channels with 3 levels (32k for 32-bit guests).

One aspect to also consider is migration - will the guest have to
re-issue the extending hypercall, or will this be taken care of for it?
If the former approach is chosen, would the guest be expected to deal
with not being able to set up the extension again on the new host?

And another important (but implementation-only) aspect not to forget is
making domain_dump_evtchn_info() scale with the then much higher amount
of dumping potentially to be done (i.e. not just extend it to cope with
the count, but also make sure it properly allows softirqs to be handled,
which in turn requires not holding the event lock across the whole
loop).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel