
Re: [Xen-devel] [RFC] Extend the number of event channels available to guests



On 20/09/12 16:42, Jan Beulich wrote:
On 20.09.12 at 16:05, Attilio Rao <attilio.rao@xxxxxxxxxx> wrote:
On 20/09/12 08:47, Jan Beulich wrote:
On 20.09.12 at 01:49, Attilio Rao <attilio.rao@xxxxxxxxxx> wrote:

Proposal
The proposal is fairly simple: the event-channel search becomes a
three-level lookup, with the leaf level composed of shared pages
registered at boot time by the guest.
The bitmap that currently acts as the leaf (called the "second level"
from here on) either keeps working as the leaf (for older kernels) or
becomes an intermediate level indexing into a new array of shared pages
(for newer kernels).
This makes it possible to reuse the existing mechanism without
modifying its internals.
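
To illustrate the intended lookup, here is a minimal sketch (the names
l1, l2, l3_pages and find_pending_channel are illustrative only, not an
existing Xen interface; it assumes 64-bit longs, 4 KiB pages and one
pending bit per event channel):

    /* Level 1: one word, as today (evtchn_pending_sel).
       Level 2: BITS_PER_LONG words, as today (evtchn_pending).
       Level 3: guest-registered shared pages, one bit per channel. */
    #define WORDS_PER_PAGE (PAGE_SIZE / sizeof(unsigned long))

    static int find_pending_channel(unsigned long l1,
                                    const unsigned long *l2,         /* BITS_PER_LONG words */
                                    unsigned long *const l3_pages[]) /* leaf pages */
    {
        unsigned int i, j;

        for ( i = 0; i < BITS_PER_LONG; i++ )
        {
            if ( !(l1 & (1UL << i)) )
                continue;
            for ( j = 0; j < BITS_PER_LONG; j++ )
            {
                unsigned int word = i * BITS_PER_LONG + j;
                unsigned long bits;

                if ( !(l2[i] & (1UL << j)) )
                    continue;
                /* Each second-level bit selects one word of the leaf bitmap. */
                bits = l3_pages[word / WORDS_PER_PAGE][word % WORDS_PER_PAGE];
                if ( bits )
                    return word * BITS_PER_LONG + __builtin_ctzl(bits);
            }
        }
        return -1;  /* nothing pending */
    }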

While adding one level would seem to leave ample room, so did
the original 4096. Therefore, even if unimplemented
right now, I'd like the interface to allow the guest to specify
more levels.

There is a big difference here. The third/new level will be composed of
pages registered at guest setup, so it can be expanded on demand.
The second level we have now cannot grow because it is frozen in the
immutable ABI.
The only case in which another level would be useful is if we think the
second level is not large enough to address all the necessary bits in
the third level efficiently.

To give an example: the first level is 64 bits, while the second
level can address 64 times the first level. A third level that keeps
the same per-level ratio would only need to span a few pages.
I think we are very far from reaching critical sizes.
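
For concreteness, a rough back-of-the-envelope of my own, assuming
64-bit longs and 4 KiB pages:

    level 1:  64 bits                            (1 unsigned long)
    level 2:  64 * 64     =      4096 bits       (512 bytes)
    level 3:  64 * 4096   =   262,144 bits       (32 KiB, i.e. 8 pages)
    level 4:  64 * 262144 = 16,777,216 bits      (2 MiB, i.e. 512 pages)

Each extra level multiplies the addressable channel space by
BITS_PER_LONG, and even a fully populated third level fits in a handful
of shared pages per bitmap.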
What I'm saying is that further levels should continue at the same
rate, i.e. times BITS_PER_LONG per level. Allowing for an only
partially populated leaf level is certainly an option. But similarly
it should be an option to have a fourth level once needed, without
having to start over from scratch again.

Yes, I agree, but I don't see a big problem here, beyond needing a way to specify which level the registered pages belong to and handling them accordingly. The only difference is that we may end up building some sort of container for the topology, to deal with a multi-level table. I don't think it will be too difficult to do, but I would leave it as the very last item, once the "third level" already works fine.

More specifically, what needs to happen (a rough sketch of the data
structures follows below):
- Add new members to struct domain to hold an array of pages (to
contain the actual evtchn pending bitmaps), a further array of pages (to
contain the evtchn masks) and a control bit saying whether the domain is
subject to the new mode or not. Initially the arrays will be empty and
the control bit will be OFF.
- At init_platform() time, the guest must allocate the pages composing
the two arrays and invoke a new hypercall which, broadly, does the
following:
     * creates the pages to populate the new arrays in struct domain via
alloc_xenheap_pages()
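
As a strawman only (the field names, the hypercall argument structure
and the handles are made up for illustration, not an existing Xen
interface), the additions could look something like:

    /* Hypothetical additions to struct domain. */
    struct domain {
        /* ... existing members ... */
        unsigned long **evtchn_l3_pending;   /* third-level pending-bitmap pages */
        unsigned long **evtchn_l3_mask;      /* third-level mask pages */
        unsigned int    evtchn_l3_nr_pages;  /* pages currently in each array */
        bool_t          evtchn_extended;     /* control bit, OFF until opt-in */
        /* ... */
    };

    /* Hypothetical argument structure for the new registration hypercall. */
    struct evtchn_register_3level {
        uint32_t nr_pages;                           /* pages per array */
        XEN_GUEST_HANDLE(xen_pfn_t) pending_frames;  /* guest frames, pending bitmaps */
        XEN_GUEST_HANDLE(xen_pfn_t) mask_frames;     /* guest frames, mask bitmaps */
    };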

Why? The guest allocated the pages already. Just have the
hypervisor map them (similar, but without the per-vCPU needs,
to registering an alternative per-vCPU shared page). Whether
it turns out more practical to require the guest to enforce
certain restrictions (like the pages being contiguous and/or
address restricted) is a secondary aspect.
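
A minimal hypervisor-side sketch of that direction, for illustration
only: the guest passes gfns of pages it allocated itself, and Xen pins
and maps them. evtchn_register_l3_page() and d->evtchn_l3_pending[] are
invented names; get_page_from_gfn(), get_page_type() and
__map_domain_page_global() are existing helpers.

    static long evtchn_register_l3_page(struct domain *d, xen_pfn_t gfn,
                                        unsigned int slot)
    {
        p2m_type_t p2mt;
        struct page_info *page = get_page_from_gfn(d, gfn, &p2mt, P2M_ALLOC);

        if ( page == NULL )
            return -EINVAL;
        if ( !get_page_type(page, PGT_writable_page) )
        {
            put_page(page);
            return -EINVAL;
        }
        /* Keep a permanent hypervisor mapping of the guest page so it can
           serve as one leaf page of the third-level pending bitmap. */
        d->evtchn_l3_pending[slot] = __map_domain_page_global(page);
        if ( d->evtchn_l3_pending[slot] == NULL )
        {
            put_page_and_type(page);
            return -ENOMEM;
        }
        return 0;
    }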

Actually, what I propose seems to be what happens in fact in the
shared-info page case. Look at what arch_domain_create() and the
XENMEM_add_to_physmap hypercall do (in the XENMAPSPACE_shared_info
case). I think this is the quickest way to get what we want.
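
For reference, this is roughly how an HVM guest maps the shared info
page today (guest-side; shared_info_pfn is a guest-chosen frame number,
error handling kept minimal):

    struct xen_add_to_physmap xatp = {
        .domid = DOMID_SELF,
        .space = XENMAPSPACE_shared_info,
        .idx   = 0,
        .gpfn  = shared_info_pfn,   /* pfn where the page should appear */
    };

    if ( HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp) )
        BUG();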
This is HVM-only thinking. PV doesn't use this, and I don't think
artificially inserting something somewhere in the physmap of a
PV guest is a good idea either. To have things done uniformly,
going the PV route and using guest-allocated pages seems the
better choice to me. Alternatively, you'd have to implement an
HVM mechanism (via add-to-physmap) and a PV one.

Plus the add-to-physmap one has the drawback of limiting the
space available for adding pages (as these would generally
have to go into the MMIO space of the platform PCI device).


On second thought, I think I can use something very similar to the sharing mechanism of the grant tables, basically modeled on grant_table_create() and the subsequent gnttab_setup_table() mapping setup. This should also work in the PV case.
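
For comparison, the grant-table model works roughly like this: Xen
allocates the shared frames itself at domain creation, and the guest
later asks for their frame numbers and maps them. A guest-side sketch
of that setup step using the existing GNTTABOP_setup_table interface
(the evtchn case would need an analogous, yet-to-be-defined op; the
frame count of 4 is arbitrary):

    xen_pfn_t frames[4];                 /* however many shared frames are needed */
    struct gnttab_setup_table setup = {
        .dom       = DOMID_SELF,
        .nr_frames = 4,
    };

    set_xen_guest_handle(setup.frame_list, frames);

    if ( HYPERVISOR_grant_table_op(GNTTABOP_setup_table, &setup, 1) ||
         setup.status != GNTST_okay )
        BUG();

    /* frames[] now holds the mfns of the Xen-allocated shared pages:
       a PV guest maps them with normal PTE updates, an HVM guest adds
       them to its physmap. */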

Attilio


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

