
Re: [Xen-devel] [PATCH v3 3/3] tools: introduce parameter max_wp_ram_ranges.

On Wed, Feb 3, 2016 at 2:43 PM, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:
> Paul Durrant writes ("RE: [Xen-devel] [PATCH v3 3/3] tools: introduce 
> parameter max_wp_ram_ranges."):
>> > From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> ...
>> > I wouldn't be happy with that (and I've said so before), since it
>> > would allow all VMs this extra resource consumption.
>> The ball is back in Ian's court then.
> Sorry to be vague, but: I'm not definitely objecting to some toolstack
> parameter.  I'm trying to figure out whether this parameter, in this
> form, with this documentation, makes some kind of sense.
> In the most recent proposed patch the docs basically say (to most
> users) "there is this parameter, but it is very complicated, so do not
> set it."  We already have a lot of these kinds of parameters.  As a
> general rule they are OK if it is really the case that the parameter
> should be ignored.  I am happy to have a whole lot of strange
> parameters that the user can ignore.
> But as far as I can tell from this conversation, users are going to
> need to set this parameter in normal operation in some
> configurations.
> I would ideally like to avoid a situation where (i) the Xen docs say
> "do not set this parameter because it is confusing" but (ii) other
> less authoritative sources (wiki pages, or mailing list threads, etc.)
> say "oh yes just set this weird parameter to 8192 for no readily
> comprehensible reason".
> I say `some configurations' because, I'm afraid, most of the
> conversation about hypervisor internals has gone over my head.  Let me
> try to summarise (correct me if I am wrong):
>  * There are some hypervisor tracking resources associated with each
>    emulated MMIO range.
>    (Do we mean the memory ranges that are configured in the hypervisor
>    to be sent to an ioemu via the ioreq protocol - ie, the system
>    which is normally used in HVM domains to interface to the device
>    model?)
>    (Are these ranges denominated in guest-physical space?)
>  * For almost all domains the set of such MMIO ranges is small or very
>    small.
>  * Such ranges are sometimes created by, or specified by, the guest.
>    (I don't understand why this should be the case but perhaps this is
>    an inherent aspect of the design of this new feature.)

So the real issue here, as I've said elsewhere, is this:

These are not MMIO regions.  They are not IO because they do not talk
to devices, and they are not regions; they are individual gpfns.

What's happening here is that qemu wants to be able to do the
equivalent of shadow pagetable emulation for the GPU's equivalent of
the pagetables (there's a special name, I forget what they're called).

These gpfns are just normal guest memory, selected by the operating
system and/or the graphics driver to use in the equivalent of a
pagetable for the GPU.

And instead of making a new interface designed to keep track of gpfns,
they are (ab)using the existing "MMIO range" interface.

But of course, since they aren't actually ranges but just gpfns,
they're scattered randomly throughout the guest physical address
space.
That's why XenGT suddenly wants orders of magnitude more "MMIO
regions" than any other device has ever needed before -- because
they're using a hammer when what they want is a rake.
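
To make the point concrete, here is a rough sketch (not Xen code; the
function name and the example gpfn values are made up for illustration)
of why scattered gpfns eat up "ranges". A range-based tracker can only
merge frames that are numerically adjacent, so N isolated gpfns cost N
ranges, while N contiguous gpfns cost one:

```python
def count_ranges(gpfns):
    """Return how many inclusive [start, end] ranges are needed to cover
    the given gpfns, merging only numerically adjacent frames.

    Hypothetical helper for illustration; not part of any Xen interface.
    """
    ranges = 0
    prev = None
    for gfn in sorted(set(gpfns)):
        if prev is None or gfn != prev + 1:
            ranges += 1  # a gap before this frame: a new range is needed
        prev = gfn
    return ranges


# 8192 adjacent frames collapse into a single range.
contiguous = range(0x1000, 0x1000 + 8192)

# 8192 frames with a one-frame gap between each cannot be merged at all.
scattered = range(0x1000, 0x1000 + 2 * 8192, 2)

print(count_ranges(contiguous))  # 1
print(count_ranges(scattered))   # 8192
```

So a limit expressed as a number of ranges is, for this use, really a
limit on the number of individual pages being tracked.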

They claim that 8k "should be enough for anybody", but as far as I can
tell, the theoretical limit of the number of pages being used in the
GPU pagetables could be unbounded.

I think at some point I suggested an alternate design based on marking
such gpfns with a special p2m type; I can't remember if that
suggestion was actually addressed or not.
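
For what it's worth, the shape of that alternative is roughly the
following. This is only a toy model under my own assumptions: the class
and type names are hypothetical, and a real p2m stores the type in the
existing per-page entry rather than in a separate dictionary. The point
is just that the check is a per-page lookup, with no range list to
search and no range count to limit:

```python
from enum import Enum


class P2MType(Enum):
    RAM_RW = "ram_rw"                # ordinary read/write guest RAM
    WRITE_TRACKED = "write_tracked"  # hypothetical type: writes get forwarded


class ToyP2M:
    """Toy stand-in for a per-page type table (not Xen's p2m)."""

    def __init__(self):
        # gpfn -> P2MType; any gpfn not listed is treated as ordinary RAM.
        self._types = {}

    def set_type(self, gpfn, p2m_type):
        self._types[gpfn] = p2m_type

    def write_needs_emulation(self, gpfn):
        # One lookup keyed by the page itself, wherever it sits in the
        # guest physical address space.
        return self._types.get(gpfn, P2MType.RAM_RW) is P2MType.WRITE_TRACKED


p2m = ToyP2M()
for gpfn in (0x1234, 0x9ABC, 0x40000):  # arbitrary, scattered example pages
    p2m.set_type(gpfn, P2MType.WRITE_TRACKED)

print(p2m.write_needs_emulation(0x9ABC))  # True
print(p2m.write_needs_emulation(0x9ABD))  # False
```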

