Re: [win-pv-devel] [PATCH xenbus 3/3] Stop using BAR space to host Xen data structures

On 1/31/18 2:59 PM, Paul Durrant wrote:
> Currently XENBUS makes use of the memory BAR of the PCI device to which it
> binds as a source of unpopulated GFNs to host Xen data structures, such as
> the shared info and grant table.
> There is a problem with doing this, which is that Windows (unsurprisingly)
> sets up a non-cached MTRR for the page range covering PCI BARs so accesses
> to BAR space (and hence the Xen data structures) should be non-cached.
> However, Xen itself contains a work-around to avoid the slow access times
> that would ordinarily result from this; it ignores the MTRRs if no
> real devices are passed through to the guest so accesses are actually
> cached. Thus, in the normal case, there is no penalty to pay... but as soon
> as hardware is passed through to a guest, the work-around no longer applies
> and there is a noticeable drop in PV driver performance. (E.g. network
> throughput can drop by ~30-40%).
> This patch modifies XENBUS to allocate a 2MB area of RAM

Some time ago I discovered that the PV driver fails with
STATUS_INSUFFICIENT_RESOURCES if the grant table configured for the
Windows HVM is larger than 2MB.

Perhaps it would be a good idea to let unpopulated GFNs be allocated
dynamically from FdoAllocateHole, possibly reviving the original purpose
of range_set in the process.

Or at minimum, call GrantTableQuerySize early and take the
MaximumFrameCount into account when allocating the initial "unpopulated"
GFN range.
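
To make the sizing concrete, here is a rough sketch of the arithmetic in
plain C. It is not XENBUS code. The helper names, the single extra page
for shared info, and the rounding to whole 2MB units are all assumptions
of mine; only the 2MB figure and MaximumFrameCount come from the
discussion above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES        4096u                  /* x86 page size */
#define HOLE_GRANULARITY  (2u * 1024u * 1024u)   /* the patch's fixed 2MB area */

/* ASSUMPTION: pages hosted in the hole besides grant frames (shared info only). */
#define OTHER_PAGES       1u

/* Pages the hole must supply for a given grant table limit (hypothetical helper). */
static uint32_t hole_pages_needed(uint32_t maximum_frame_count)
{
    assert(maximum_frame_count <= UINT32_MAX - OTHER_PAGES);  /* no wrap-around */
    return maximum_frame_count + OTHER_PAGES;
}

/* Non-zero if a fixed 2MB hole (512 pages) is large enough. */
static int fixed_hole_fits(uint32_t maximum_frame_count)
{
    return hole_pages_needed(maximum_frame_count) <= HOLE_GRANULARITY / PAGE_BYTES;
}

/* Hole size in bytes, rounded up to a whole number of 2MB units (ASSUMPTION). */
static uint64_t hole_bytes_needed(uint32_t maximum_frame_count)
{
    uint64_t bytes = (uint64_t)hole_pages_needed(maximum_frame_count) * PAGE_BYTES;
    return (bytes + HOLE_GRANULARITY - 1) / HOLE_GRANULARITY * HOLE_GRANULARITY;
}
```

Under these assumptions a limit of 64 grant frames fits comfortably in
the fixed area, while a limit of 512 frames (a 2MB grant table) already
needs a second 2MB unit.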

> (which will always fall into a cached MTRR),

Isn't MmAllocateContiguousNodeMemory expected to either return memory
with correct cacheability or fail completely?  In the absence of
PAGE_NOCACHE or PAGE_WRITECOMBINE flags, the caller should be able to
safely assume that the allocated memory is WB-cached.

I suppose the "fail completely" case could be alleviated via dynamic

> use a decrease_reservation hypercall to de-populate the area,

An alternative method would be to copy the unpopulated-alloc facility
that was merged into mainline Linux fairly recently (5.9), which avoids
being entangled with ballooning entirely.

An obvious approach would be to have hotplug PDOs to convince the NT PnP
manager to hand us cacheable memory resources.  Implementing it sounds
pretty complicated, though.

> and then use that as a source of GFNs instead of the
> BAR. Hence, the work-around in Xen no longer has any bearing on accessing of
> Xen data structures and thus there is no longer any performance penalty
> when hardware is passed through to a guest.
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>


