Xen project Mailing List

Re: [win-pv-devel] [PATCH xenbus 3/3] Stop using BAR space to host Xen data structures

To: Paul Durrant <paul.durrant@xxxxxxxxxx>, win-pv-devel@xxxxxxxxxxxxxxxxxxxx

From: Jinoh Kang <jinoh.kang.kr@xxxxxxxxx>

Date: Sun, 28 Feb 2021 17:31:57 +0000

Cc: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>

Delivery-date: Sun, 28 Feb 2021 18:04:13 +0000

List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>

On 1/31/18 2:59 PM, Paul Durrant wrote:
> Currently XENBUS makes use of the memory BAR of the PCI device to which it
> binds as a source of unpopulated GFNs to host Xen data structures, such as
> the shared info and grant table.
>
> There is a problem with doing this, which is that Windows (unsurprisingly)
> sets up a non-cached MTRR for the page range covering PCI BARs so accesses
> to BAR space (and hence the Xen data structures) should be non-cached.
> However, Xen itself contains a work-around to avoid the slow access times
> that would ordinarily result from this; it ignores the MTRRs if no
> real devices are passed through to the guest so accesses are actually
> cached. Thus, in the normal case, there is no penalty to pay... but as soon
> as hardware is passed through to a guest, the work-around no longer applies
> and there is a noticeable drop in PV driver performance. (E.g. network
> throughput can drop by ~30-40%).
>
> This patch modifies XENBUS to allocate a 2MB area of RAM

Some time ago I discovered that the PV driver fails with
STATUS_INSUFFICIENT_RESOURCES if the grant table configured for the
Windows HVM is larger than 2MB.

Perhaps it might be a good idea to let unpopulated GFNs be allocated
dynamically from FdoAllocateHole, possibly reviving the original purpose
of range_set in the process. Or at minimum, call GrantTableQuerySize
early and take the MaximumFrameCount into account when allocating the
initial "unpopulated" GFN range.

> (which will always fall into a cached MTRR),

Isn't MmAllocateContiguousNodeMemory expected to either return memory
with correct cacheability or fail completely? In the absence of
PAGE_NOCACHE or PAGE_WRITECOMBINE flags, it makes sense for the caller
to safely assume the allocated memory to be WB-cached.

I suppose the "fail completely" case could be alleviated via dynamic
allocation.
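As a rough sketch of the sizing suggestion above (this is not XENBUS
code): assuming 4KiB pages and a hole that grows in 2MB steps, the
initial page count could be derived from the grant table's maximum frame
count plus whatever else has to live in the hole. The helper name
hole_pages_needed and both of its parameters are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K   12u                              /* assumes 4KiB pages */
#define HOLE_GRANULE    (2u * 1024u * 1024u)             /* 2MB step (assumption) */
#define GRANULE_PAGES   (HOLE_GRANULE >> PAGE_SHIFT_4K)  /* 512 pages per step */

/*
 * Hypothetical helper, not part of XENBUS.
 *
 * grant_max_frames: maximum grant table frame count, e.g. the value
 *                   reported as MaximumFrameCount.
 * other_pages:      pages needed for everything else hosted in the hole
 *                   (shared info and so on).
 *
 * Returns the number of pages to reserve, rounded up to a whole number
 * of 2MB granules, and never less than one granule.
 */
static uint64_t
hole_pages_needed(uint32_t grant_max_frames, uint32_t other_pages)
{
    uint64_t pages = (uint64_t)grant_max_frames + other_pages;
    uint64_t granules = (pages + GRANULE_PAGES - 1) / GRANULE_PAGES;

    if (granules == 0)
        granules = 1;

    return granules * GRANULE_PAGES;
}
```

With a 512-frame grant table and a single extra page, this asks for 1024
pages (4MB) instead of the fixed 512, which is the case where a fixed 2MB
area runs out.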
> use a decrease_reservation hypercall to de-populate the area,

An alternative method would be to copy the unpopulated-alloc facility in
Linux, merged into mainline fairly recently (5.9), which avoids being
entangled with ballooning entirely.

An obvious approach would be to use hotplug PDOs to convince the NT PnP
manager to hand us cacheable memory resources. Implementing it sounds
pretty complicated, though.

> and then use that as a source of GFNs instead of the
> BAR. Hence, the work-around in Xen no longer has any bearing on accessing of
> Xen data structures and thus there is no longer any performance penalty
> when hardware is passed through to a guest.
>
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
