Re: [win-pv-devel] [PATCH xenbus 3/3] Stop using BAR space to host Xen data structures
On 28/02/2021 17:31, Jinoh Kang wrote:
> On 1/31/18 2:59 PM, Paul Durrant wrote:
>> Currently XENBUS makes use of the memory BAR of the PCI device to which
>> it binds as a source of unpopulated GFNs to host Xen data structures,
>> such as the shared info and grant table. There is a problem with doing
>> this, which is that Windows (unsurprisingly) sets up a non-cached MTRR
>> for the page range covering PCI BARs, so accesses to BAR space (and
>> hence the Xen data structures) should be non-cached. However, Xen itself
>> contains a work-around to avoid the slow access times that would
>> ordinarily result from this; it ignores the MTRRs if no real devices are
>> passed through to the guest, so accesses are actually cached. Thus, in
>> the normal case, there is no penalty to pay... but as soon as hardware
>> is passed through to a guest, the work-around no longer applies and
>> there is a noticeable drop in PV driver performance (e.g. network
>> throughput can drop by ~30-40%).
>>
>> This patch modifies XENBUS to allocate a 2MB area of RAM
>
> Some time ago I discovered that the PV driver fails with
> STATUS_INSUFFICIENT_RESOURCES if the grant table configured for the
> Windows HVM is larger than 2MB. Perhaps it might be a good idea to let
> unpopulated GFNs be allocated dynamically from FdoAllocateHole, possibly
> reviving the original purpose of range_set in the process. Or, at
> minimum, call GrantTableQuerySize early and take the MaximumFrameCount
> into account when allocating the initial "unpopulated" GFN range.
>
>> (which will always fall into a cached MTRR),
>
> Isn't MmAllocateContiguousNodeMemory expected to either return memory
> with correct cacheability or fail completely? In the absence of
> PAGE_NOCACHE or PAGE_WRITECOMBINE flags, it makes sense for the caller
> to safely assume the allocated memory is WB-cached.

I'd assume that is the case, hence we now allocate memory that way and
then decrease_reservation it out, to ensure we have a hole in a cached
region.

> I suppose the "fail completely" case could be alleviated via dynamic
> allocation.

Yes, we could conceivably grab memory a page at a time. Perhaps that would
be the best way to go. We do take the hit of potentially shattering
superpage mappings if we don't grab in 2M chunks though.

>> use a decrease_reservation hypercall to de-populate the area,
>
> An alternative method would be to copy the unpopulated-alloc facility in
> Linux, merged into mainline fairly recently (5.9), which avoids being
> entangled with ballooning entirely. An obvious approach would be to have
> hotplug PDOs convince the NT PnP manager to hand us cacheable memory
> resources. Implementing it sounds pretty complicated, though.

Yep, I've wanted to sort out hotplug memory for a long time and that may
well offer a way to get hold of suitable ranges.

  Paul

>> and then use that as a source of GFNs instead of the BAR. Hence, the
>> work-around in Xen no longer has any bearing on accesses to the Xen data
>> structures, and thus there is no longer any performance penalty when
>> hardware is passed through to a guest.
>>
>> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
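For illustration, here is a minimal sketch of the mechanism the patch
describes: allocate ordinary WB-cached RAM with
MmAllocateContiguousNodeMemory (no PAGE_NOCACHE/PAGE_WRITECOMBINE), then
hand the backing pages back to Xen via a XENMEM_decrease_reservation
hypercall, leaving a cached, unpopulated GFN range to host the shared info
page and grant table frames. This is not the actual XENBUS code:
FdoCreateUnpopulatedHole and MemoryDecreaseReservation are hypothetical
names, and the real driver's hypercall wrappers differ.

#include <ntddk.h>

/* Hypothetical wrapper around the XENMEM_decrease_reservation hypercall;
 * the real XENBUS interface differs. */
extern NTSTATUS MemoryDecreaseReservation(ULONG Order, ULONG Count,
                                          PPFN_NUMBER PfnArray);

#define XEN_HOLE_SIZE   (2ull << 20)                /* 2MB, as in the patch */
#define XEN_HOLE_PAGES  (XEN_HOLE_SIZE / PAGE_SIZE) /* 512 pages on x64 */

static NTSTATUS
FdoCreateUnpopulatedHole(
    OUT PVOID       *VirtualAddress,
    OUT PPFN_NUMBER PfnArray        /* caller provides XEN_HOLE_PAGES entries */
    )
{
    PHYSICAL_ADDRESS    LowAddress;
    PHYSICAL_ADDRESS    HighAddress;
    PHYSICAL_ADDRESS    Boundary;
    PVOID               Buffer;
    ULONG               Index;
    NTSTATUS            status;

    LowAddress.QuadPart = 0;
    HighAddress.QuadPart = -1;
    Boundary.QuadPart = 0;

    /* No PAGE_NOCACHE or PAGE_WRITECOMBINE: the allocation is WB-cached,
     * so the resulting GFN range sits inside a cached MTRR, unlike the
     * PCI BAR space previously used. */
    Buffer = MmAllocateContiguousNodeMemory(XEN_HOLE_SIZE,
                                            LowAddress,
                                            HighAddress,
                                            Boundary,
                                            PAGE_READWRITE,
                                            MM_ANY_NODE_OK);
    if (Buffer == NULL)
        return STATUS_INSUFFICIENT_RESOURCES;

    for (Index = 0; Index < XEN_HOLE_PAGES; Index++) {
        PHYSICAL_ADDRESS Pa;

        Pa = MmGetPhysicalAddress((PUCHAR)Buffer + Index * PAGE_SIZE);
        PfnArray[Index] = (PFN_NUMBER)(Pa.QuadPart >> PAGE_SHIFT);
    }

    /* Give the backing RAM back to Xen. The GFNs remain part of the guest
     * physical map (and keep their cache attributes) but are now
     * unpopulated, so Xen data structures can be mapped into them. */
    status = MemoryDecreaseReservation(0, (ULONG)XEN_HOLE_PAGES, PfnArray);
    if (!NT_SUCCESS(status)) {
        MmFreeContiguousMemory(Buffer);
        return status;
    }

    *VirtualAddress = Buffer;
    return STATUS_SUCCESS;
}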
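On the sizing point raised above, a sketch of what "take the
MaximumFrameCount into account" could look like follows. GrantTableQuerySize
is assumed to wrap the GNTTABOP_query_size operation; its exact name and
signature in the driver are assumptions, not the real interface.

#include <ntddk.h>

/* Assumed wrapper around GNTTABOP_query_size; the actual XENBUS
 * interface may differ. */
extern NTSTATUS GrantTableQuerySize(PULONG CurrentFrameCount,
                                    PULONG MaximumFrameCount);

static NTSTATUS
FdoQueryHolePageCount(
    OUT PULONG  HolePageCount
    )
{
    ULONG       CurrentFrameCount;
    ULONG       MaximumFrameCount;
    NTSTATUS    status;

    status = GrantTableQuerySize(&CurrentFrameCount, &MaximumFrameCount);
    if (!NT_SUCCESS(status))
        return status;

    /* Size the unpopulated GFN range for the largest grant table the
     * toolstack allows, plus one page for the shared info page, instead
     * of hard-coding 2MB (512 pages). This avoids the
     * STATUS_INSUFFICIENT_RESOURCES failure seen when the configured
     * grant table exceeds 2MB. */
    *HolePageCount = MaximumFrameCount + 1;

    return STATUS_SUCCESS;
}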