
Re: [PATCH] x86/HVM: refine when to send mapcache invalidation request to qemu



On 28.09.2020 13:25, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: 28 September 2020 11:44
>>
>> For one it was wrong to send the request only upon a completed
>> hypercall: Even if only part of it completed before getting preempted,
>> invalidation ought to occur. Therefore fold the two return statements.
>>
>> And then XENMEM_decrease_reservation isn't the only means by which pages
>> can get removed from a guest, yet all removals ought to be signaled to
>> qemu. Put setting of the flag into the central p2m_remove_page()
>> underlying all respective hypercalls.
> 
> Actually, does this catch all the cases? Would it be better to push the
> check into p2m_set_entry() where we displace a ram entry? (I think it
> would be possible, for instance, for map_mmio_regions() to directly
> displace ram).

I did consider this, but was wary of going too far and perhaps adding
many false positives. For the specific example of map_mmio_regions()
I'd expect it is the device model itself which calls it, and hence it
shouldn't need to be told. But yes, if there are multiple emulators,
ones other than the one doing the change may need to know.
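
For reference, "putting it into p2m_remove_page()" amounts to roughly
the below. This is a sketch only - the function's exact signature and
whether the flag remains per-domain are not to be read from it:

    /* Tail of p2m_remove_page() (parameters elided); sketch, not the
     * actual patch: flag the need for a mapcache invalidation whenever
     * entries were successfully removed from an HVM guest's p2m. */
    if ( !rc && is_hvm_domain(p2m->domain) )
        p2m->domain->arch.hvm.qemu_mapcache_invalidate = true;

    return rc;

That funnels all removal paths through one place, instead of covering
only the XENMEM_decrease_reservation one in hvm_memory_op().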

But perhaps PoD is in even more immediate need of this, when
transitioning a GFN from RAM to PoD, especially outside of the
context of p2m_pod_decrease_reservation()? In this case things
happen behind the guest's back, i.e. in particular outside of a
hypercall, and hence there would again be the problem of where to
properly send the request from. IOW I'm struggling to ascertain whether
putting another instance of the sending onto the tail of
hvm_hap_nested_page_fault() would work correctly, or what the
conditions are for this to be done without interfering with other
requests which may be pending.

>> Plus finally there's no point sending the request for the local domain
>> when the domain acted upon is a different one. If anything that domain's
>> qemu's mapcache may need invalidating, but it's unclear how useful this
>> would be: That remote domain may not execute hypercalls at all, and
>> hence may never make it to the point where the request actually gets
>> issued. I guess the assumption is that such manipulation is not supposed
>> to happen anymore once the guest has been started?
> 
> Either that or the code was based on a false assumption that it was
> always occurring on the current domain.

When the code was added, I think this assumption was reasonable.
There was not even a remote idea at that point of an HVM(-like)
domain managing other domains. Updating this logic was simply missed
later on.

> I also wonder whether the test-and-clear in hvm_hypercall() is really
> the right thing to do. E.g. if two vcpus decrease reservation at the
> same time, only one would end up waiting on the send_invalidate_req().

Hmm, yes, the flag wants to be per vCPU, which will make remote
manipulation (if such were permitted to occur) even more awkward.
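
To illustrate, with a per-vCPU flag the tail of hvm_hypercall() might
then end up along these lines (a sketch, assuming a hypothetical
"mapcache_invalidate" field in struct vcpu; name and placement are for
illustration only), also reflecting the folding of the two return
statements mentioned in the description:

    /* Sketch only: "mapcache_invalidate" as a per-vCPU field is an
     * assumption made for illustration. The request is sent regardless
     * of whether the hypercall was preempted. */
    if ( unlikely(curr->mapcache_invalidate) &&
         test_and_clear_bool(curr->mapcache_invalidate) )
        send_invalidate_req();

    return curr->hcall_preempted ? HVM_HCALL_preempted
                                 : HVM_HCALL_completed;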

> Also, the pages will have been freed back to the heap before the
> invalidation is sent, so I guess it is possible for a domain to use
> its qemu process to attack pages that may have been re-assigned?

Not for a PV domain - there, page refs will be held by the mappings
in the dm domain(s). For PVH (which isn't supported yet as anything
other than an ordinary DomU) this would indeed be a problem, as there
is no similar ref-counting when inserting into / removing from the
p2m.

Jan
