
Re: [PATCH] x86/HVM: refine when to send mapcache invalidation request to qemu



On 28.09.2020 13:25, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: 28 September 2020 11:44
>>
>> For one it was wrong to send the request only upon a completed
>> hypercall: Even if only part of it completed before getting preempted,
>> invalidation ought to occur. Therefore fold the two return statements.
>>
>> And then XENMEM_decrease_reservation isn't the only means by which pages
>> can get removed from a guest, yet all removals ought to be signaled to
>> qemu. Put setting of the flag into the central p2m_remove_page()
>> underlying all respective hypercalls.
> 
> Actually, does this catch all the cases? Would it be better to push the
> check into p2m_set_entry() where we displace a ram entry? (I think it
> would be possible, for instance, for map_mmio_regions() to directly
> displace ram).

I did consider this, but was wary of going too far and perhaps adding
many false positives. For the specific example of map_mmio_regions()
I'd expect it is the device model itself which calls it, and hence it
shouldn't need to be told. But yes, if there are multiple emulators,
ones other than the one doing the change may need to know.
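
For reference, "putting it into p2m_remove_page()" amounts to roughly
the below. This is a sketch only - the function's exact signature and
whether the flag remains per-domain are not to be read from it:

    /* Tail of p2m_remove_page() (parameters elided); sketch, not the
     * actual patch: flag the need for a mapcache invalidation whenever
     * entries were successfully removed from an HVM guest's p2m. */
    if ( !rc && is_hvm_domain(p2m->domain) )
        p2m->domain->arch.hvm.qemu_mapcache_invalidate = true;

    return rc;

That funnels all removal paths through one place, instead of covering
only the XENMEM_decrease_reservation one in hvm_memory_op().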

But perhaps PoD is in even more immediate need of this, when
transitioning a GFN from RAM to PoD, especially outside of the
context of p2m_pod_decrease_reservation()? In this case things
happen behind the guest's back, i.e. in particular outside of a
hypercall, and hence there would again be the problem of where to
properly send the request from. IOW I'm struggling to ascertain whether
putting another instance of the sending onto the tail of
hvm_hap_nested_page_fault() would work correctly, or what the
conditions are for this to be done without interfering with other
requests which may be pending.

>> Plus finally there's no point sending the request for the local domain
>> when the domain acted upon is a different one. If anything that domain's
>> qemu's mapcache may need invalidating, but it's unclear how useful this
>> would be: That remote domain may not execute hypercalls at all, and
>> hence may never make it to the point where the request actually gets
>> issued. I guess the assumption is that such manipulation is not supposed
>> to happen anymore once the guest has been started?
> 
> Either that or the code was based on a false assumption that it was
> always occurring on the current domain.

When the code was added, I think this assumption was reasonable.
There was not even a remote idea at that point of an HVM(-like)
domain managing other domains. Updating this logic was simply missed
later on.

> I also wonder whether the test-and-clear in hvm_hypercall() is really
> the right thing to do. E.g. if two vcpus decrease reservation at the
> same time, only one would end up waiting on the send_invalidate_req().

Hmm, yes, the flag wants to be per vCPU, which will make remote
manipulation (if such were permitted to occur) even more awkward.
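
To illustrate, with a per-vCPU flag the tail of hvm_hypercall() might
then end up along these lines (a sketch, assuming a hypothetical
"mapcache_invalidate" field in struct vcpu; name and placement are for
illustration only), also reflecting the folding of the two return
statements mentioned in the description:

    /* Sketch only: "mapcache_invalidate" as a per-vCPU field is an
     * assumption made for illustration. The request is sent regardless
     * of whether the hypercall was preempted. */
    if ( unlikely(curr->mapcache_invalidate) &&
         test_and_clear_bool(curr->mapcache_invalidate) )
        send_invalidate_req();

    return curr->hcall_preempted ? HVM_HCALL_preempted
                                 : HVM_HCALL_completed;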

> Also, the pages will have been freed back to the heap before the
> invalidation is sent, so I guess it is possible for a domain to use
> its qemu process to attack pages that may have been re-assigned?

Not for a PV domain - there, page refs will be held by the mappings
in the dm domain(s). For PVH (which isn't supported yet as anything
other than an ordinary DomU) this would indeed be a problem, as there
is no similar ref-counting when inserting into / removing from the
p2m.

Jan
