[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.14] x86/tlb: fix assisted flush usage


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 22 Jun 2020 13:07:10 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=4d7R8yFQ34wsCJ6WFbtDARl8fhNNb4F/O/edTEMV+0o=; b=il7FGHBnFoK1+jDosD/+Kg1fGqnNe8/XScJo5ALWqwCbUc+58fDboaBCPHA/3SZDzi7rUr9qTlETIkO6IVPqMlODScnM7WAaNRQVwW657TDmR7H56GaA5yh5mLD4nQwQwzUIgfN+hCKDnMwTqKUF+5/Ww4stxVhLsC3DpFSSN0kNjPvCrlpgPjK3JG9O7IFbXxXbWoJcNhn9A3neo04kkDjNKhEYtWSs65tpiCFJ8rkNEar4UcSDaQocMWdCcjqPrQxJjHv4KljlbfcncN12rVGvoHxFv83i+6OuOpCR9RxiSFwvbug13PzQoECw/gUOoNoxw1fnUFiVVc1kH/M6MA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=SYWfCSMnoa319nsrG0OFxyIoyGnc3i8ydR25U3YVa1l3ADoeWrkykCgSK6aKy2HPMzTnhY9NUgtiC3niZytYa2ULUtgsKRXXL7bYVClMetDanGbcm2ysrlmL6mJpcwFGdzvnawdQV5pG9ABu07nFqiPEktqPsq//FlzPHjqjWduaPXJ7foEqtg3Vb1v3qlMu2hEQPM17KA40mUQnMiBFj3WYKKZ1NmuKJMSi4Tal75PGKgfcru38EGPjFfxa+mMBKssGWAqOxupYk1KW99ZQAA2P11NFg2uS3BFY2sE0gorEyRsp1PZXV8HhBGwuM66WKieD53IGr4gR4yEerHAfBA==
  • Authentication-results: epam.com; dkim=none (message not signed) header.d=none;epam.com; dmarc=none action=none header.from=suse.com;
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Wei Liu <wl@xxxxxxx>, paul@xxxxxxx, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Mon, 22 Jun 2020 11:07:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 22.06.2020 11:31, Roger Pau Monné wrote:
> On Fri, Jun 19, 2020 at 04:06:55PM +0200, Jan Beulich wrote:
>> On 18.06.2020 18:04, Roger Pau Monne wrote:
>>> Commit e9aca9470ed86 introduced a regression when avoiding sending
>>> IPIs for certain flush operations. Xen page fault handler
>>> (spurious_page_fault) relies on blocking interrupts in order to
>>> prevent handling TLB flush IPIs and thus preventing other CPUs from
>>> removing page tables pages. Switching to assisted flushing avoided such
>>> IPIs, and thus can result in pages belonging to the page tables being
>>> removed (and possibly re-used) while __page_fault_type is being
>>> executed.
>>>
>>> Force some of the TLB flushes to use IPIs, thus avoiding the assisted
>>> TLB flush. Those selected flushes are the page type change (when
>>> switching from a page table type to a different one, ie: a page that
>>> has been removed as a page table) and page allocation. This sadly has
>>> a negative performance impact on the pvshim, as less assisted flushes
>>> can be used.
>>>
>>> Introduce a new flag (FLUSH_FORCE_IPI) and helper to force a TLB flush
>>> using an IPI (flush_tlb_mask_sync). Note that the flag is only
>>> meaningfully defined when the hypervisor supports PV mode, as
>>> otherwise translated domains are in charge of their page tables and
>>> won't share page tables with Xen, thus not influencing the result of
>>> page walks performed by the spurious fault handler.
>>
>> Is this true for shadow mode? If a page shadowing a guest one was
>> given back quickly enough to the allocator and then re-used, I think
>> the same situation could in principle arise.
> 
> Hm, I think it's not applicable to HVM shadow mode at least, because
> CR3 is switched as part of vmentry/vmexit, and the page tables are not
> shared between Xen and the guest, so there's no way for a HVM shadow
> guest to modify the page-tables while Xen is walking them in
> spurious_page_fault (note spurious_page_fault is only called when the
> fault happens in non-guest context).

I'm afraid I disagree, because of shadow's use of "linear page tables".

>>> Note the flag is not defined on Arm, and the introduced helper is just
>>> a dummy alias to the existing flush_tlb_mask.
>>>
>>> Fixes: e9aca9470ed86 ('x86/tlb: use Xen L0 assisted TLB flush when 
>>> available')
>>> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> ---
>>> It's my understanding that not doing such IPI flushes could lead to
>>> the pages tables being read by __page_fault_type being modified by a
>>> third party, which could make them point to other mfns out of the
>>> scope of the guest allowed physical memory addresses. However those
>>> accesses would be limited to __page_fault_type, and hence the main
>>> worry would be that a guest could make it point to read from a
>>> physical memory region that has side effects?
>>
>> I don't think so, no - the memory could be changed such that the
>> PTEs are invalid altogether (like having reserved bits set). Consider
>> for example the case of reading an MFN out of such a PTE that's larger
>> than the physical address width supported by the CPU. Afaict
>> map_domain_page() will happily install a respective page table entry,
>> but we'd get a reserved-bit #PF upon reading from that mapping.
> 
> So there are no hazards from executing __page_fault_type against a
> page-table that could be modified by a user?

There are - I realize only now that the way I started my earlier
reply was ambiguous. I meant to negate your "only with side effects"
way of thinking.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.