[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/pv: Flush TLB in response to paging structure changes
On 21/10/2020 07:55, Jan Beulich wrote: > On 20.10.2020 20:46, Andrew Cooper wrote: >> On 20/10/2020 18:10, Andrew Cooper wrote: >>> On 20/10/2020 17:20, Andrew Cooper wrote: >>>> On 20/10/2020 16:48, Jan Beulich wrote: >>>>> On 20.10.2020 17:24, Andrew Cooper wrote: >>>>>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. >>>>>> This >>>>>> is from Xen's point of view (as the update only affects guest mappings), >>>>>> and >>>>>> the guest is required to flush suitably after making updates. >>>>>> >>>>>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>>>>> writeable pagetables, etc.) is an implementation detail outside of the >>>>>> API/ABI. >>>>>> >>>>>> Changes in the paging structure require invalidations in the linear >>>>>> pagetable >>>>>> range for subsequent accesses into the linear pagetables to access >>>>>> non-stale >>>>>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>>>>> actions from accidentally accessing/modifying the wrong pagetable. >>>>>> >>>>>> For all L2 and higher modifications, flush the full TLB. (This could in >>>>>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no >>>>>> such >>>>>> mechanism exists in practice.) >>>>>> >>>>>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>>>>> sync_guest boolean with flush_flags and accumulate flags. The >>>>>> sync_guest case >>>>>> now always needs to flush, there is no point trying to exclude the >>>>>> current CPU >>>>>> from the flush mask. Use pt_owner->dirty_cpumask directly. >>>>> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >>>>> part of the flushing on the local CPU. The draft you had sent >>>>> earlier looked better in this regard. >>>> This was the area which broke. It is to do with subtle difference in >>>> the scope of L4 updates. >>>> >>>> ROOT_PGTBL needs to resync current (if in use), and be broadcasted if >>>> other references to the pages are found. >>>> >>>> The TLB flush needs to be broadcast to the whole domain dirty mask, as >>>> we can't (easily) know if the update was part of the current structure. >>> Actually - we can know whether an L4 update needs flushing locally or >>> not, in exactly the same way as the sync logic currently works. >>> >>> However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, >>> we can't just flush locally for free. >>> >>> This is quite awkward to express. >> And not safe. Flushes may accumulate from multiple levels in a batch, >> and pt_owner may not be equal to current. > I'm not questioning the TLB flush - this needs to be the scope you > use (but just FLUSH_TLB as per my earlier reply). I'm questioning > the extra ROOT_PGTBL sync (meaning changes to levels other than L4 > don't matter), which is redundant with the explicit setting right > after the call to mod_l4_entry(). But I guess since now you need > to issue _some_ flush_mask() for the local CPU anyway, perhaps > it's rather the explicit setting of ->root_pgt_changed which wants > dropping? No. That was the delta which delayed posting in the first place. Dom0 crashes very early without it. The problem, as I said, is the asymmetry. As dom0 is booting, the "only one use of this L4" logic kicks in, and skips setting ROOT_PGTBL, which then causes the flush_mask() not to flush on the local CPU either. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |