Re: [Xen-devel] [PATCH v4 1/2] x86/mm: factor out the code for shattering an l3 PTE
On 13/12/2019 14:36, Jan Beulich wrote:
> On 13.12.2019 15:19, Andrew Cooper wrote:
>> On 12/12/2019 12:46, Hongyan Xia wrote:
>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>> index 7d4dd80a85..8def4fb8d8 100644
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5151,6 +5151,52 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
>>>          flush_area_local((const void *)v, f) :              \
>>>          flush_area_all((const void *)v, f))
>>>
>>> +/* Shatter an l3 entry and populate l2. If virt is passed in, also do flush. */
>>> +static bool shatter_l3e(l3_pgentry_t *pl3e, unsigned long virt, bool locking)
>>> +{
>>> +    unsigned int i;
>>> +    l3_pgentry_t ol3e = *pl3e;
>>> +    l2_pgentry_t l2e = l2e_from_intpte(l3e_get_intpte(ol3e));
>>> +    l2_pgentry_t *l2t = alloc_xen_pagetable();
>>> +
>>> +    if ( !l2t )
>>> +        return false;
>>> +
>>> +    for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
>>> +    {
>>> +        l2e_write(l2t + i, l2e);
>>> +        l2e = l2e_from_intpte(
>>> +                  l2e_get_intpte(l2e) + (PAGE_SIZE << PAGETABLE_ORDER));
>>> +    }
>>> +
>>> +    if ( locking )
>>> +        spin_lock(&map_pgdir_lock);
>>> +    if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
>>> +         (l3e_get_flags(*pl3e) & _PAGE_PSE) )
>>> +    {
>>> +        l3e_write_atomic(pl3e,
>>> +            l3e_from_paddr(virt_to_maddr(l2t), __PAGE_HYPERVISOR));
>>> +        l2t = NULL;
>>> +    }
>>> +    if ( locking )
>>> +        spin_unlock(&map_pgdir_lock);
>>> +
>>> +    if ( virt )
>>> +    {
>>> +        unsigned int flush_flags =
>>> +            FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);
>>> +
>>> +        if ( l3e_get_flags(ol3e) & _PAGE_GLOBAL )
>>> +            flush_flags |= FLUSH_TLB_GLOBAL;
>>
>> Another problematic use of ol3e which is racy on conflict.  You need to
>> strictly use the content of *pl3e from within the locked region.
>
> But this isn't a problem introduced here, i.e. fixing of it doesn't
> strictly fall under "re-factor".  (I'm certainly not opposed to
> getting this right at the same time.)

It is brand new code which is racy.  It's either not necessary, or an
XSA-in-waiting.  (And not necessary, AFAICT).

>
>> However, why have you moved the flushing in here?  Only one of the two
>> callers actually wanted it, and even then I'm not totally sure it is
>> necessary.
>>
>> Both callers operate on an arbitrary range of addresses, and for
>> anything other than a singleton update, will want to issue a single
>> flush at the end, rather than a spate of flushes for sub-areas.
>>
>> (Although someone really please check my reasoning here for the
>> map_pages_to_xen() case which currently does have sub-area flushing.)
>>
>> Either the flush wants dropping (and best via a prereq patch altering
>> map_pages_to_xen()), or you need to cache ol3e in the locked region with
>> ACCESS_ONCE() or equivalent.
>
> Well, at best replacing by a single one at the end, but I guess
> the current piecemeal behavior is to cope with error paths (see
> Julien's report against modify_xen_mappings(), where it's
> exactly the other way around).  Considering especially speculative
> accesses I think it isn't the worst idea to keep the window small
> between shatter and flush (short of us doing a proper
> break-then-make sequence).

Every sub-flush is a broadcast IPI, which is a scalability concern, and
at least needs considering.

x86 is designed not to need BBM, although the BBM sequence can be
helpful at times to simplify other reasoning.  It might actually be
necessary in some SMP cases.  Speculation can bite you at any point,
including the very next instruction.
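
[Editorial note: for reference, a minimal sketch of the fix being asked for
above: take a snapshot of the entry inside the locked region and derive both
the present/PSE check and the flush flags from that snapshot, instead of from
the ol3e read before the lock.  This is not the posted patch.  The cur_l3e
local, the flush_area() call (the local macro whose tail is visible in the
quoted context), and the freeing of an unconsumed l2t are assumptions made
purely for illustration.]

    l3_pgentry_t cur_l3e;   /* hypothetical local, declared with the others */

    /* ... first half unchanged: allocate l2t and populate it from *pl3e ... */

    if ( locking )
        spin_lock(&map_pgdir_lock);

    /*
     * Snapshot the entry inside the locked region, and use only this
     * snapshot for the present/PSE check and the later flush decision.
     */
    cur_l3e = l3e_from_intpte(ACCESS_ONCE(l3e_get_intpte(*pl3e)));

    if ( (l3e_get_flags(cur_l3e) & _PAGE_PRESENT) &&
         (l3e_get_flags(cur_l3e) & _PAGE_PSE) )
    {
        l3e_write_atomic(pl3e,
            l3e_from_paddr(virt_to_maddr(l2t), __PAGE_HYPERVISOR));
        l2t = NULL;
    }

    if ( locking )
        spin_unlock(&map_pgdir_lock);

    if ( l2t )
        free_xen_pagetable(l2t);   /* someone else shattered the entry first */

    if ( virt )
    {
        unsigned int flush_flags =
            FLUSH_TLB | FLUSH_ORDER(2 * PAGETABLE_ORDER);

        /*
         * Derived from the value read under the lock, not from the pre-lock
         * ol3e, so a concurrent writer cannot skew the flush flags.
         */
        if ( l3e_get_flags(cur_l3e) & _PAGE_GLOBAL )
            flush_flags |= FLUSH_TLB_GLOBAL;

        flush_area(virt, flush_flags);
    }

    return true;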
The logic is either correct, or not correct, and the distance between
the PTE update and the flush only affects how rarely the incorrect case
manifests in a noticeable way.

Fundamentally, we either need BBM, or the flush is safe to defer to the
end.  Everything in between is racy, and dropping the sub-flushes would
make any incorrect cases more likely to manifest.

~Andrew
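
[Editorial note: a purely illustrative sketch of the "shatter now, flush once
at the end" shape being argued for: the caller asks shatter_l3e() not to flush
(virt == 0 in the posted interface) and issues a single flush for the whole
range afterwards.  start, end and locking are hypothetical caller variables;
the conservative FLUSH_TLB_GLOBAL, the get_order_from_bytes() sizing and the
flush_area() call are assumptions; the error paths Jan mentions for
modify_xen_mappings() are deliberately glossed over.]

    unsigned long v;
    unsigned int flush_flags =
        FLUSH_TLB | FLUSH_ORDER(get_order_from_bytes(end - start));

    for ( v = start; v < end; v += 1UL << L3_PAGETABLE_SHIFT )
    {
        l3_pgentry_t *pl3e = virt_to_xen_l3e(v);

        if ( !pl3e ||
             !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
             !(l3e_get_flags(*pl3e) & _PAGE_PSE) )
            continue;                         /* nothing to shatter here */

        /* Conservative: a global mapping may have been shattered. */
        flush_flags |= FLUSH_TLB_GLOBAL;

        if ( !shatter_l3e(pl3e, 0, locking) ) /* virt == 0: no per-entry flush */
            return -ENOMEM;
    }

    /* One broadcast flush for the whole range instead of one per superpage. */
    flush_area(start, flush_flags);

    return 0;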