Xen project Mailing List

Re: [PATCH 10/16] x86/shadow: move OOS functions to their own file

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Thu, 23 Mar 2023 16:40:50 +0100

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Q1swhVX/o47uENkgSIFpTWemaVHBNgjE5l9fqR0FAe8=; b=gQyp3JHTYuWfWnJsS4y2lQzAgp+M76YgSJowiJQlrvGtz52paZ5FCJnl5g0diXvkUUlpWiZwf3kgGvzILVvBHZFX/u7ak6KKEcSzUvrJ2WQgEW/yqvy1AjJJpXgCmXGAiUQJ/O8JhCwUmiufMr9S6WH+Ncdeo64Wr5yBDwRAf+1i1hJgmqaBkPeBqkVqKrsRNQCQaoWp7XTeHj+aQhqe/dRtC5oRtRcNwpT5J878XFTEnTHsAYyfu5KpEX9v5C4kYfy0qK+yt1HqQgwxRNskAl4b18O2xEa2mezvstKEMzAzUfj1P/QSK5wu5egDlrrrOMEgnFy6TQEQ4wT3mIWrTw==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=c/IFomENoZjxhMgCe/QW2YEKOagjyP5V6NRguyou9Isy2HTT4bco7yWFDkVcquqQUzXnQWlqnwx/8rUscF6ckCZpV9iVEi9xLnpo9c5ZRzwGLsJFVZmb1fS5dmG2H6GTqYeCggeefV13hW4KWexZ9qVgDRLDsfAps7n/+nKur5cJ/FoIedT9VARNw/ZoBXqSvTNUGAcTDa8RMcNLzpfWBPtBH22lBAvx+Ze6TosZ42K7yHy5jzehf5tqVpK+3/wsjEQ7N5ISh2hlQGAqIx1Qy1IFJ5N/FbrqATNC10FiVMmLRHmMMsIbmTLfFBa6kBtuJbcTOhIbPp3vopbW6duHCg==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 23 Mar 2023 15:41:14 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.03.2023 15:30, Andrew Cooper wrote: > On 22/03/2023 9:35 am, Jan Beulich wrote: >> --- /dev/null >> +++ b/xen/arch/x86/mm/shadow/oos.c >> @@ -0,0 +1,603 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ > > GPL-2.0-only > > The unqualified form in deprecated. > >> +/****************************************************************************** >> + * arch/x86/mm/shadow/oos.c >> + * >> + * Shadow code dealing with out-of-sync shadows. >> + * Parts of this code are Copyright (c) 2006 by XenSource Inc. >> + * Parts of this code are Copyright (c) 2006 by Michael A Fetterman >> + * Parts based on earlier work by Michael A Fetterman, Ian Pratt et al. >> + */ >> + >> +#include "private.h" >> + >> +#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC) >> + >> +#include <xen/trace.h> >> + >> +#include <asm/shadow.h> >> + >> +/* >> + * From time to time, we let a shadowed pagetable page go out of sync >> + * with its shadow: the guest is allowed to write directly to the page, >> + * and those writes are not synchronously reflected in the shadow. >> + * This lets us avoid many emulations if the guest is writing a lot to a >> + * pagetable, but it relaxes a pretty important invariant in the shadow >> + * pagetable design. Therefore, some rules: >> + * >> + * 1. Only L1 pagetables may go out of sync: any page that is shadowed >> + * at at higher level must be synchronously updated. This makes >> + * using linear shadow pagetables much less dangerous. >> + * That means that: (a) unsyncing code needs to check for higher-level >> + * shadows, and (b) promotion code needs to resync. >> + * >> + * 2. All shadow operations on a guest page require the page to be brought >> + * back into sync before proceeding. This must be done under the >> + * paging lock so that the page is guaranteed to remain synced until >> + * the operation completes. >> + * >> + * Exceptions to this rule: the pagefault and invlpg handlers may >> + * update only one entry on an out-of-sync page without resyncing it. >> + * >> + * 3. Operations on shadows that do not start from a guest page need to >> + * be aware that they may be handling an out-of-sync shadow. >> + * >> + * 4. Operations that do not normally take the paging lock (fast-path >> + * #PF handler, INVLPG) must fall back to a locking, syncing version >> + * if they see an out-of-sync table. >> + * >> + * 5. Operations corresponding to guest TLB flushes (MOV CR3, INVLPG) >> + * must explicitly resync all relevant pages or update their >> + * shadows. >> + * >> + * Currently out-of-sync pages are listed in a simple open-addressed >> + * hash table with a second chance (must resist temptation to radically >> + * over-engineer hash tables...) The virtual address of the access >> + * which caused us to unsync the page is also kept in the hash table, as >> + * a hint for finding the writable mappings later. >> + * >> + * We keep a hash per vcpu, because we want as much as possible to do >> + * the re-sync on the save vcpu we did the unsync on, so the VA hint >> + * will be valid. >> + */ >> + >> +#if SHADOW_AUDIT & SHADOW_AUDIT_ENTRIES_FULL >> +void sh_oos_audit(struct domain *d) >> +{ >> + unsigned int idx, expected_idx, expected_idx_alt; >> + struct page_info *pg; >> + struct vcpu *v; >> + >> + for_each_vcpu(d, v) >> + { >> + for ( idx = 0; idx < SHADOW_OOS_PAGES; idx++ ) >> + { >> + mfn_t *oos = v->arch.paging.shadow.oos; > > Newline. I'm happy to add the newlines you're asking for (also below). But ... > But the variable placement is weird. oos ought to be one scope further > out to prevent recalculation in the for() loop, while pg and the two > expected could be at the inter-most scope. ... I don't want to go farther than that - see "but leave the code otherwise untouched" in the description. This also goes for several of your requests further down. >> +#if SHADOW_AUDIT & SHADOW_AUDIT_ENTRIES >> +void oos_audit_hash_is_present(struct domain *d, mfn_t gmfn) >> +{ >> + int idx; >> + struct vcpu *v; >> + mfn_t *oos; >> + >> + ASSERT(mfn_is_out_of_sync(gmfn)); >> + >> + for_each_vcpu(d, v) >> + { >> + oos = v->arch.paging.shadow.oos; >> + idx = mfn_x(gmfn) % SHADOW_OOS_PAGES; > > Same for oos and idx here, which would shrink this function overall. > > As a tangent, do we really want all these modulo 3's all over the > place? It's a lot of divide instructions in paths that are fast-ish for > shadow guests. I don't think the compiler translates division by constant to DIV / IDIV. It's multiplication by suitable "inverse" and then using the top bits of the result iirc. >> + if ( !mfn_eq(oos[idx], gmfn) ) >> + idx = (idx + 1) % SHADOW_OOS_PAGES; >> + >> + if ( mfn_eq(oos[idx], gmfn) ) >> + return; >> + } >> + >> + printk(XENLOG_ERR "gmfn %"PRI_mfn" marked OOS but not in hash table\n", >> + mfn_x(gmfn)); >> + BUG(); >> +} >> +#endif >> + >> +/* Update the shadow, but keep the page out of sync. */ >> +static inline void _sh_resync_l1(struct vcpu *v, mfn_t gmfn, mfn_t snpmfn) > > inline can go. I'm feeling on the edge with "inline". I'd prefer to leave them to keep "untouched" reasonable true, but if you insist I'd be willing to include their dropping. >> +void oos_fixup_add(struct domain *d, mfn_t gmfn, >> + mfn_t smfn, unsigned long off) >> +{ >> + int idx, next; >> + mfn_t *oos; >> + struct oos_fixup *oos_fixup; >> + struct vcpu *v; >> + >> + perfc_incr(shadow_oos_fixup_add); >> + >> + for_each_vcpu(d, v) >> + { >> + oos = v->arch.paging.shadow.oos; >> + oos_fixup = v->arch.paging.shadow.oos_fixup; >> + idx = mfn_x(gmfn) % SHADOW_OOS_PAGES; >> + if ( !mfn_eq(oos[idx], gmfn) ) >> + idx = (idx + 1) % SHADOW_OOS_PAGES; >> + if ( mfn_eq(oos[idx], gmfn) ) >> + { >> + int i; >> + for ( i = 0; i < SHADOW_OOS_FIXUPS; i++ ) > > This is a case where "for ( int i = ..." would definitely read better. > Luckily, this example is simple enough that the compiler has already > optimised properly. > >> + { >> + if ( mfn_eq(oos_fixup[idx].smfn[i], smfn) >> + && (oos_fixup[idx].off[i] == off) ) > > Given that you mention style in the commit message, it would be nice to > move the && onto the previous line. Sure, done (and there was a 2nd instance). Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.