
Re: [PATCH 2/2] xen/mm: do not assign pages to a domain until they are scrubbed


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 25 Mar 2026 17:26:01 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 25 Mar 2026 16:26:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Mar 25, 2026 at 03:56:05PM +0100, Jan Beulich wrote:
> On 25.03.2026 11:08, Roger Pau Monne wrote:
> > ---
> > I've attempted various different ways to solve this, but they all ended up
> > being impossible.
> > 
> >  * Prevent non-scrubbed pages from getting extra refcounts (iow: make
> >    get_page() fail for them).  This seemed nice, but the cleanup using
> >    put_page_alloc_ref() was impossible as non-scrubbed pages would return
> >    failure in get_page(), and so I couldn't take the extra reference ahead
> >    of calling put_page_alloc_ref().
> 
> A special-case variant of get_page() could be introduced, but maybe that
> would still be overly fragile.

It seemed like too much complexity (and risk) just to deal with this
scenario.

> When we discussed this, what I had proposed didn't require use of get_page()
> though. assign_pages() would install two general references (plus one type
> ref for PGT_writable) in this special case. To free, you'd call
> put_page_alloc_ref() followed by put_page_and_type().

Doesn't that risk underflowing the page counter if there's a parallel
call to decrease_reservation() against this MFN beforehand?

How would the freeing done in populate_physmap() (in case of
concurrent calls) know whether already scrubbed pages have had their
PGC_allocated bit dropped?

> That said, the patch here is still less intrusive than I feared, so I'm not
> asking to re-work this again.

Thanks.

> >  * Disallow XENMEM_decrease_reservation until the domain has finished
> >    creation would fix the issue of pages being freed while pending scrub,
> >    but it's not clear there might be other usages that would be problematic,
> >    as get_page() on non-scrubbed pages would still return success.
> 
> I agree this is of concern.
> 
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -388,6 +388,12 @@ static void populate_physmap(struct memop_args *a)
> >                              goto out;
> >                          }
> >                      }
> > +
> > +                    if ( assign_page(page, a->extent_order, d, memflags) )
> > +                    {
> > +                        free_domheap_pages(page, a->extent_order);
> 
> The pages don't have an owner set yet, so that function will go straight
> to free_heap_pages(), needlessly passing "true" as last argument. Correct,
> but (for large pages, which the stashing is about) highly inefficient.

My bad, I was sure I was using the same freeing function as
alloc_domheap_pages() on failure to assign, but I clearly wasn't.  I
will switch to using free_heap_pages().

> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -2699,7 +2699,13 @@ struct page_info *alloc_domheap_pages(
> >                                    memflags, d)) == NULL)) )
> >           return NULL;
> >  
> > -    if ( d && !(memflags & MEMF_no_owner) )
> > +    /*
> > +     * Don't add pages with the PGC_need_scrub bit set to the domain, the
> > +     * caller must clean the bit and then manually call assign_pages().
> > +     * Otherwise pages with the PGC_need_scrub would be reachable using
> > +     * get_page().
> > +     */
> 
> How about replacing the latter "with the PGC_need_scrub" by "still subject
> to scrubbing"?

Sure.

> > +    if ( d && !(memflags & MEMF_no_owner) && !(memflags & MEMF_keep_scrub) )
> >      {
> >          if ( memflags & MEMF_no_refcount )
> >          {
> 
> This no-refcount code isn't repeated at the new call site of assign_page().
> It's not needed there, yes, but wouldn't we better allow this to be taken
> care of right here, moving the MEMF_keep_scrub check immediately ahead of
> the call to assign_page()?
> 
> Otherwise should we reject (much earlier) MEMF_no_refcount used together
> with MEMF_keep_scrub?

hm, I was likely too focused on the specific use-case of
populate_physmap(), which is the only user of MEMF_keep_scrub.  I'd
noticed the MEMF_no_refcount bits here, but since populate_physmap()
never uses that flag I just left them as-is.

Will see about adjusting it.

Thanks, Roger.



 

