
RE: [PATCH] xen/mm: avoid watchdog timeout in dump_numa() on large domains


  • To: Andrew Cooper <andrew.cooper@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • From: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxxx>
  • Date: Tue, 2 Jun 2026 14:09:34 +0000
  • Accept-language: en-GB, en-US
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>, Marcus Granado <Marcus.Granado@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 02 Jun 2026 14:09:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHc8m0cC2Q3ziRM6Ee1Tf3SMKSJo7YrJ16AgAAZegCAAAgNkA==
  • Thread-topic: [PATCH] xen/mm: avoid watchdog timeout in dump_numa() on large domains

> >> Replace the page-list walk with node_tot_pages[], a per-node counter
> >> maintained in struct domain. This reduces dump_numa()'s per-domain work
> >> from O(pages) to O(nodes).
>
> > Alternative approach for consideration: Purge dump_numa()? This big a
> > change for making a keyhandler work better is somewhat questionable an
> > approach, imo. The keyhandler isn't there for use in production anyway,
> > it's (primarily) a debugging aid. If the data is still needed (and may
> > e.g. be useful on production systems), make a (preemptible) domctl or
> > sysctl or alike instead?
> 
> Introducing d->node_tot_pages[] is a prerequisite for per-node claims.

That isn't actually the case.

It is only needed if we want to subtract d->node_tot_pages[node] from the
claims, and that isn't useful in my opinion.

With multi-node claims (as requested by Jan and Roger), it becomes really
awkward to subtract d->node_tot_pages[node] from the requested claims to
get the claims to install.

1) That's unnecessary complexity:

   The purpose of claims is to allow domain builders to claim
   memory before populating the physmap of a domain.

   a. A domain builder knows how much memory it needs to populate.

      It is natural to just claim this amount of memory for the physmap.

      It is pointless for a domain builder to populate the physmap
      without claims and then stake a claim for it.

      It is very awkward to have to call a not yet upstream hypercall to
      get d->node_tot_pages[] for each node and then run this loop:

      for_each_physmap_node( node )
          claim_request[node] = physmapsize[node] + node_tot_pages[node]

      only for Xen to subtract it again in the claims domctl:

      for_each_requested_claim_node( node )
          d->claim[node] = domctl.requested_claim[node] - node_tot_pages[node]
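
      To make the round trip concrete, here is a minimal C sketch of both
      variants. All names in it (NR_NODES, claim_direct(),
      builder_make_request(), xen_install_claims()) are hypothetical and
      exist only for this illustration; none of them is an existing Xen or
      libxc interface, and the array sizes are made up.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 2 /* hypothetical node count, for illustration only */

/* Variant A: the builder claims exactly what it still has to populate. */
static void claim_direct(const unsigned long physmap_size[NR_NODES],
                         unsigned long claim[NR_NODES])
{
    for (size_t node = 0; node < NR_NODES; node++)
        claim[node] = physmap_size[node];
}

/*
 * Variant B, builder side: add the pages the domain already owns on each
 * node (which the builder would first have to fetch from the hypervisor).
 */
static void builder_make_request(const unsigned long physmap_size[NR_NODES],
                                 const unsigned long node_tot_pages[NR_NODES],
                                 unsigned long request[NR_NODES])
{
    for (size_t node = 0; node < NR_NODES; node++)
        request[node] = physmap_size[node] + node_tot_pages[node];
}

/* Variant B, hypervisor side: subtract the same per-node count again. */
static void xen_install_claims(const unsigned long request[NR_NODES],
                               const unsigned long node_tot_pages[NR_NODES],
                               unsigned long claim[NR_NODES])
{
    for (size_t node = 0; node < NR_NODES; node++) {
        /* A request below the already-owned count would underflow. */
        assert(request[node] >= node_tot_pages[node]);
        claim[node] = request[node] - node_tot_pages[node];
    }
}
```

      As long as node_tot_pages[] is the same on both sides, variant B
      installs exactly the claim that variant A requests directly; the
      addition and the subtraction cancel out.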

So no, I'm not going to do this insanity.

> Tidying up dump_numa() is just a useful side effect.

Ack. The performance teams of customers were forced to rely on dump_numa()
to get diagnostic information about the NUMA memory distribution of domains.
Removing it would break their tooling (as unsupported as that could be), so
from a Xen user perspective it would be undesirable to just yank it without
a sufficient deprecation period.

Bernhard
