
[PATCH v1 0/3] Lockless SMP function call and TLB flushing


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
  • Date: Wed, 1 Apr 2026 17:35:18 +0100
  • Cc: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • Delivery-date: Wed, 01 Apr 2026 16:36:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

This series implements lockless SMP function calls and then rewrites x86 TLB
flushing to use them.

We have observed that the TLB flush lock can be a point of contention for
certain workloads, e.g. migrating 10 VMs off a host during a host evacuation.

Performance numbers:

I wrote a synthetic benchmark to measure the performance. The benchmark has one
or more CPUs in Xen calling on_selected_cpus() with between 1 and 64 CPUs in
the selected mask. The executed function simply delays for 500 microseconds.
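
For illustration only, a minimal user-space sketch of this kind of timing
measurement. It is not the benchmark code used for the numbers in this mail:
call_stub() is a hypothetical single-CPU stand-in for on_selected_cpus(), and
now_ns(), delay_fn() and mean_call_ns() are made-up helper names. It only
shows the shape of "time a call whose function delays for 500 microseconds".

```c
#include <stdint.h>
#include <time.h>

#define DELAY_NS 500000LL /* 500 microseconds, as in the description above */

/* Wall-clock time in nanoseconds (C11 timespec_get(); not monotonic). */
static int64_t now_ns(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* The executed function: busy-wait until DELAY_NS has elapsed. */
static void delay_fn(void *info)
{
    int64_t start = now_ns();

    (void)info;
    while ( now_ns() - start < DELAY_NS )
        ;
}

/*
 * Hypothetical stand-in for on_selected_cpus(): runs fn once on the
 * calling thread. No IPIs and no CPU mask are modelled here.
 */
static void call_stub(void (*fn)(void *), void *info)
{
    fn(info);
}

/* Mean execution time of one call, in nanoseconds, over iters calls. */
static int64_t mean_call_ns(unsigned int iters)
{
    int64_t start = now_ns();
    unsigned int i;

    for ( i = 0; i < iters; i++ )
        call_stub(delay_fn, NULL);

    return (now_ns() - start) / iters;
}
```

Calling mean_call_ns() with a small iteration count should return at least
DELAY_NS, since every call busy-waits for the full delay.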

The table below shows the % change in execution time of on_selected_cpus()
(negative values mean faster execution):

                  1 thread   2 threads    4 threads
1 CPU in mask     0.02       -35.23       -51.18
2 CPUs in mask    0.01       -47.20       -69.27
4 CPUs in mask    -0.02      -42.40       -66.55
8 CPUs in mask    -0.03      -47.82       -68.39
16 CPUs in mask   0.12       -41.95       -58.26
32 CPUs in mask   0.02       -25.43       -39.35
64 CPUs in mask   0.00       -24.70       -37.83

With 1 thread (i.e. no contention), there is no regression in execution time.
With multiple threads, as expected there is a significant improvement in
execution time.
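
As a sketch of how such a figure can be derived, assuming the percentages are
taken relative to the baseline (unpatched) timing. pct_change() is a
hypothetical helper name, not something from the series:

```c
/*
 * Relative change of 'after' versus 'before', in percent.
 * Negative results mean 'after' is faster than 'before'.
 */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, a made-up drop from 1000 to 500 microseconds gives
pct_change(1000.0, 500.0) == -50.0.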

As a more practical benchmark to simulate host evacuation, I measured the
memory dirtying rate across 10 VMs after enabling log dirty (on an AMD system,
so without PML). The rate increased by 16% with this patch series, even
after the recent deferred TLB flush changes.

FWIW, my first attempt at this was to port the SMP call functionality from
Linux. I found it didn't scale well as the number of CPUs in the mask
increased, so I've taken a different approach here.

Thanks,
Ross

Ross Lagerwall (3):
  x86/hap: Wait for remote CPUs during TLB flush
  xen/smp: Rewrite on_selected_cpus() to be lockless
  x86/smp: Rewrite TLB flush using on_selected_cpus()

 tools/xentrace/xenalyze.c              |   2 -
 xen/arch/x86/include/asm/irq-vectors.h |   1 -
 xen/arch/x86/include/asm/irq.h         |   1 -
 xen/arch/x86/mm/hap/hap.c              |   2 +-
 xen/arch/x86/smp.c                     |  30 ++++----
 xen/arch/x86/smpboot.c                 |   1 -
 xen/common/smp.c                       | 101 ++++++++++++++++---------
 7 files changed, 80 insertions(+), 58 deletions(-)

-- 
2.53.0




 

