
RE: [PATCH v2] xen/vcpu: ignore VCPU_SSHOTTMR_future


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Henry Wang <Henry.Wang@xxxxxxx>
  • Date: Wed, 19 Apr 2023 12:09:26 +0000
  • Cc: Community Manager <community.manager@xxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 19 Apr 2023 12:09:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH v2] xen/vcpu: ignore VCPU_SSHOTTMR_future

Hi Roger,

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Subject: [PATCH v2] xen/vcpu: ignore VCPU_SSHOTTMR_future
> 
> The usage of VCPU_SSHOTTMR_future in Linux prior to 4.7 is bogus.
> When the hypervisor returns -ETIME (timeout in the past) Linux keeps
> retrying to set up the timer with a higher timeout instead of
> self-injecting a timer interrupt.
> 
> On boxes without any hardware assistance for logdirty we have seen HVM
> Linux guests < 4.7 with 32 vCPUs give up trying to set up the timer when
> logdirty is enabled:
> 
> CE: Reprogramming failure. Giving up
> CE: xen increased min_delta_ns to 1000000 nsec
> CE: Reprogramming failure. Giving up
> CE: Reprogramming failure. Giving up
> CE: xen increased min_delta_ns to 506250 nsec
> CE: xen increased min_delta_ns to 759375 nsec
> CE: xen increased min_delta_ns to 1000000 nsec
> CE: Reprogramming failure. Giving up
> CE: Reprogramming failure. Giving up
> CE: Reprogramming failure. Giving up
> Freezing user space processes ...
> INFO: rcu_sched detected stalls on CPUs/tasks: { 14} (detected by 10, t=60002
> jiffies, g=4006, c=4005, q=14130)
> Task dump for CPU 14:
> swapper/14      R  running task        0     0      1 0x00000000
> Call Trace:
>  [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0
>  [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0
>  [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0
>  [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0
>  [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270
>  [<ffffffff900000d5>] ? start_cpu+0x5/0x14
> INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002
> jiffies, g=6922, c=6921, q=7013)
> Task dump for CPU 26:
> swapper/26      R  running task        0     0      1 0x00000000
> Call Trace:
>  [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0
>  [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0
>  [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0
>  [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0
>  [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270
>  [<ffffffff900000d5>] ? start_cpu+0x5/0x14
> INFO: rcu_sched detected stalls on CPUs/tasks: { 26} (detected by 24, t=60002
> jiffies, g=8499, c=8498, q=7664)
> Task dump for CPU 26:
> swapper/26      R  running task        0     0      1 0x00000000
> Call Trace:
>  [<ffffffff90160f5d>] ? rcu_eqs_enter_common.isra.30+0x3d/0xf0
>  [<ffffffff907b9bde>] ? default_idle+0x1e/0xd0
>  [<ffffffff90039570>] ? arch_cpu_idle+0x20/0xc0
>  [<ffffffff9010820a>] ? cpu_startup_entry+0x14a/0x1e0
>  [<ffffffff9005d3a7>] ? start_secondary+0x1f7/0x270
>  [<ffffffff900000d5>] ? start_cpu+0x5/0x14
> 
> This leads to CPU stalls and, as a result, a broken system.
> 
> Work around this bogus usage by ignoring the VCPU_SSHOTTMR_future flag
> in the hypervisor.  Old Linux versions are the only ones known to have
> (wrongly) attempted to use the flag, and ignoring it is compatible
> with the behavior expected by any guests setting that flag.
> 
> Note the usage of the flag has been removed from Linux by commit:
> 
> c06b6d70feb3 xen/x86: don't lose event interrupts
> 
> Which landed in Linux 4.7.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
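
For readers skimming the thread, the behavioural change described above
can be sketched roughly as follows. This is a simplified, stand-alone
model and not the actual hypervisor code: struct sshot_request and the
two sshot_set_*() helpers are hypothetical names made up for the
illustration, the flag value is assumed to match Xen's public vcpu.h,
and the error is modelled with ETIME from <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: mirrors the flag definition in Xen's public vcpu.h (bit 0). */
#define VCPU_SSHOTTMR_future (1U << 0)

/* Hypothetical, simplified stand-in for the guest's timer request. */
struct sshot_request {
    uint64_t timeout_abs_ns; /* absolute deadline, in nanoseconds */
    uint32_t flags;          /* VCPU_SSHOTTMR_* flags */
};

/*
 * Model of the old behaviour: with the flag set, a deadline that is
 * already in the past is rejected and nothing is armed.
 */
int sshot_set_strict(const struct sshot_request *req, uint64_t now_ns,
                     uint64_t *armed_ns)
{
    if ((req->flags & VCPU_SSHOTTMR_future) && req->timeout_abs_ns < now_ns)
        return -ETIME;

    *armed_ns = req->timeout_abs_ns;
    return 0;
}

/*
 * Model of the workaround: the flag is ignored and the timer is always
 * armed. A deadline in the past is then expected to expire right away,
 * so the guest gets a timer event instead of an error to retry on.
 */
int sshot_set_ignore_future(const struct sshot_request *req, uint64_t now_ns,
                            uint64_t *armed_ns)
{
    (void)now_ns; /* the current time no longer affects the outcome */

    *armed_ns = req->timeout_abs_ns;
    return 0;
}

/* Small self-check contrasting both models for a deadline in the past. */
void sshot_demo(void)
{
    struct sshot_request req = {
        .timeout_abs_ns = 100,
        .flags = VCPU_SSHOTTMR_future,
    };
    uint64_t armed = 0;

    assert(sshot_set_strict(&req, 200, &armed) == -ETIME);
    assert(armed == 0);   /* nothing armed, the guest has to retry */

    assert(sshot_set_ignore_future(&req, 200, &armed) == 0);
    assert(armed == 100); /* armed even though the deadline has passed */
}
```

In the strict model an affected guest sees the error and keeps raising
its minimum delta, which would match the "xen increased min_delta_ns"
messages in the log above; in the second model the request always
succeeds.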

Acked-by: Henry Wang <Henry.Wang@xxxxxxx> # CHANGELOG

Kind regards,
Henry

 

