
Re: Lockdep show 6.6-rc regression in Xen HVM CPU hotplug


  • To: David Woodhouse <dwmw2@xxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, paulmck <paulmck@xxxxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Tue, 24 Oct 2023 12:41:05 +0200
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
  • Delivery-date: Tue, 24 Oct 2023 10:41:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 24.10.23 09:43, David Woodhouse wrote:
On Tue, 2023-10-24 at 08:53 +0200, Juergen Gross wrote:

I'm puzzled. This path doesn't contain any of the RCU usage I've added in
commit 87797fad6cce.

Are you sure that the issue no longer manifests when only commit
87797fad6cce is reverted? Judging from the messages above, I'd rather expect
commit 721255b9826b to have caused this behavior.

Retesting in the cold light of day, yes. Using v6.6-rc5 which is the
parent commit of the offending 87797fad6cce.

I now see this warning at boot time again, which I believe was an
aspect of what you were trying to fix:

[    0.059014] xen:events: Using FIFO-based ABI
[    0.059029] xen:events: Xen HVM callback vector for event delivery is enabled
[    0.059227] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.059296]
[    0.059297] =============================
[    0.059298] [ BUG: Invalid wait context ]
[    0.059299] 6.6.0-rc5 #1374 Not tainted
[    0.059300] -----------------------------
[    0.059301] swapper/0/0 is trying to lock:
[    0.059303] ffffffff8ad595f8 (evtchn_rwlock){....}-{3:3}, at: xen_evtchn_do_upcall+0x59/0xd0

Indeed.

What I still don't get is why the rcu_dereference_check() splat isn't
happening without my patch.

IMHO it should be related to the fact that cpuhp_report_idle_dead()
is trying to send an IPI via xen_send_IPI_one(), which is using
notify_remote_via_irq(), which in turn needs to call irq_get_chip_data().
This is using the maple-tree since 721255b9826b, which is using
rcu_read_lock().

I can probably change xen_send_IPI_one() to not need irq_get_chip_data().
But I'd like to understand why my patch causes the problem to surface
only now, instead of having been prominent since commit 721255b9826b.
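
To illustrate the idea of not needing the lookup in the IPI send path, here
is a minimal user-space sketch in C. It is a model only, not kernel code and
not a proposed patch. It assumes (this is not confirmed above) that the splat
arises because the lookup takes the RCU read lock on a CPU that RCU no longer
watches. All names in it (model_rcu_read_lock(), send_ipi_via_irq(),
cache_ipi_port(), send_ipi_cached(), the arrays and counters) are
hypothetical stand-ins and do not exist in the kernel.

```c
/* User-space model only; every identifier here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4
#define NR_IRQS 16

/* Assumption: models whether RCU is still watching a given CPU. */
static bool rcu_watching[NR_CPUS] = { true, true, true, true };

/* Counts simulated "RCU used where it must not be" complaints. */
static int rcu_splats;

/* Stand-in for rcu_read_lock(): complains when the CPU is not watched. */
static void model_rcu_read_lock(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (!rcu_watching[cpu]) {
        rcu_splats++;
        fprintf(stderr, "model: RCU read lock on unwatched CPU %u\n", cpu);
    }
}

/* IRQ -> event channel port; stands in for the per-IRQ chip data. */
static unsigned int irq_to_port[NR_IRQS];

/* Variant A: the send path resolves the port through the IRQ lookup,
 * which takes the (modelled) RCU read lock on the sending CPU. */
static unsigned int send_ipi_via_irq(unsigned int this_cpu, unsigned int irq)
{
    assert(irq < NR_IRQS);
    model_rcu_read_lock(this_cpu);
    return irq_to_port[irq];
}

/* Variant B: the port is cached per target CPU at setup time. */
static unsigned int cached_ipi_port[NR_CPUS];

/* Done once at setup, while RCU is still watching. */
static void cache_ipi_port(unsigned int target_cpu, unsigned int irq)
{
    assert(target_cpu < NR_CPUS && irq < NR_IRQS);
    cached_ipi_port[target_cpu] = irq_to_port[irq];
}

/* The send path reads only the cache: no lookup, no RCU read lock. */
static unsigned int send_ipi_cached(unsigned int target_cpu)
{
    assert(target_cpu < NR_CPUS);
    return cached_ipi_port[target_cpu];
}
```

In this model, a CPU whose rcu_watching[] entry has been cleared bumps
rcu_splats when it sends through send_ipi_via_irq(), while send_ipi_cached()
returns the same port without touching the RCU stand-in. Whether the real
xen_send_IPI_one() can be restructured like this is exactly the open question.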

Paul, do you have an explanation for the splat only coming out now?


Juergen



 

