Re: Lockdep show 6.6-rc regression in Xen HVM CPU hotplug
On Tue, 2023-10-24 at 14:08 +0200, Juergen Gross wrote:
> On 24.10.23 12:41, Juergen Gross wrote:
> > On 24.10.23 09:43, David Woodhouse wrote:
> > > On Tue, 2023-10-24 at 08:53 +0200, Juergen Gross wrote:
> > > >
> > > > I'm puzzled. This path doesn't contain any of the RCU usage I've added in
> > > > commit 87797fad6cce.
> > > >
> > > > Are you sure that with just reverting commit 87797fad6cce the issue doesn't
> > > > manifest anymore? I'd rather expect commit 721255b9826b having caused this
> > > > behavior, just telling from the messages above.
> > >
> > > Retesting in the cold light of day, yes. Using v6.6-rc5 which is the
> > > parent commit of the offending 87797fad6cce.
> > >
> > > I now see this warning at boot time again, which I believe was an
> > > aspect of what you were trying to fix:
> > >
> > > [    0.059014] xen:events: Using FIFO-based ABI
> > > [    0.059029] xen:events: Xen HVM callback vector for event delivery is enabled
> > > [    0.059227] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> > > [    0.059296]
> > > [    0.059297] =============================
> > > [    0.059298] [ BUG: Invalid wait context ]
> > > [    0.059299] 6.6.0-rc5 #1374 Not tainted
> > > [    0.059300] -----------------------------
> > > [    0.059301] swapper/0/0 is trying to lock:
> > > [    0.059303] ffffffff8ad595f8 (evtchn_rwlock){....}-{3:3}, at: xen_evtchn_do_upcall+0x59/0xd0
> >
> > Indeed.
> >
> > What I still not get is why the rcu_dereference_check() splat isn't
> > happening without my patch.
> >
> > IMHO it should be related to the fact that cpuhp_report_idle_dead()
> > is trying to send an IPI via xen_send_IPI_one(), which is using
> > notify_remote_via_irq(), which in turn needs to call irq_get_chip_data().
> > This is using the maple-tree since 721255b9826b, which is using
> > rcu_read_lock().
> >
> > I can probably change xen_send_IPI_one() to not need irq_get_chip_data().
>
> David, could you test the attached patch, please? Build tested only.

Tested-by: David Woodhouse <dwmw@xxxxxxxxxxxx>

(And I think we worked out the reason why your 'replace evtchn_rwlock
with RCU' patch apparently triggers those other issues is that
*without* your patch, lockdep fired off a warning and then stopped
working long before the other issues occur.)

Attachment: smime.p7s
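
The masking effect described in the parenthetical above can be illustrated with a toy userspace model. This is only a sketch and not kernel code: `ToyLockdep`, `run_boot_then_hotplug` and the `evtchn_lock_is_rwlock` parameter are hypothetical names invented for the illustration, and the only behaviour it assumes is that the checker reports the first violation and then stops checking.

```python
class ToyLockdep:
    """Toy one-shot checker: report the first violation, then go quiet."""

    def __init__(self):
        # Assumption of this model: checking is switched off after the
        # first report, so later violations are never seen.
        self.debug_locks = True
        self.reports = []

    def check(self, ok, message):
        """Record `message` if `ok` is false and checking is still enabled."""
        if not self.debug_locks:
            return
        if not ok:
            self.reports.append(message)
            self.debug_locks = False


def run_boot_then_hotplug(evtchn_lock_is_rwlock):
    """Replay two events in order: an early upcall, then a CPU offline."""
    lockdep = ToyLockdep()
    # Early boot: taking evtchn_rwlock in the upcall trips the
    # wait-context check (only while the rwlock is still in use).
    lockdep.check(ok=not evtchn_lock_is_rwlock,
                  message="Invalid wait context")
    # CPU offline: the idle-dead path ends up in an RCU read-side
    # section, which is flagged in both scenarios of this model.
    lockdep.check(ok=False, message="rcu_dereference_check() splat")
    return lockdep.reports


without_patch = run_boot_then_hotplug(evtchn_lock_is_rwlock=True)
with_patch = run_boot_then_hotplug(evtchn_lock_is_rwlock=False)
print(without_patch)  # ['Invalid wait context']
print(with_patch)     # ['rcu_dereference_check() splat']
```

In this model the second problem exists in both runs. It only becomes visible once the first report no longer fires, which matches the observation that the patch appeared to introduce the later splat.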