Re: [RFC] Xen crashes on ASSERT on suspend/resume, suggested fix


  • To: Stefano Stabellini <stefano.stabellini@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 25 May 2023 13:30:53 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, xenia.ragiadakou@xxxxxxx, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 25 May 2023 11:31:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25.05.2023 01:51, Stefano Stabellini wrote:
> xen/irq: fix races between send_cleanup_vector and _clear_irq_vector

This title is, I'm afraid, already misleading. No such race can occur
afaict, as both callers of _clear_irq_vector() acquire the IRQ
descriptor lock first, and irq_complete_move() (the sole caller of
send_cleanup_vector()) is only ever invoked as or by an ->ack()
hook, which in turn is only invoked with, again, the descriptor lock
held.
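
To make concrete the serialisation I'm referring to, here is a rough
sketch of the caller side (simplified and from memory, not a verbatim
copy of xen/arch/x86/irq.c; the real function does further locking and
bookkeeping):

/* Sketch only: both callers of _clear_irq_vector() follow this pattern. */
void clear_irq_vector(int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);
    unsigned long flags;

    /* Descriptor lock taken before any vector state is touched. */
    spin_lock_irqsave(&desc->lock, flags);
    _clear_irq_vector(desc);
    spin_unlock_irqrestore(&desc->lock, flags);
}

/*
 * ->ack() hooks, and hence irq_complete_move() / send_cleanup_vector(),
 * run with the same desc->lock held, so they cannot race with the above
 * for a given descriptor.
 */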

> It is possible that send_cleanup_vector and _clear_irq_vector are
> running at the same time on different CPUs. In that case we have a race
> as both _clear_irq_vector and irq_move_cleanup_interrupt are trying to
> clear old_vector.
> 
> This patch fixes 3 races:
> 
> 1) With irq_move_cleanup_interrupt running on multiple CPUs at the same
> time while _clear_irq_vector is also running, it is possible that only
> some per_cpu(vector_irq, cpu)[old_vector] entries are valid, but not
> all. So, turn the ASSERT in _clear_irq_vector into an if.

Note again the locking which is in effect.

> 2) It is possible that _clear_irq_vector is running at the same time as
> release_old_vec, called from irq_move_cleanup_interrupt. At the moment,
> it is possible for _clear_irq_vector to read a valid old_cpu_mask but an
> invalid old_vector (because it is being set to invalid by
> release_old_vec). To avoid this problem, in release_old_vec move the
> clearing of old_cpu_mask before the setting of old_vector to invalid.
> This way, we know in _clear_irq_vector that if old_vector is invalid,
> old_cpu_mask is also zero and we don't enter the loop.

All invocations of release_old_vec() are similarly inside suitably
locked regions.
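
Again just as an illustration (heavily abridged and from memory - the
real cleanup IPI handler walks the local CPU's vectors and does more
bookkeeping), the path reaching release_old_vec() looks like:

/* Abridged sketch, not the actual handler; hypothetical helper name. */
static void cleanup_path_sketch(struct irq_desc *desc)
{
    spin_lock(&desc->lock);    /* same descriptor lock as above */
    /* ... per-CPU vector bookkeeping elided ... */
    release_old_vec(desc);     /* only ever reached under desc->lock */
    spin_unlock(&desc->lock);
}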

> 3) It is possible that release_old_vec is running twice at the same time
> for the same old_vector. Change the code in release_old_vec to make it
> OK to call it twice: remove both ASSERTs. With those gone, it should now
> be possible to call release_old_vec twice in a row for the same
> old_vector.

Same here.

Any such issues would surface more frequently, and without any suspend /
resume being involved. What is still missing is that connection; only once
it is established will it (or really: may it) become clear what needs
adjusting. If you've seen the issue exactly once, then I'm afraid there's
not much we can do unless someone can come up with a plausible explanation
of something being broken on one of the involved code paths. More
information will need to be gathered from the next occurrence of this,
whenever that's going to be. One of the things we will want to know, as
mentioned before, is the value that per_cpu(vector_irq, cpu)[old_vector]
has when the assertion triggers. Iirc Roger also suggested another piece
of data you'd want to log.
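
Purely as an illustration of the kind of instrumentation meant here (the
exact message, format specifiers, and placement are of course up to you),
the assertion in _clear_irq_vector() could temporarily be turned into a
verbose check along these lines:

/*
 * Debugging sketch only: log the offending value before asserting, so
 * the next occurrence tells us more.
 */
if ( per_cpu(vector_irq, cpu)[old_vector] != irq )
{
    printk(XENLOG_ERR "IRQ%d: CPU%u vector_irq[%u] = %d\n",
           irq, cpu, old_vector, per_cpu(vector_irq, cpu)[old_vector]);
    ASSERT_UNREACHABLE();
}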

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -112,16 +112,11 @@ static void release_old_vec(struct irq_desc *desc)
>  {
>      unsigned int vector = desc->arch.old_vector;
>  
> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>      cpumask_clear(desc->arch.old_cpu_mask);
> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>  
> -    if ( !valid_irq_vector(vector) )
> -        ASSERT_UNREACHABLE();
> -    else if ( desc->arch.used_vectors )
> -    {
> -        ASSERT(test_bit(vector, desc->arch.used_vectors));
> +    if ( desc->arch.used_vectors )
>          clear_bit(vector, desc->arch.used_vectors);
> -    }
>  }
>  
>  static void _trace_irq_mask(uint32_t event, int irq, int vector,
> @@ -230,9 +225,11 @@ static void _clear_irq_vector(struct irq_desc *desc)
>  
>          for_each_cpu(cpu, tmp_mask)
>          {
> -            ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
> -            TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> -            per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> +            if ( per_cpu(vector_irq, cpu)[old_vector] == irq )
> +            {
> +                TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> +                per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> +            }
>          }

As said before - replacing an ASSERT() by a respective if() cannot really
be done without discussing the "else" case in the description. Except of
course in trivial/obvious cases, but I think we agree we don't have such
a case here.

Jan



 

