Re: [RFC] Xen crashes on ASSERT on suspend/resume, suggested fix


  • To: Stefano Stabellini <stefano.stabellini@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 25 May 2023 13:30:53 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, xenia.ragiadakou@xxxxxxx, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 25 May 2023 11:31:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25.05.2023 01:51, Stefano Stabellini wrote:
> xen/irq: fix races between send_cleanup_vector and _clear_irq_vector

This title is, I'm afraid, already misleading. No such race can occur
afaict, as both callers of _clear_irq_vector() acquire the IRQ
descriptor lock first, and irq_complete_move() (the sole caller of
send_cleanup_vector()) is only ever invoked as or by an ->ack()
hook, which in turn is only invoked with, again, the descriptor lock
held.
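
To make concrete the serialisation I'm referring to, here is a rough
sketch of the caller side (simplified and from memory, not a verbatim
copy of xen/arch/x86/irq.c; the real function does further locking and
bookkeeping):

/* Sketch only: both callers of _clear_irq_vector() follow this pattern. */
void clear_irq_vector(int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);
    unsigned long flags;

    /* Descriptor lock taken before any vector state is touched. */
    spin_lock_irqsave(&desc->lock, flags);
    _clear_irq_vector(desc);
    spin_unlock_irqrestore(&desc->lock, flags);
}

/*
 * ->ack() hooks, and hence irq_complete_move() / send_cleanup_vector(),
 * run with the same desc->lock held, so they cannot race with the above
 * for a given descriptor.
 */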

> It is possible that send_cleanup_vector and _clear_irq_vector are
> running at the same time on different CPUs. In that case we have a race
> as both _clear_irq_vector and irq_move_cleanup_interrupt are trying to
> clear old_vector.
> 
> This patch fixes 3 races:
> 
> 1) With irq_move_cleanup_interrupt running on multiple CPUs at the same
> time while _clear_irq_vector is also running, it is possible that only
> some per_cpu(vector_irq, cpu)[old_vector] entries are valid, but not
> all. So, turn the ASSERT in _clear_irq_vector into an if.

Note again the locking which is in effect.

> 2) It is possible that _clear_irq_vector is running at the same time as
> release_old_vec, called from irq_move_cleanup_interrupt. At the moment,
> it is possible for _clear_irq_vector to read a valid old_cpu_mask but an
> invalid old_vector (because it is being set to invalid by
> release_old_vec). To avoid this problem, in release_old_vec move the
> clearing of old_cpu_mask before the setting of old_vector to invalid.
> This way, we know in _clear_irq_vector that if old_vector is invalid,
> old_cpu_mask is also zero and we don't enter the loop.

All invocations of release_old_vec() are similarly inside suitably
locked regions.
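
Again just as an illustration (heavily abridged and from memory - the
real cleanup IPI handler walks the local CPU's vectors and does more
bookkeeping), the path reaching release_old_vec() looks like:

/* Abridged sketch, not the actual handler; hypothetical helper name. */
static void cleanup_path_sketch(struct irq_desc *desc)
{
    spin_lock(&desc->lock);    /* same descriptor lock as above */
    /* ... per-CPU vector bookkeeping elided ... */
    release_old_vec(desc);     /* only ever reached under desc->lock */
    spin_unlock(&desc->lock);
}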

> 3) It is possible that release_old_vec is running twice at the same time
> for the same old_vector. Change the code in release_old_vec to make it
> OK to call it twice: remove both ASSERTs. With those gone, it should now
> be possible to call release_old_vec twice in a row for the same
> old_vector.

Same here.

Any such issues would surface more frequently, and without any suspend /
resume being involved. What is still missing is that connection; only once
it is established will it (or really: may it) become clear what needs
adjusting. If you've seen the issue exactly once, then I'm afraid there's
not much we can do unless someone can come up with a plausible explanation
of something being broken on one of the involved code paths. More
information will need to be gathered from the next occurrence of this,
whenever that's going to be. One of the things we will want to know, as
mentioned before, is the value that per_cpu(vector_irq, cpu)[old_vector]
has when the assertion triggers. Iirc Roger also suggested another piece
of data you'd want to log.
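
Purely as an illustration of the kind of instrumentation meant here (the
exact message, format specifiers, and placement are of course up to you),
the assertion in _clear_irq_vector() could temporarily be turned into a
verbose check along these lines:

/*
 * Debugging sketch only: log the offending value before asserting, so
 * the next occurrence tells us more.
 */
if ( per_cpu(vector_irq, cpu)[old_vector] != irq )
{
    printk(XENLOG_ERR "IRQ%d: CPU%u vector_irq[%u] = %d\n",
           irq, cpu, old_vector, per_cpu(vector_irq, cpu)[old_vector]);
    ASSERT_UNREACHABLE();
}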

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -112,16 +112,11 @@ static void release_old_vec(struct irq_desc *desc)
>  {
>      unsigned int vector = desc->arch.old_vector;
>  
> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>      cpumask_clear(desc->arch.old_cpu_mask);
> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
>  
> -    if ( !valid_irq_vector(vector) )
> -        ASSERT_UNREACHABLE();
> -    else if ( desc->arch.used_vectors )
> -    {
> -        ASSERT(test_bit(vector, desc->arch.used_vectors));
> +    if ( desc->arch.used_vectors )
>          clear_bit(vector, desc->arch.used_vectors);
> -    }
>  }
>  
>  static void _trace_irq_mask(uint32_t event, int irq, int vector,
> @@ -230,9 +225,11 @@ static void _clear_irq_vector(struct irq_desc *desc)
>  
>          for_each_cpu(cpu, tmp_mask)
>          {
> -            ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
> -            TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> -            per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> +            if ( per_cpu(vector_irq, cpu)[old_vector] == irq )
> +            {
> +                TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
> +                per_cpu(vector_irq, cpu)[old_vector] = ~irq;
> +            }
>          }

As said before - replacing an ASSERT() by a respective if() cannot really
be done without discussing the "else" case in the description. Except of
course in trivial/obvious cases, but I think we agree we don't have such
a case here.

Jan



 

