
Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 10 March 2020 14:19
> To: paul@xxxxxxx
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Varad Gautam' <vrd@xxxxxxxxx>; 'Julien 
> Grall' <julien@xxxxxxx>;
> 'Roger Pau Monné' <roger.pau@xxxxxxxxxx>; 'Andrew Cooper' 
> <andrew.cooper3@xxxxxxxxxx>
> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for 
> shared pirqs
> 
> On 09.03.2020 18:47, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@xxxxxxxx>
> >> Sent: 09 March 2020 16:29
> >> To: paul@xxxxxxx
> >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Varad Gautam <vrd@xxxxxxxxx>; Julien 
> >> Grall <julien@xxxxxxx>;
> Roger
> >> Pau Monné <roger.pau@xxxxxxxxxx>; Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >> Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for 
> >> shared pirqs
> >>
> >> On 06.03.2020 17:02, paul@xxxxxxx wrote:
> >>> From: Varad Gautam <vrd@xxxxxxxxx>
> >>>
> >>> XEN_DOMCTL_destroydomain creates a continuation if domain_kill()
> >>> returns -ERESTART.
> >>> In that scenario, it is possible to receive multiple __pirq_guest_unbind
> >>> calls for the same pirq from domain_kill, if the pirq has not yet been
> >>> removed from the domain's pirq_tree, as:
> >>>   domain_kill()
> >>>     -> domain_relinquish_resources()
> >>>       -> pci_release_devices()
> >>>         -> pci_clean_dpci_irq()
> >>>           -> pirq_guest_unbind()
> >>>             -> __pirq_guest_unbind()
> >>>
> >>> For a shared pirq (nr_guests > 1), the first call would zap the current
> >>> domain from the pirq's guests[] list, but the action handler is never
> >>> freed as there are other guests using this pirq. As a result, on the
> >>> second call, __pirq_guest_unbind searches for the current domain, which
> >>> has been removed from the guests[] list, and hits a BUG_ON.
> >>>
> >>> Make __pirq_guest_unbind safe to be called multiple times by letting Xen
> >>> continue if a shared pirq has already been unbound from this guest. The
> >>> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> >>> in complete_domain_destroy anyway.
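
For illustration, a minimal sketch of the approach described above, as it
might sit in __pirq_guest_unbind() (field names such as action->guest[] and
action->nr_guests follow irq_guest_action_t in xen/arch/x86/irq.c; treat
this as an approximation, not the actual v4 patch):

    unsigned int i;

    /* Look for the current domain in the pirq's guest list. */
    for ( i = 0; i < action->nr_guests; i++ )
        if ( action->guest[i] == d )
            break;

    if ( i == action->nr_guests )
    {
        /*
         * Already unbound by an earlier continuation of domain_kill().
         * For a shared pirq, other guests keep the action alive, so a
         * repeat call is legitimate here: return instead of BUG()ing.
         */
        return;
    }

    /* Remove the domain from the guest list, as the code did before. */
    memmove(&action->guest[i], &action->guest[i + 1],
            (action->nr_guests - i - 1) * sizeof(action->guest[0]));
    action->nr_guests--;
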
> >>>
> >>> Signed-off-by: Varad Gautam <vrd@xxxxxxxxx>
> >>> [taking over from Varad at v4]
> >>> Signed-off-by: Paul Durrant <paul@xxxxxxx>
> >>> ---
> >>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> >>> Cc: Julien Grall <julien@xxxxxxx>
> >>> Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >>>
> >>> Roger suggested cleaning the entry from the domain's pirq_tree so that
> >>> we need not make it safe to re-call __pirq_guest_unbind(). This seems
> >>> like a reasonable suggestion, but the semantics of the code are almost
> >>> impenetrable (e.g. 'pirq' is used to mean an index, a pointer, and is
> >>> also the name of a struct, so you generally have little idea what it
> >>> actually means), so I prefer to stick with a small fix that I can
> >>> actually reason about.
> >>>
> >>> v4:
> >>>  - Re-work the guest array search to make it clearer
> >>
> >> I.e. there are cosmetic differences to v3 (see below), but
> >> technically it's still the same. I can't believe the re-use
> >> of "pirq" for different entities is this big of a problem.
> >
> > Please suggest code if you think it ought to be done differently. I tried.
> 
> How about this? It's admittedly more code, but imo less ad hoc.
> I've smoke tested it, but I depend on you or Varad to check that
> it actually addresses the reported issue.

It's fairly hard to hit IIRC, but we could probably engineer it with a
one-off -ERESTART in the right place.
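
One (hypothetical) way to engineer it: a one-shot fault injection near the
end of pci_clean_dpci_irq(), after pirq_guest_unbind() has run. The flag
and its placement are illustrative debug code only, not part of either
patch:

    /* Pretend a softirq is still outstanding, exactly once. */
    static bool injected;

    if ( !injected )
    {
        injected = true;
        return -ERESTART;  /* domain_kill() then creates a continuation. */
    }

On the continuation the pirq is still in the domain's pirq_tree, so
pt_pirq_iterate() revisits it and __pirq_guest_unbind() runs a second time
for the already-unbound shared pirq, which is exactly the BUG_ON path the
v4 patch addresses.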

> 
> Jan
> 
> x86/pass-through: avoid double IRQ unbind during domain cleanup
> 
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill() returns
> -ERESTART. In that scenario, it is possible to receive multiple
> __pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
>   domain_kill()
>     -> domain_relinquish_resources()
>       -> pci_release_devices()
>         -> pci_clean_dpci_irq()
>           -> pirq_guest_unbind()
>             -> __pirq_guest_unbind()
> 
> Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
> from the tree being iterated after the first call there. In case such a
> removed entry still has a softirq outstanding, record it and re-check
> upon re-invocation.
> 
> Reported-by: Varad Gautam <vrd@xxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> --- unstable.orig/xen/arch/x86/irq.c
> +++ unstable/xen/arch/x86/irq.c
> @@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
>      }
> 
>      if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
> -        BUG();
> +        BUG_ON(!d->is_dying);
>  }
> 
>  /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
> --- unstable.orig/xen/drivers/passthrough/pci.c
> +++ unstable/xen/drivers/passthrough/pci.c
> @@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
>          xfree(digl);
>      }
> 
> -    return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
> +    radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
> +
> +    if ( !pt_pirq_softirq_active(pirq_dpci) )
> +        return 0;
> +
> +    domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
> +
> +    return -ERESTART;
>  }
> 
>  static int pci_clean_dpci_irqs(struct domain *d)
> @@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
>      hvm_irq_dpci = domain_get_irq_dpci(d);
>      if ( hvm_irq_dpci != NULL )
>      {
> -        int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> +        int ret = 0;
> +
> +        if ( hvm_irq_dpci->pending_pirq_dpci )
> +        {
> +            if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
> +                 ret = -ERESTART;
> +            else
> +                 hvm_irq_dpci->pending_pirq_dpci = NULL;
> +        }
> 
> +        if ( !ret )
> +            ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
>          if ( ret )
>          {
>              spin_unlock(&d->event_lock);
> --- unstable.orig/xen/include/asm-x86/hvm/irq.h
> +++ unstable/xen/include/asm-x86/hvm/irq.h
> @@ -158,6 +158,8 @@ struct hvm_irq_dpci {
>      DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
>      /* Record of mapped Links */
>      uint8_t link_cnt[NR_LINK];
> +    /* Clean up: Entry with a softirq invocation pending / in progress. */
> +    struct hvm_pirq_dpci *pending_pirq_dpci;
>  };
> 

That looks like it will do the job. I'll see if I can get it tested.

  Paul

>  /* Machine IRQ to guest device/intx mapping. */
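
For readability, roughly how the patched cleanup path fits together once
the hunks above are applied (condensed from the diff context; locking and
error handling elided):

    /* In pci_clean_dpci_irq(): drop the entry from the tree so that the
     * iteration cannot visit it again, but record it if a softirq is
     * still outstanding. */
    radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);

    if ( !pt_pirq_softirq_active(pirq_dpci) )
        return 0;

    domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
    return -ERESTART;

    /* In pci_clean_dpci_irqs(): on (re-)invocation, re-check any recorded
     * entry first, then iterate over whatever is left in the tree. */
    int ret = 0;

    if ( hvm_irq_dpci->pending_pirq_dpci )
    {
        if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
            ret = -ERESTART;
        else
            hvm_irq_dpci->pending_pirq_dpci = NULL;
    }

    if ( !ret )
        ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);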
> 




 

