
Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs



Hi Jan,

On 3/10/20 3:19 PM, Jan Beulich wrote:
On 09.03.2020 18:47, Paul Durrant wrote:
-----Original Message-----
From: Jan Beulich <jbeulich@xxxxxxxx>
Sent: 09 March 2020 16:29
To: paul@xxxxxxx
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Varad Gautam <vrd@xxxxxxxxx>;
    Julien Grall <julien@xxxxxxx>; Roger Pau Monné <roger.pau@xxxxxxxxxx>;
    Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

On 06.03.2020 17:02, paul@xxxxxxx wrote:
From: Varad Gautam <vrd@xxxxxxxxx>

XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple __pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
   domain_kill()
     -> domain_relinquish_resources()
       -> pci_release_devices()
         -> pci_clean_dpci_irq()
           -> pirq_guest_unbind()
             -> __pirq_guest_unbind()

For a shared pirq (nr_guests > 1), the first call would zap the current
domain from the pirq's guests[] list, but the action handler is never freed
as there are other guests using this pirq. As a result, on the second call,
__pirq_guest_unbind searches for the current domain which has been removed
from the guests[] list, and hits a BUG_ON.

Make __pirq_guest_unbind safe to be called multiple times by letting xen
continue if a shared pirq has already been unbound from this guest. The
PIRQ will be cleaned up from the domain's pirq_tree during the destruction
in complete_domain_destroy anyway.
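
As an illustration of the behaviour described above, here is a minimal
stand-alone sketch of an unbind that tolerates being called twice for the
same domain on a shared IRQ. It is a toy model, not the Xen code: the names
toy_action, toy_unbind and TOY_MAX_GUESTS are hypothetical, and locking and
the freeing of the action handler are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_GUESTS 4

/* Hypothetical stand-in for the per-IRQ action; not Xen's real structure. */
struct toy_action {
    int guests[TOY_MAX_GUESTS];   /* IDs of the domains sharing the IRQ */
    unsigned int nr_guests;
};

/*
 * Remove domid from the guest list. Returns true if an entry was removed,
 * false if the domain was not (or no longer) in the list. The "false" path
 * is the one a repeated call after a continuation would take.
 */
static bool toy_unbind(struct toy_action *a, int domid)
{
    unsigned int i;

    for ( i = 0; i < a->nr_guests; i++ )
        if ( a->guests[i] == domid )
            break;

    if ( i == a->nr_guests )
        return false;             /* already unbound: nothing to do */

    /* Close the gap left by the removed entry. */
    for ( ; i + 1 < a->nr_guests; i++ )
        a->guests[i] = a->guests[i + 1];
    a->nr_guests--;

    return true;
}
```

With two guests on the list, the first toy_unbind() for a domain removes it
and leaves the other guest in place; a second call for the same domain
returns false instead of tripping an assertion.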

Signed-off-by: Varad Gautam <vrd@xxxxxxxxx>
[taking over from Varad at v4]
Signed-off-by: Paul Durrant <paul@xxxxxxx>
---
Cc: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Julien Grall <julien@xxxxxxx>
Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Roger suggested cleaning the entry from the domain pirq_tree so that
we need not make it safe to re-call __pirq_guest_unbind(). This seems like
a reasonable suggestion but the semantics of the code are almost
impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
the name of a struct, so you generally have little idea what it actually means)
so I prefer to stick with a small fix that I can actually reason about.

v4:
  - Re-work the guest array search to make it clearer
I.e. there are cosmetic differences to v3 (see below), but
technically it's still the same. I can't believe the re-use
of "pirq" for different entities is this big of a problem.
Please suggest code if you think it ought to be done differently. I tried.
How about this? It's admittedly more code, but imo less ad hoc.
I've smoke tested it, but I depend on you or Varad to check that
it actually addresses the reported issue.

Jan

x86/pass-through: avoid double IRQ unbind during domain cleanup


I have tested that this patch prevents __pirq_guest_unbind on an
already-unbound pirq during the continuation call for domain_kill
-ERESTART, by using a modified xen that forces an -ERESTART from
pirq_guest_unbind to create the continuation. It fixes the underlying
issue.

Tested-by: Varad Gautam <vrd@xxxxxxxxx>



XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple __pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
   domain_kill()
     -> domain_relinquish_resources()
       -> pci_release_devices()
         -> pci_clean_dpci_irq()
           -> pirq_guest_unbind()
             -> __pirq_guest_unbind()

Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
from the tree being iterated after the first call there. In case such a
removed entry still has a softirq outstanding, record it and re-check
upon re-invocation.
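
The control flow described in this paragraph can be sketched in isolation
as follows. This is a simplified model under stated assumptions, not the
patch itself: a fixed array with an in_tree flag stands in for the radix
tree, and toy_domain, toy_entry, toy_clean_one, toy_clean_all and
TOY_ERESTART are hypothetical names whose only purpose is to show the
"remove on first visit, remember the pending entry" idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ERESTART   85   /* arbitrary placeholder; only "non-zero" matters */
#define TOY_NR_ENTRIES 3

struct toy_entry {
    bool in_tree;           /* still reachable by the iterator */
    bool softirq_active;    /* work still outstanding for this entry */
    unsigned int unbinds;   /* how many times the unbind step ran */
};

struct toy_domain {
    struct toy_entry entries[TOY_NR_ENTRIES];
    struct toy_entry *pending;  /* removed while its softirq was active */
};

/* Per-entry callback: unbind once, then drop the entry from the "tree". */
static int toy_clean_one(struct toy_domain *d, struct toy_entry *e)
{
    e->unbinds++;
    e->in_tree = false;

    if ( !e->softirq_active )
        return 0;

    d->pending = e;             /* re-checked on the next invocation */
    return -TOY_ERESTART;
}

/* Restartable cleanup: check the recorded entry first, then iterate. */
static int toy_clean_all(struct toy_domain *d)
{
    unsigned int i;

    if ( d->pending )
    {
        if ( d->pending->softirq_active )
            return -TOY_ERESTART;
        d->pending = NULL;
    }

    for ( i = 0; i < TOY_NR_ENTRIES; i++ )
    {
        int ret;

        if ( !d->entries[i].in_tree )
            continue;           /* already handled on an earlier pass */

        ret = toy_clean_one(d, &d->entries[i]);
        if ( ret )
            return ret;
    }

    return 0;
}
```

In this model each entry sees exactly one unbind no matter how often
toy_clean_all() has to be restarted, because an entry leaves the iterated
set before the restart is requested.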

Reported-by: Varad Gautam <vrd@xxxxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

--- unstable.orig/xen/arch/x86/irq.c
+++ unstable/xen/arch/x86/irq.c
@@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
      }

      if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
-        BUG();
+        BUG_ON(!d->is_dying);
  }

  /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
--- unstable.orig/xen/drivers/passthrough/pci.c
+++ unstable/xen/drivers/passthrough/pci.c
@@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
          xfree(digl);
      }

-    return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
+    radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
+
+    if ( !pt_pirq_softirq_active(pirq_dpci) )
+        return 0;
+
+    domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
+
+    return -ERESTART;
  }

  static int pci_clean_dpci_irqs(struct domain *d)
@@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
      hvm_irq_dpci = domain_get_irq_dpci(d);
      if ( hvm_irq_dpci != NULL )
      {
-        int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
+        int ret = 0;
+
+        if ( hvm_irq_dpci->pending_pirq_dpci )
+        {
+            if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
+                 ret = -ERESTART;
+            else
+                 hvm_irq_dpci->pending_pirq_dpci = NULL;
+        }

+        if ( !ret )
+            ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
          if ( ret )
          {
              spin_unlock(&d->event_lock);
--- unstable.orig/xen/include/asm-x86/hvm/irq.h
+++ unstable/xen/include/asm-x86/hvm/irq.h
@@ -158,6 +158,8 @@ struct hvm_irq_dpci {
      DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
      /* Record of mapped Links */
      uint8_t link_cnt[NR_LINK];
+    /* Clean up: Entry with a softirq invocation pending / in progress. */
+    struct hvm_pirq_dpci *pending_pirq_dpci;
  };

  /* Machine IRQ to guest device/intx mapping. */





