
Re: [Xen-devel] [PATCHv1] xen/events/fifo: Handle linked events when closing a PIRQ port



On 16/09/15 16:26, Boris Ostrovsky wrote:
> On 08/10/2015 01:16 PM, David Vrabel wrote:
>> On 10/08/15 17:47, linux@xxxxxxxxxxxxxx wrote:
>>> On 2015-08-10 16:24, David Vrabel wrote:
>>>> Commit fcdf31a7c162de0c93a2bee51df4688ab0a348f8 (xen/events/fifo:
>>>> Handle linked events when closing a port) did not handle closing a
>>>> port bound to a PIRQ because these are closed from shutdown_pirq()
>>>> which is called with interrupts disabled.
>>>>
>>>> Defer the close to a work queue where we can safely spin waiting for
>>>> the LINKED bit to clear.  For simplicity, the close is always deferred
>>>> even if it is not required (i.e., we're already in process context).
>>>>
>>>> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
>>>> Cc: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
>>>> ---
>>>> Cc: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
>>> Hi David,
>>>
>>> Tested your patch; I don't know for sure, but this doesn't seem to work out.
>>> I end up with this event channel error on dom0 boot.
>>>
>>> Which ends in state:
>>> Name                                        ID   Mem VCPUs    State   Time(s)
>>> (null)                                       0  1536     6     r-----   183.8
>>>
>>> -- 
>>> Sander
>>>
>>> (XEN) [2015-08-10 16:35:34.584] PCI add device 0000:0d:00.0
>>> (XEN) [2015-08-10 16:35:34.891] PCI add device 0000:0c:00.0
>>> (XEN) [2015-08-10 16:35:35.123] PCI add device 0000:0b:00.0
>>> (XEN) [2015-08-10 16:35:35.325] PCI add device 0000:0a:00.0
>>> (XEN) [2015-08-10 16:35:35.574] PCI add device 0000:09:00.0
>>> (XEN) [2015-08-10 16:35:35.642] PCI add device 0000:09:00.1
>>> (XEN) [2015-08-10 16:35:35.872] PCI add device 0000:05:00.0
>>> (XEN) [2015-08-10 16:35:36.044] PCI add device 0000:06:01.0
>>> (XEN) [2015-08-10 16:35:36.109] PCI add device 0000:06:02.0
>>> (XEN) [2015-08-10 16:35:36.293] PCI add device 0000:08:00.0
>>> (XEN) [2015-08-10 16:35:36.603] PCI add device 0000:07:00.0
>>> (XEN) [2015-08-10 16:35:36.906] PCI add device 0000:04:00.0
>>> (XEN) [2015-08-10 16:35:37.074] PCI add device 0000:03:06.0
>>> (XEN) [2015-08-10 16:35:39.456] PCI: Using MCFG for segment 0000 bus 00-ff
>>> (XEN) [2015-08-10 16:35:49.623] d0: Forcing read-only access to MFN fed00
>>> (XEN) [2015-08-10 16:35:51.374] event_channel.c:472:d0v0 EVTCHNOP failure: error -17
>> This didn't happen on the test box I used, but I can see it is possible
>> to rebind a PIRQ whose close is still deferred (error -17 is -EEXIST).
>>
>> I'm going to revert fcdf31a7c162de0c93a2bee51df4688ab0a348f8
>> (xen/events/fifo: Handle linked events when closing a port) for now.
> 
> 
> Any updates on these two patches? We started seeing this problem (stale
> events) in our testing when onlining/offlining vcpus in heavily
> oversubscribed guests.

This is the last attempt I came up with -- using a tasklet to wait for the
event to be unlinked.  I was worried that tasklets could be punted to the
ksoftirqd threads, but I haven't investigated this or come up with
something else (perhaps a high-priority tasklet instead?).
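
For concreteness, a minimal (untested) sketch of that high-priority
variant -- identical to evtchn_fifo_close() in the patch below, except
that the tasklet is queued with tasklet_hi_schedule() so it runs from
the HI_SOFTIRQ queue ahead of normal tasklets (whether that avoids the
ksoftirqd deferral entirely is exactly the open question above):

    static void evtchn_fifo_close(unsigned port)
    {
            if (irqs_disabled()) {
                    struct tasklet_struct *tasklet;

                    tasklet = this_cpu_ptr(&close_tasklets);
                    tasklet_init(tasklet, __evtchn_fifo_close, port);
                    /* HI_SOFTIRQ instead of TASKLET_SOFTIRQ */
                    tasklet_hi_schedule(tasklet);
            } else {
                    __evtchn_fifo_close(port);
            }
    }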

David

8<------------------------------
From 2aab05f7535e4c6a2dae1472c16fbb70d1aadab0 Mon Sep 17 00:00:00 2001
From: David Vrabel <david.vrabel@xxxxxxxxxx>
Date: Thu, 13 Aug 2015 14:57:50 +0100
Subject: [PATCH] xen/events/fifo: Handle linked events when closing a port

An event channel bound to a CPU that was offlined may still be linked
on that CPU's queue.  If this event channel is closed and reused,
subsequent events will be lost because the event channel is never
unlinked and thus cannot be linked onto the correct queue.

When a channel is closed and the event is still linked into a queue,
ensure that it is unlinked before completing.

If the CPU to which the event channel is bound is online, spin until the
event is handled by that CPU.  If that CPU is offline, it cannot handle
the event, so clear the event queue during the close, dropping the
events.

This fixes the missing interrupts (and subsequent disk stalls etc.)
when offlining a CPU.

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
---
Cc: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
Cc: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>

v3:
- use a tasklet instead of a work item, since the work could be delayed
  long enough that a driver tries to rebind a PIRQ that has not been
  closed yet.
---
 drivers/xen/events/events_2l.c       | 10 ++++
 drivers/xen/events/events_base.c     |  9 ----
 drivers/xen/events/events_fifo.c     | 93 ++++++++++++++++++++++++++++++++++--
 drivers/xen/events/events_internal.h |  6 +++
 4 files changed, 104 insertions(+), 14 deletions(-)

diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c
index 7dd4631..e00bb7b 100644
--- a/drivers/xen/events/events_2l.c
+++ b/drivers/xen/events/events_2l.c
@@ -354,6 +354,15 @@ static void evtchn_2l_resume(void)
                                EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD);
 }
 
+static void evtchn_2l_close(unsigned int port)
+{
+       struct evtchn_close close;
+
+       close.port = port;
+       if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+               BUG();
+}
+
 static const struct evtchn_ops evtchn_ops_2l = {
        .max_channels      = evtchn_2l_max_channels,
        .nr_channels       = evtchn_2l_max_channels,
@@ -366,6 +375,7 @@ static const struct evtchn_ops evtchn_ops_2l = {
        .unmask            = evtchn_2l_unmask,
        .handle_events     = evtchn_2l_handle_events,
        .resume            = evtchn_2l_resume,
+       .close             = evtchn_2l_close,
 };
 
 void __init xen_evtchn_2l_init(void)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 96093ae..b4d2e14 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -452,15 +452,6 @@ static void xen_free_irq(unsigned irq)
        irq_free_desc(irq);
 }
 
-static void xen_evtchn_close(unsigned int port)
-{
-       struct evtchn_close close;
-
-       close.port = port;
-       if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-               BUG();
-}
-
 static void pirq_query_unmask(int irq)
 {
        struct physdev_irq_status_query irq_status;
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index ed673e1..e0ed185 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -40,6 +40,7 @@
 #include <linux/smp.h>
 #include <linux/percpu.h>
 #include <linux/cpu.h>
+#include <linux/slab.h>
 
 #include <asm/sync_bitops.h>
 #include <asm/xen/hypercall.h>
@@ -255,6 +256,12 @@ static void evtchn_fifo_unmask(unsigned port)
        }
 }
 
+static bool evtchn_fifo_is_linked(unsigned port)
+{
+       event_word_t *word = event_word_from_port(port);
+       return sync_test_bit(EVTCHN_FIFO_BIT(LINKED, word), BM(word));
+}
+
 static uint32_t clear_linked(volatile event_word_t *word)
 {
        event_word_t new, old, w;
@@ -281,7 +288,8 @@ static void handle_irq_for_port(unsigned port)
 
 static void consume_one_event(unsigned cpu,
                              struct evtchn_fifo_control_block *control_block,
-                             unsigned priority, unsigned long *ready)
+                             unsigned priority, unsigned long *ready,
+                             bool drop)
 {
        struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu);
        uint32_t head;
@@ -313,13 +321,15 @@ static void consume_one_event(unsigned cpu,
        if (head == 0)
                clear_bit(priority, ready);
 
-       if (evtchn_fifo_is_pending(port) && !evtchn_fifo_is_masked(port))
-               handle_irq_for_port(port);
+       if (evtchn_fifo_is_pending(port) && !evtchn_fifo_is_masked(port)) {
+               if (likely(!drop))
+                       handle_irq_for_port(port);
+       }
 
        q->head[priority] = head;
 }
 
-static void evtchn_fifo_handle_events(unsigned cpu)
+static void __evtchn_fifo_handle_events(unsigned cpu, bool drop)
 {
        struct evtchn_fifo_control_block *control_block;
        unsigned long ready;
@@ -331,11 +341,16 @@ static void evtchn_fifo_handle_events(unsigned cpu)
 
        while (ready) {
                q = find_first_bit(&ready, EVTCHN_FIFO_MAX_QUEUES);
-               consume_one_event(cpu, control_block, q, &ready);
+               consume_one_event(cpu, control_block, q, &ready, drop);
                ready |= xchg(&control_block->ready, 0);
        }
 }
 
+static void evtchn_fifo_handle_events(unsigned cpu)
+{
+       __evtchn_fifo_handle_events(cpu, false);
+}
+
 static void evtchn_fifo_resume(void)
 {
        unsigned cpu;
@@ -371,6 +386,40 @@ static void evtchn_fifo_resume(void)
        event_array_pages = 0;
 }
 
+static DEFINE_PER_CPU(struct tasklet_struct, close_tasklets);
+
+static void __evtchn_fifo_close(unsigned long port)
+{
+       struct evtchn_close close;
+
+       while (evtchn_fifo_is_linked(port))
+               cpu_relax();
+
+       close.port = port;
+       if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+               BUG();
+}
+
+static void evtchn_fifo_close(unsigned port)
+{
+       /*
+        * A port cannot be closed until the LINKED bit is clear.
+        *
+        * Reusing an already linked event may: a) cause the new event
+        * to be raised on the wrong VCPU; or b) cause the event to be
+        * lost (if the old VCPU is offline).
+        */
+
+       if (irqs_disabled()) {
+               struct tasklet_struct *tasklet;
+
+               tasklet = this_cpu_ptr(&close_tasklets);
+               tasklet_init(tasklet, __evtchn_fifo_close, port);
+               tasklet_schedule(tasklet);
+       } else
+               __evtchn_fifo_close(port);
+}
+
 static const struct evtchn_ops evtchn_ops_fifo = {
        .max_channels      = evtchn_fifo_max_channels,
        .nr_channels       = evtchn_fifo_nr_channels,
@@ -384,6 +433,7 @@ static const struct evtchn_ops evtchn_ops_fifo = {
        .unmask            = evtchn_fifo_unmask,
        .handle_events     = evtchn_fifo_handle_events,
        .resume            = evtchn_fifo_resume,
+       .close             = evtchn_fifo_close,
 };
 
 static int evtchn_fifo_alloc_control_block(unsigned cpu)
@@ -408,6 +458,36 @@ static int evtchn_fifo_alloc_control_block(unsigned cpu)
        return ret;
 }
 
+
+static void evtchn_fifo_drain_queues(int cpu)
+{
+       int irq;
+       struct irq_desc *desc;
+
+       for_each_irq_desc(irq, desc) {
+               const struct cpumask *affinity;
+               struct irq_data *data;
+               struct irq_chip *chip;
+
+               /* interrupts are disabled at this point */
+               raw_spin_lock(&desc->lock);
+
+               data = irq_desc_get_irq_data(desc);
+               affinity = data->affinity;
+               if (cpumask_subset(affinity, cpu_online_mask)) {
+                       raw_spin_unlock(&desc->lock);
+                       continue;
+               }
+
+               chip = irq_data_get_irq_chip(data);
+               if (chip->irq_mask)
+                       chip->irq_mask(data);
+
+               raw_spin_unlock(&desc->lock);
+       }
+       __evtchn_fifo_handle_events(cpu, true);
+}
+
 static int evtchn_fifo_cpu_notification(struct notifier_block *self,
                                                  unsigned long action,
                                                  void *hcpu)
@@ -420,6 +500,9 @@ static int evtchn_fifo_cpu_notification(struct notifier_block *self,
                if (!per_cpu(cpu_control_block, cpu))
                        ret = evtchn_fifo_alloc_control_block(cpu);
                break;
+       case CPU_DYING:
+               evtchn_fifo_drain_queues(cpu);
+               break;
        default:
                break;
        }
diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h
index 50c2050a..3f493f2 100644
--- a/drivers/xen/events/events_internal.h
+++ b/drivers/xen/events/events_internal.h
@@ -68,6 +68,7 @@ struct evtchn_ops {
        bool (*test_and_set_mask)(unsigned port);
        void (*mask)(unsigned port);
        void (*unmask)(unsigned port);
+       void (*close)(unsigned port);
 
        void (*handle_events)(unsigned cpu);
        void (*resume)(void);
@@ -145,6 +146,11 @@ static inline void xen_evtchn_resume(void)
                evtchn_ops->resume();
 }
 
+static inline void xen_evtchn_close(unsigned port)
+{
+       evtchn_ops->close(port);
+}
+
 void xen_evtchn_2l_init(void);
 int xen_evtchn_fifo_init(void);
 
-- 
2.1.4
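
To recap why the spin in __evtchn_fifo_close() above always terminates,
here is the same function with the two cases from the commit message
spelled out as comments (illustrative only, not an additional change):

    static void __evtchn_fifo_close(unsigned long port)
    {
            struct evtchn_close close;

            /*
             * Bound VCPU online: it will eventually consume the event
             * and clear LINKED, so this spin terminates.
             *
             * Bound VCPU offline: it can never clear LINKED itself,
             * but the CPU_DYING notifier has already drained its
             * queues (dropping any pending events) via
             * __evtchn_fifo_handle_events(cpu, true), leaving LINKED
             * clear before we get here.
             */
            while (evtchn_fifo_is_linked(port))
                    cpu_relax();

            close.port = port;
            if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
                    BUG();
    }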

