|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 for-xen-4.5 1/3] dpci: Move from domain centric model to hvm_dirq_dpci model.
On Mon, Sep 29, 2014 at 08:21:53AM +0100, Jan Beulich wrote:
> >>> On 27.09.14 at 03:32, <konrad.wilk@xxxxxxxxxx> wrote:
> > On Thu, Sep 25, 2014 at 04:04:46PM +0100, Jan Beulich wrote:
> >> >>> On 25.09.14 at 16:48, <konrad.wilk@xxxxxxxxxx> wrote:
> >> > On Thu, Sep 25, 2014 at 03:24:53PM +0100, Jan Beulich wrote:
> >> >> >>> On 23.09.14 at 04:10, <konrad.wilk@xxxxxxxxxx> wrote:
> >> >> > @@ -130,6 +146,7 @@ int pt_irq_create_bind(
> >> >> > return -ENOMEM;
> >> >> > }
> >> >> > pirq_dpci = pirq_dpci(info);
> >> >> > + pt_pirq_reset(d, pirq_dpci);
> >> >> >
> >> >> > switch ( pt_irq_bind->irq_type )
> >> >> > {
> >> >> > @@ -232,7 +249,6 @@ int pt_irq_create_bind(
> >> >> > {
> >> >> > unsigned int share;
> >> >> >
> >> >> > - pirq_dpci->dom = d;
> >> >> > if ( pt_irq_bind->irq_type == PT_IRQ_TYPE_MSI_TRANSLATE )
> >> >> > {
> >> >> > pirq_dpci->flags = HVM_IRQ_DPCI_MAPPED |
> >> >> > @@ -258,7 +274,6 @@ int pt_irq_create_bind(
> >> >> > {
> >> >> > if ( pt_irq_need_timer(pirq_dpci->flags) )
> >> >> > kill_timer(&pirq_dpci->timer);
> >> >> > - pirq_dpci->dom = NULL;
> >> >> > list_del(&girq->list);
> >> >> > list_del(&digl->list);
> >> >> > hvm_irq_dpci->link_cnt[link]--;
> >> >> > @@ -391,7 +406,6 @@ int pt_irq_destroy_bind(
> >> >> > msixtbl_pt_unregister(d, pirq);
> >> >> > if ( pt_irq_need_timer(pirq_dpci->flags) )
> >> >> > kill_timer(&pirq_dpci->timer);
> >> >> > - pirq_dpci->dom = NULL;
> >> >> > pirq_dpci->flags = 0;
> >> >> > pirq_cleanup_check(pirq, d);
> >> >> > }
> >> >>
> >> >> Is all of the above really necessary? I.e. I can neither see why setting
> >> >> ->dom earlier is needed, nor why clearing it on the error paths should
> >> >> be dropped.
> >> >
> >> > Yes. We need the ->dom so that the hvm_dirq_assist can run without
> >> > hitting an NULL pointer exception. Please keep in mind that the moment
> >> > we setup the IRQ action handler, we are "live" - [...]
> >>
> >> But all you need is that this happens before respective
> >> pirq_guest_bind() calls. I.e. in the PT_IRQ_TYPE_PCI and
> >> PT_IRQ_TYPE_MSI_TRANSLATE cases it was already done early enough
> >> (avoiding it remaining set on error paths), so all you'd need is adding
> >> it for the PT_IRQ_TYPE_MSI path too.
> >
> > .. and with the below statement ["pt_pirq_reset() is just replacing
> > that assigment"] I believe having just
> > pirq->dom = d;
> >
> > before the switch would be correct.
> >
> >>
> >> I agree that the clearing of the field in error paths might need a
> >> little care, but otoh you could equally well have hvm_dirq_assist()
> >> bail when it finds it to be NULL?
> >
> > Can't do it as is. I added in the 'get_knowalive_domain()' and
> > 'put_domain()'
> > in the 'schedule_softirq_for' and 'dpci_softirq' respectively.
> >
> > Which means that we would have an outstanding refcount as 'dpci_softirq'
> > could not access the '->dom' to get the domain and do proper refcounting.
> >
> > Unless I rip out the refcounting there which I believe is OK.
>
> No - instead the site clearing the field should then drop the
> reference (if so needed as indicated by other state). I.e.
> either there is a softirq being processed (in which case it's
> there where the reference gets dropped) or you can stop it
> from getting processed subsequently and drop the reference
> right away. Takes maybe another cmpxchg() afaict (on the
> ->dom field) or in the worst case another flag to signal this.
I've been looking at this and I've come up with something that does
address this - but instead of using 'cmpxchg' it ends up using
set_bit/clear_bit with two different states. Using 'cmpxchg' was
getting a bit too complex - or more likely I was getting too
fatigued of the code.
Anyhow here is an RFC on top of this patchset that I believe does
in spirit what you described. Naturally the whole big piece
of code that does the ref_counting on failure paths needs to be done
in an function. Please advise if this is acceptable for you
and if so I can properly flesh it out.
commit e0647b03be8319907d4f2dc424c1fcaa4895b31a
Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Sun Oct 5 09:44:14 2014 -0400
Two bits using clear/set
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 47c8ed5..cae61f1 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -29,6 +29,10 @@
static DEFINE_PER_CPU(struct list_head, dpci_list);
+enum {
+ STATE_SCHED, /* Bit 0 */
+ STATE_RUN,
+};
/*
* Should only be called from hvm_do_IRQ_dpci. We use the
* 'masked' as an gate to thwart multiple interrupts.
@@ -37,14 +41,14 @@ static DEFINE_PER_CPU(struct list_head, dpci_list);
* completed executing 'hvm_dirq_assist'.
*
*/
-static void schedule_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
+static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci)
{
unsigned long flags;
- if ( pirq_dpci->masked )
+ if ( test_and_set_bit(STATE_SCHED, &pirq_dpci->masked) )
return;
- pirq_dpci->masked = 1;
+ get_knownalive_domain(pirq_dpci->dom);
local_irq_save(flags);
list_add_tail(&pirq_dpci->softirq_list, &this_cpu(dpci_list));
@@ -55,9 +59,9 @@ static void schedule_softirq_for(struct hvm_pirq_dpci
*pirq_dpci)
/*
* If we are racing with softirq_dpci (masked is still set) we return
- * -EAGAIN. Otherwise we return 0.
+ * -ERESTART. Otherwise we return 0.
*
- * If it is -EAGAIN, it is the callers responsibility to make sure
+ * If it is -ERESTART, it is the callers responsibility to make sure
* that the softirq (with the event_lock dropped) has ran. We need
* to flush out the outstanding 'dpci_softirq' (no more of them
* will be added for this pirq as the IRQ action handler has been
@@ -65,10 +69,13 @@ static void schedule_softirq_for(struct hvm_pirq_dpci
*pirq_dpci)
*/
int pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci)
{
- if ( pirq_dpci->masked )
- return -EAGAIN;
+ if ( test_bit(STATE_RUN, &pirq_dpci->masked) )
+ return -ERESTART;
+
+ if ( test_bit(STATE_SCHED, &pirq_dpci->masked) )
+ return -ERESTART;
/*
- * If in the future we would call 'schedule_softirq_for' right away
+ * If in the future we would call 'raise_softirq_for' right away
* after 'pt_pirq_softirq_active' we MUST reset the list (otherwise it
* might have stale data).
*/
@@ -143,7 +150,6 @@ int pt_irq_create_bind(
struct hvm_pirq_dpci *pirq_dpci;
struct pirq *info;
int rc, pirq = pt_irq_bind->machine_irq;
- s_time_t start = NOW();
if ( pirq < 0 || pirq >= d->nr_pirqs )
return -EINVAL;
@@ -188,8 +194,6 @@ int pt_irq_create_bind(
{
spin_unlock(&d->event_lock);
process_pending_softirqs();
- if ( ( NOW() - start ) > SECONDS(1) )
- return -EAGAIN;
goto restart;
}
/*
@@ -230,7 +234,35 @@ int pt_irq_create_bind(
{
pirq_dpci->gmsi.gflags = 0;
pirq_dpci->gmsi.gvec = 0;
- pirq_dpci->dom = NULL;
+
+ /* The softirq has saved it so we are safe to reset it. */
+ if ( test_bit(STATE_RUN, &pirq_dpci->masked) )
+ {
+ pirq_dpci->dom = NULL;
+ }
+ else if ( test_and_clear_bit(STATE_SCHED, &pirq_dpci->masked) )
+ {
+ /* pirq_guest_unbind has made sure we won't be getting
+ * any more interrupts (no raise_softirq_for), so the only
+ * ones that can be are the ones that got scheduled _right_
+ * before the pirq_guest_unbind. If we can de-schedule them
+ * that is good. The problem #1 is that the softirq might
be
+ * running and it has just set STATE_RUN while we cleared
+ * STATE_SCHED. That is OK, as right after the STATE_RUN it
+ * will check the STATE_SCHED and if cleared it will
unclear
+ * STATE_RUN and ignore this pirq. We MUST put the refcount
+ * down. */
+ put_domain(pirq_dpci->dom);
+ pirq_dpci->dom = NULL;
+ }
+ else
+ {
+ /* !STATE_RUN (stale) and !STATE_SCHED.
+ * softirq will ignore this pirq, but we MUST put the
refcount
+ * down. */
+ put_domain(pirq_dpci->dom);
+ pirq_dpci->dom = NULL;
+ }
pirq_dpci->flags = 0;
pirq_cleanup_check(info, d);
spin_unlock(&d->event_lock);
@@ -332,7 +364,7 @@ int pt_irq_create_bind(
{
if ( pt_irq_need_timer(pirq_dpci->flags) )
kill_timer(&pirq_dpci->timer);
- pirq_dpci->dom = NULL;
+ /* TODO - function. pirq_dpci->dom = NULL; */
list_del(&girq->list);
list_del(&digl->list);
hvm_irq_dpci->link_cnt[link]--;
@@ -538,7 +570,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
!(pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
return 0;
- schedule_softirq_for(pirq_dpci);
+ raise_softirq_for(pirq_dpci);
return 1;
}
@@ -592,11 +624,10 @@ void hvm_dpci_msi_eoi(struct domain *d, int vector)
spin_unlock(&d->event_lock);
}
-static void hvm_dirq_assist(struct hvm_pirq_dpci *pirq_dpci)
+static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
{
- struct domain *d = pirq_dpci->dom;
-
/*
+ * TODO: Remove
* We can be racing with 'pt_irq_destroy_bind' - with us being scheduled
* right before 'pirq_guest_unbind' gets called - but us not yet executed.
*
@@ -604,8 +635,6 @@ static void hvm_dirq_assist(struct hvm_pirq_dpci *pirq_dpci)
* 'mapping' - which is OK as later in this code we would
* do nothing except clear the ->masked field anyhow.
*/
- if ( !d )
- return;
ASSERT(d->arch.hvm_domain.irq.dpci);
@@ -725,11 +754,24 @@ static void dpci_softirq(void)
while ( !list_empty(&our_list) )
{
+ struct domain *d;
+
pirq_dpci = list_entry(our_list.next, struct hvm_pirq_dpci,
softirq_list);
list_del(&pirq_dpci->softirq_list);
- hvm_dirq_assist(pirq_dpci);
- pirq_dpci->masked = 0;
+ d = pirq_dpci->dom;
+ barrier(); /* 'd' MUST be saved before we set/clear the bits. */
+ if ( test_and_set_bit(STATE_RUN, &pirq_dpci->masked) )
+ BUG();
+ /* Asked to be descheduled. */
+ if ( !test_and_clear_bit(STATE_SCHED, &pirq_dpci->masked) )
+ {
+ clear_bit(STATE_RUN, &pirq_dpci->masked);
+ continue;
+ }
+ hvm_dirq_assist(d, pirq_dpci);
+ put_domain(d);
+ clear_bit(STATE_RUN, &pirq_dpci->masked);
}
}
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index d147189..1660750 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -775,7 +775,7 @@ static int pci_clean_dpci_irqs(struct domain *d)
struct hvm_irq_dpci *hvm_irq_dpci = NULL;
if ( !iommu_enabled )
- return -ESRCH;
+ return -ENODEV;
if ( !is_hvm_domain(d) )
return -EINVAL;
diff --git a/xen/include/xen/hvm/irq.h b/xen/include/xen/hvm/irq.h
index 4fc6dcf..4a9324e 100644
--- a/xen/include/xen/hvm/irq.h
+++ b/xen/include/xen/hvm/irq.h
@@ -93,7 +93,7 @@ struct hvm_irq_dpci {
/* Machine IRQ to guest device/intx mapping. */
struct hvm_pirq_dpci {
uint32_t flags;
- bool_t masked;
+ unsigned long masked;
uint16_t pending;
struct list_head digl_list;
struct domain *dom;
of 'latch'
>
> Jan
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |