[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs



On 06.03.2020 17:02, paul@xxxxxxx wrote:
> From: Varad Gautam <vrd@xxxxxxxxx>
> 
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
> In that scenario, it is possible to receive multiple __pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
>   domain_kill()
>     -> domain_relinquish_resources()
>       -> pci_release_devices()
>         -> pci_clean_dpci_irq()
>           -> pirq_guest_unbind()
>             -> __pirq_guest_unbind()
> 
> For a shared pirq (nr_guests > 1), the first call would zap the current
> domain from the pirq's guests[] list, but the action handler is never freed
> as there are other guests using this pirq. As a result, on the second call,
> __pirq_guest_unbind searches for the current domain which has been removed
> from the guests[] list, and hits a BUG_ON.
> 
> Make __pirq_guest_unbind safe to be called multiple times by letting xen
> continue if a shared pirq has already been unbound from this guest. The
> PIRQ will be cleaned up from the domain's pirq_tree during the destruction
> in complete_domain_destroy anyway.
> 
> Signed-off-by: Varad Gautam <vrd@xxxxxxxxx>
> [taking over from Varad at v4]
> Signed-off-by: Paul Durrant <paul@xxxxxxx>
> ---
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien@xxxxxxx>
> Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> 
> Roger suggested cleaning the entry from the domain pirq_tree so that
> we need not make it safe to re-call __pirq_guest_unbind(). This seems like
> a reasonable suggestion but the semantics of the code are almost
> impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
> the name of struct so you generally have little idea what it actally means)
> so I prefer to stick with a small fix that I can actually reason about.
> 
> v4:
>  - Re-work the guest array search to make it clearer

I.e. there are cosmetic differences to v3 (see below), but
technically it's still the same. I can't believe the re-use
of "pirq" for different entities is this big of a problem.
But anyway:

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1680,9 +1680,23 @@ static irq_guest_action_t *__pirq_guest_unbind(
>  
>      BUG_ON(!(desc->status & IRQ_GUEST));
>  
> -    for ( i = 0; (i < action->nr_guests) && (action->guest[i] != d); i++ )
> -        continue;
> -    BUG_ON(i == action->nr_guests);
> +    for ( i = 0; i < action->nr_guests; i++ )
> +        if ( action->guest[i] == d )
> +            break;
> +
> +    if ( i == action->nr_guests ) /* No matching entry */
> +    {
> +        /*
> +         * In case the pirq was shared, unbound for this domain in an earlier
> +         * call, but still existed on the domain's pirq_tree, we still reach
> +         * here if there are any later unbind calls on the same pirq. Return
> +         * if such an unbind happens.
> +         */
> +        ASSERT(action->shareable);
> +        return NULL;
> +    }
> +
> +    ASSERT(action->nr_guests > 0);

This seems pointless to have here - v3 had it inside the if(),
where it would actually guard against coming here with nr_guests
equal to zero. v3 also used if() and BUG() instead of ASSERT()
inside this if(), which to me would seem more in line with our
current ./CODING_STYLE guidelines of handling unexpected
conditions. Could you clarify why you switched things?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.