[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 2/3] pirq_cleanup_check() leaks



On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
> Its original introduction had two issues: For one the "common" part of
> the checks (carried out in the macro) was inverted.

Is the current logic in evtchn_close() really malfunctioning?

pirq->evtchn = 0;
pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
    unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains

It would seem to me the pirq_cleanup_check() call just after setting
evtchn = 0 was done to account for PV domains, while the second
(hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
do the cleanup for HVM domains.

Maybe there's something that I'm missing, I have to admit the PIRQ
logic is awfully complicated, even more when we mix the HVM PIRQ
stuff.

> And then after
> removal from the radix tree the structure wasn't scheduled for freeing.
> (All structures still left in the radix tree would be freed upon domain
> destruction, though.)

So if my understanding is correct, we didn't have a leak due to the
missing free_pirq_struct() because the inverted check in
pirq_cleanup_check() macro prevented the removal from the radix tree,
and so stale entries would be left there and freed at domain
destruction?

> For the freeing to be safe even if it didn't use RCU (i.e. to avoid use-
> after-free), re-arrange checks/operations in evtchn_close(), such that
> the pointer wouldn't be used anymore after calling pirq_cleanup_check()
> (noting that unmap_domain_pirq_emuirq() itself calls the function in the
> success case).
> 
> Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree")
> Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior")

See the comment above about the call to pirq_cleanup_check() in
unmap_domain_pirq_emuirq(), which might imply 79858fee307c is not
wrong, just confusing.

> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> This is fallout from me looking into whether the function and macro of
> the same name could be suitably split, to please Misra rule 5.5. The
> idea was apparently that the check done in the macro is the "common"
> part, and the actual function would be per-architecture. Pretty clearly
> this, if need be, could also be achieved by naming the actual function
> e.g. arch_pirq_cleanup_check().
> 
> Despite my testing of this (to a certain degree), I'm wary of the
> change, since the code has been the way it was for about 12 years. It
> feels like I'm overlooking something crucial ...
> 
> The wrong check is also what explains why Arm got away without
> implementing the function (prior to "restrict concept of pIRQ to x86"):
> The compiler simply eliminated the two calls from event_channel.c.
> ---
> v3: New.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1349,6 +1349,7 @@ void (pirq_cleanup_check)(struct pirq *p
>  
>      if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
>          BUG();
> +    free_pirq_struct(pirq);
>  }
>  
>  /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
>              if ( !is_hvm_domain(d1) )
>                  pirq_guest_unbind(d1, pirq);
>              pirq->evtchn = 0;
> -            pirq_cleanup_check(pirq, d1);
> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 
> )
> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
> +            if ( !is_hvm_domain(d1) ||
> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )

pirq_cleanup_check() already calls pirq_cleanup_check() itself.  Could
you please add a comment to note that unmap_domain_pirq_emuirq()
succeeding implies the call to pirq_cleanup_check() has already been
done?

Otherwise the logic here looks unbalanced by skipping the
pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.

> +                pirq_cleanup_check(pirq, d1);
>          }
>          unlink_pirq_port(chn1, d1->vcpu[chn1->notify_vcpu_id]);
>          break;
> --- a/xen/include/xen/irq.h
> +++ b/xen/include/xen/irq.h
> @@ -158,7 +158,7 @@ extern struct pirq *pirq_get_info(struct
>  void pirq_cleanup_check(struct pirq *, struct domain *);
>  
>  #define pirq_cleanup_check(pirq, d) \
> -    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> +    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)

Not that you need to fix it here, but why not place this check in
pirq_cleanup_check() itself?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.