
Re: [PATCH] xen/netfront: Fix TX response spurious interrupts



On Mon, Jul 14, 2025 at 07:11:06AM +0000, Anthoine Bourgeois wrote:
> On Fri, Jul 11, 2025 at 05:33:43PM +0200, Juergen Gross wrote:
> >On 10.07.25 18:11, Anthoine Bourgeois wrote:
> >> We found at Vates that there are a lot of spurious interrupts when
> >> benchmarking the PV drivers of Xen. This issue appeared with a patch
> >> that addresses security issue XSA-391 (see Fixes below). In an iperf
> >> benchmark, spurious interrupts can represent up to 50% of the
> >> interrupts.
> >>
> >> Spurious interrupts are interrupts that are raised for nothing: there is
> >> no work to do. This happens because the function that handles the
> >> interrupts ("xennet_tx_buf_gc") is also called at the end of the request
> >> path to garbage collect the responses received during the transmission
> >> load.
> >>
> >> The request path is doing the work that the interrupt handler would
> >> otherwise have done. This is particularly true when there is more than
> >> one vCPU, and it gets worse linearly with the number of vCPUs/queues.
> >>
> >> Moreover, this problem is amplified by the penalty imposed on a spurious
> >> interrupt. When an interrupt is found spurious, the interrupt chip will
> >> delay the EOI to slow down the backend. This delay allows more
> >> responses to be handled by the request path, so the next interrupt is
> >> even more likely to find no work to do, creating yet another
> >> spurious interrupt.
> >>
> >> This causes a performance issue. The solution here is to remove the calls
> >> from the request path and let the interrupt handler do the processing of
> >> the responses. This approach removes spurious interrupts (<0.05%) and
> >> also has the benefit of freeing up cycles in the request path, allowing
> >> it to process more work, which improves performance compared to masking
> >> the spurious interrupt one way or another.
> >>
> >> Some vif throughput performance figures from an 8-vCPU, 4 GB RAM HVM
> >> guest(s):
> >>
> >> With XSA-391 but without this patch:
> >> vm -> dom0: 4.5Gb/s
> >> vm -> vm:   7.0Gb/s
> >>
> >> Without XSA-391 patch (revert of b27d47950e48):
> >> vm -> dom0: 8.3Gb/s
> >> vm -> vm:   8.7Gb/s
> >>
> >> With XSA-391 and this patch:
> >> vm -> dom0: 11.5Gb/s
> >> vm -> vm:   12.6Gb/s
> >>
> >> Fixes: b27d47950e48 ("xen/netfront: harden netfront against event channel storms")
> >> Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@xxxxxxxxxx>
> >
> >Please resend this patch with the relevant maintainers added in the
> >recipients list.
> 
> Ok, I will resend the patch tomorrow.
> >
> >You can add my Reviewed-by: tag, of course.
> 
> Thanks!

Tested on a VM where this could be tried.

Booting was successful, and the network appeared to function as before.
Spurious events continued to occur at roughly the same rate as they had
been.

I can well believe this increases Xen network performance and may
reduce the occurrence of spurious interrupts, but it certainly doesn't
fully fix the problem(s).  Appears you're going to need to keep digging.

I believe this does count as Tested-by since I observed no new ill
effects.  Just the existing ill effects aren't fully solved.


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
