Re: AMD EPYC virtual network performances



On Thu, Aug 15, 2024 at 10:33:52AM +0200, Jürgen Groß wrote:
> On 14.08.24 22:26, Elliott Mitchell wrote:
> > On Wed, Aug 14, 2024 at 08:15:38AM +0200, Jürgen Groß wrote:
> > > On 14.08.24 00:36, Elliott Mitchell wrote:
> > > > 
> > > > Drat.  I haven't noticed much, which would match with simply enabling a
> > > > bunch of debugging printk()s (alas I'm not monitoring performance 
> > > > closely
> > > > enough to be sure).  Guess I wait for Andrei Semenov to state a comfort
> > > > level with trying "iommu=debug".
> > > 
> > > You didn't answer my question.
> > 
> > I guess I did not explicitly do so.  I was referring to the Xen
> > command-line option.
> 
> And again you didn't supply the information I asked for (command line
> options of Xen and dom0).

I had thought that was aimed at the initial reporter, Andrei Semenov.
I've already supplied all of the options which could plausibly affect this.

Xen: placeholder watchdog=true loglvl=info iommu=debug iommu=no-intremap 
cpuidle dom0_mem=... no-real-mode edd=off

Linux: placeholder root=... ro concurrency=none vga=normal xen_pciback.hide=...
net.ifnames=1 nomodeset blacklist=...

The other options really don't seem likely to affect either issue, since
they don't affect interrupts (granted, some drivers don't get loaded, but
without those drivers the problem would be irreproducible).
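
In case it helps, here is a small Python sketch of the sort of thing I
run in dom0 to capture both command lines verbatim for a report.  It
assumes the xl toolstack is installed; the "xen_commandline" field
comes from `xl info`:

    #!/usr/bin/env python3
    # Minimal sketch: gather the Xen and dom0 command lines for a
    # bug report.  Assumes it runs in dom0 with the xl toolstack
    # installed; field names are from my setup.
    import subprocess

    def xen_cmdline():
        # `xl info` prints "xen_commandline : <options>" among
        # other fields; pick out just that one.
        out = subprocess.run(["xl", "info"], capture_output=True,
                             text=True, check=True).stdout
        for line in out.splitlines():
            key, _, value = line.partition(":")
            if key.strip() == "xen_commandline":
                return value.strip()
        return ""

    def dom0_cmdline():
        with open("/proc/cmdline") as f:
            return f.read().strip()

    if __name__ == "__main__":
        print("Xen:  ", xen_cmdline())
        print("Linux:", dom0_cmdline())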

> Did you consider that asking for help while not supplying data which has
> been asked for is going to result in no help at all? You are wasting the
> time of volunteers, which will reduce the motivation to look into your
> issue a lot.

Sorry about being a bit inconsistent, but there is little I can do about
that in the short term.  I would *strongly* prefer to keep this
information protected (PGP).  Large commercial installations have a
rather different risk/privacy threshold.

It is also notable that I am not the sole reporter of any of these
issues.


My speculation was wrong: the "CPU*: No irq handler for vector..."
messages are not strictly tied to the HVM domain.  I think I'm simply
reproducing what Andrei Semenov observed: HVM domains are more severely
affected by this issue.
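
For anyone wanting to check whether their machine shows the same
pattern, here is a quick sketch to tally those messages per CPU.  It
assumes the messages land in the hypervisor log (i.e. `xl dmesg`), and
the regex is deliberately loose since the exact wording may vary
between Xen versions:

    #!/usr/bin/env python3
    # Minimal sketch: tally "No irq handler for vector" messages
    # per CPU from the Xen console log (`xl dmesg`), to see whether
    # they track one domain's activity.
    import re
    import subprocess
    from collections import Counter

    PATTERN = re.compile(r"CPU\s*(\d+).*No irq handler for vector",
                         re.IGNORECASE)

    def tally():
        out = subprocess.run(["xl", "dmesg"], capture_output=True,
                             text=True, check=True).stdout
        counts = Counter()
        for line in out.splitlines():
            m = PATTERN.search(line)
            if m:
                counts[int(m.group(1))] += 1
        return counts

    if __name__ == "__main__":
        for cpu, n in sorted(tally().items()):
            print(f"CPU{cpu}: {n}")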

I think I need to repeat an earlier observation of mine: it is NOT just
vifs being affected by this; event channels associated with virtual
block devices are also seeing spurious interrupts, though at a lower rate.
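
To put a number on "a lower rate", something like the following can
sample /proc/interrupts in the relevant domain and report per-device
event-channel interrupt rates.  The "xen-dyn" tag and the vif*/blkif*
device names match what I see on my systems but may differ elsewhere:

    #!/usr/bin/env python3
    # Minimal sketch: sample /proc/interrupts twice and report
    # per-second rates for Xen event-channel lines, to compare
    # vif against vbd traffic.
    import time

    def snapshot():
        counts = {}
        with open("/proc/interrupts") as f:
            for line in f:
                if "xen-dyn" not in line:
                    continue
                fields = line.split()
                name = fields[-1]   # e.g. vif1.0-q0-tx, blkif...
                # Sum the per-CPU columns (the purely numeric fields).
                total = sum(int(x) for x in fields[1:] if x.isdigit())
                counts[name] = total
        return counts

    if __name__ == "__main__":
        interval = 10.0
        before = snapshot()
        time.sleep(interval)
        after = snapshot()
        for name in sorted(after):
            rate = (after[name] - before.get(name, 0)) / interval
            print(f"{name:30s} {rate:8.1f} irq/s")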

What comes to mind is that vifs generate interrupts at a much greater
rate than vbds.  I suspect a timing issue involving closely spaced
interrupts; perhaps there is a simple timing issue in the vif/vbd
protocol.  The problem is: how would these turn into spurious
interrupts observed by Xen?  (I had suspected QEMU for the HVM domain,
but that was disproved.)
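
To make the timing idea concrete, here is a toy, hand-picked
interleaving of the shared-ring notification hand-off, loosely modeled
on ring.h's RING_PUSH_REQUESTS_AND_CHECK_NOTIFY and
RING_FINAL_CHECK_FOR_REQUESTS (all names below are mine).  It shows
how a benign spurious event-channel wakeup can arise; whether that has
anything to do with the vectors Xen complains about is exactly the
open question:

    #!/usr/bin/env python3
    # Toy, sequential replay of the benign notification race in the
    # Xen shared-ring protocol.  The interleaving is hand-picked to
    # show one event arriving after the ring is already drained.

    class Ring:
        def __init__(self):
            self.req_prod = 0    # requests produced by the frontend
            self.req_cons = 0    # requests consumed by the backend
            self.req_event = 1   # backend wants an event at this prod

    ring = Ring()
    pending_events = 0

    def frontend_push():
        """Produce one request; notify iff req_event is in (old, new]."""
        global pending_events
        old, new = ring.req_prod, ring.req_prod + 1
        ring.req_prod = new
        if old < ring.req_event <= new:
            pending_events += 1

    def backend_consume_all():
        while ring.req_cons < ring.req_prod:
            ring.req_cons += 1

    # 1. Frontend pushes request 1; req_event == 1, so it notifies.
    frontend_push()                      # pending_events -> 1

    # 2. Backend takes the event, drains the ring, then (final check)
    #    re-arms by setting req_event just past what it consumed...
    pending_events -= 1
    backend_consume_all()
    ring.req_event = ring.req_cons + 1

    # 3. ...but before the backend's re-check, the frontend pushes
    #    request 2.  The push sees req_event == 2 in range, so it
    #    sends another event.
    frontend_push()                      # pending_events -> 1

    # 4. The backend's re-check (still awake) sees request 2 and
    #    consumes it without needing that event.
    backend_consume_all()

    # 5. The queued event is delivered later: the backend wakes,
    #    finds the ring empty, and the wakeup looks spurious.
    pending_events -= 1
    assert ring.req_cons == ring.req_prod and pending_events == 0
    print("spurious wakeup: event delivered with no work pending")

In the real protocol this extra wakeup is expected and harmless in the
guest; the puzzle remains why anything would surface at the Xen vector
level.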


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
