
Re: Serious AMD-Vi(?) issue



On Thu, Jun 27, 2024 at 05:18:15PM -0700, Elliott Mitchell wrote:
> I'm rather surprised it was so long before the next system restart.  
> Seems a quiet period as far as security updates go.  Good news is I made
> several new observations, but I don't know how valuable these are.
> 
> On Mon, May 13, 2024 at 10:44:59AM +0200, Roger Pau Monné wrote:
> > 
> > Does booting with `iommu=no-intremap` lead to any issues being
> > reported?
> 
> On boot there was in fact less.  Notably the "AMD-Vi" messages haven't
> shown up at all.  I haven't stressed it very much yet, but previous
> boots a message showed up the moment the MD-RAID1 driver was loaded.
> 
> 
> I am though seeing two different messages now:
> 
> (XEN) CPU#: No irq handler for vector # (IRQ -#, LAPIC)
> (XEN) IRQ# a=#[#,#] v=#[#] t=PCI-MSI s=#
> 
> These are to be appearing in pairs.  Multiple values show for each field,
> though each field appears to vary between 2-3 different values.  There
> are thousands of these messages showing up.

Thanks to some lucky timing, I've done some more experimentation and
sampling.

The "(XEN) IRQ" line almost always shows up with the "(XEN) CPU" line.
I notice it is possible to generate the first without the second, so this
seems notable.  Every single "(XEN) CPU" line mentioned "LAPIC".

In the small number (20) of cases where the "(XEN) IRQ" line did not show
up, the "(XEN) CPU" line always ended with "(IRQ -2147483648, LAPIC)".

For the "t=" value out of 316 samples, 94 listed "PCI-MSI" while 222
listed "PCI-MSI/-X".

For the IRQ, 72 occurred 126 times.  71, 73 and 108 occurred roughly 50
times each. 109 and 111 occurred under 10 times.  Almost no other IRQ
values appeared.

The "s=" value was "00000030" slightly more often than "00000010".  No
other values have been observed so far.

The other values didn't show many patterns.

Most processors were mentioned roughly equally.  Several had fewer
mentions, but not enough to seem significant.  I discovered processor 1
did NOT show up, whereas processor 0 had an above-average number of
occurrences.  This seems notable as these 2 processors are both reserved
exclusively for domain 0.
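
For anyone wanting to repeat this kind of sampling, here is a minimal
sketch of how per-field tallies like the ones above could be produced from
a saved console log.  It is not the method used for the numbers in this
message.  The two regular expressions are assumptions based only on the
message templates quoted earlier (the exact field widths and number bases
in real output may differ), and the names tally(), CPU_RE and IRQ_RE are
hypothetical.

```python
import re
from collections import Counter

# Assumed shapes of the two messages, derived from the quoted templates:
#   (XEN) CPU#: No irq handler for vector # (IRQ -#, LAPIC)
#   (XEN) IRQ# a=#[#,#] v=#[#] t=PCI-MSI s=#
CPU_RE = re.compile(
    r"\(XEN\) CPU(?P<cpu>\d+): No irq handler for vector "
    r"(?P<vector>[0-9a-fA-F]+) \(IRQ (?P<irq>-?\d+), LAPIC\)"
)
IRQ_RE = re.compile(
    r"\(XEN\) IRQ\s*(?P<irq>\d+) a=\S+ v=\S+ "
    r"t=(?P<type>\S+) s=(?P<status>[0-9a-fA-F]+)"
)

INT32_MIN = -2**31  # -2147483648, the value seen on the unpaired lines


def tally(lines):
    """Count field values and collect CPU lines with no IRQ line after them."""
    counts = {name: Counter() for name in ("cpu", "irq", "type", "status")}
    unpaired = []   # IRQ values from CPU lines that never got a paired IRQ line
    pending = None  # IRQ value of the most recent CPU line still awaiting a pair
    for line in lines:
        m = CPU_RE.search(line)
        if m:
            if pending is not None:
                unpaired.append(pending)
            counts["cpu"][int(m["cpu"])] += 1
            pending = int(m["irq"])
            continue
        m = IRQ_RE.search(line)
        if m:
            counts["irq"][int(m["irq"])] += 1
            counts["type"][m["type"]] += 1
            counts["status"][m["status"]] += 1
            pending = None
    if pending is not None:
        unpaired.append(pending)
    return counts, unpaired
```

Feeding it the lines of a captured log (for instance saved output from
`xl dmesg`) and printing counts["irq"].most_common() would give the
per-IRQ frequencies.  Checking all(v == INT32_MIN for v in unpaired) would
confirm whether every unpaired "(XEN) CPU" line carries the -2147483648
value.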

There have also been a few "spurious 8259A interrupt" lines.  So far
there haven't been very many of these.  The processor and IRQ listed
don't yet appear to show any patterns.  So far no IRQ has been listed
twice.


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sigmsg@xxxxxxx  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445





 

