
Re: NetBSD dom0 PVH: hardware interrupts stalls



On Wed, Nov 18, 2020 at 11:00:25AM +0100, Roger Pau Monné wrote:
> On Wed, Nov 18, 2020 at 10:24:25AM +0100, Manuel Bouyer wrote:
> > On Wed, Nov 18, 2020 at 09:57:38AM +0100, Roger Pau Monné wrote:
> > > On Tue, Nov 17, 2020 at 05:40:33PM +0100, Manuel Bouyer wrote:
> > > > On Tue, Nov 17, 2020 at 04:58:07PM +0100, Roger Pau Monné wrote:
> > > > > [...]
> > > > > 
> > > > > I have attached a patch below that will dump the vIO-APIC info as part
> > > > > of the 'i' debug key output, can you paste the whole output of the 'i'
> > > > > debug key when the system stalls?
> > > > 
> > > > see attached file. Note that the kernel did unstall while 'i' output was
> > > > being printed, so it is mixed with some NetBSD kernel output.
> > > > The idt entry of the 'ioapic2 pin2' interrupt is 103 on CPU 0.
> > > > 
> > > > I also put the whole sequence at
> > > > http://www-soc.lip6.fr/~bouyer/xen-log3.txt
> > > 
> > > On one of the instances the pin shows up as masked, but I'm not sure
> > > if that's relevant since later it shows up as unmasked. Might just be
> > > part of how NetBSD handles such interrupts.
> > 
> > Yes, NetBSD can mask an interrupt source if the interrupt needs to be
> > delayed.
> > It will be unmasked once the interrupt has been handled.
> 
> Yes, I think that's roughly the same model that FreeBSD uses for
> level IO-APIC interrupts: mask it until the handlers have been run.
> 
> > Would it be possible that Xen misses an unmask write, or fails to
> > call the vector if the interrupt is again pending at the time of the
> > unmask ?
> 
> Well, it should work properly, but we cannot rule anything out.

I did some more instrumentation from the NetBSD kernel, including dumping
the ioapic2 pin2 register.

At the time of the command timeout, the register value is 0x0000a067,
which, if I understand it properly, means that there's no interrupt
pending (bit IOAPIC_REDLO_RIRR, 0x00004000, is not set).
From the NetBSD ddb, I can dump this register multiple times, waiting
several seconds in between, and it doesn't change.
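
For reference, the low word of a redirection entry can be decoded with a few
masks. This is only a sketch: apart from the RIRR value 0x00004000 quoted
above, the macro and function names are illustrative (they follow the 82093AA
bit layout and are not copied from the NetBSD or Xen headers). With it,
0x0000a067 decodes as vector 0x67 (103, the idt entry mentioned earlier in the
thread), level-triggered, active-low, unmasked, remote IRR clear.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Low 32 bits of an I/O APIC redirection entry (82093AA layout).
 * Names are illustrative; only the RIRR value comes from the message.
 */
#define REDLO_VECTOR_MASK 0x000000ffu /* bits 0-7: interrupt vector */
#define REDLO_ACTLO       0x00002000u /* bit 13: polarity, active low */
#define REDLO_RIRR        0x00004000u /* bit 14: remote IRR */
#define REDLO_LEVEL       0x00008000u /* bit 15: level-triggered */
#define REDLO_MASKED      0x00010000u /* bit 16: interrupt masked */

static inline unsigned redlo_vector(uint32_t redlo)
{
    return redlo & REDLO_VECTOR_MASK;
}

/* Remote IRR is set once a level interrupt has been accepted, until EOI. */
static inline bool redlo_rirr(uint32_t redlo)
{
    return (redlo & REDLO_RIRR) != 0;
}

static inline bool redlo_level(uint32_t redlo)
{
    return (redlo & REDLO_LEVEL) != 0;
}

static inline bool redlo_actlo(uint32_t redlo)
{
    return (redlo & REDLO_ACTLO) != 0;
}

static inline bool redlo_masked(uint32_t redlo)
{
    return (redlo & REDLO_MASKED) != 0;
}

/* Print one decoded entry on a single line. */
static inline void redlo_print(uint32_t redlo)
{
    printf("%08" PRIx32 ": vector %u %s %s %s rirr=%d\n",
        redlo, redlo_vector(redlo),
        redlo_level(redlo) ? "level" : "edge",
        redlo_actlo(redlo) ? "active-low" : "active-high",
        redlo_masked(redlo) ? "masked" : "unmasked",
        redlo_rirr(redlo) ? 1 : 0);
}
```

Calling redlo_print(0x0000a067) and redlo_print(0x00010000) shows the
difference between the stalled pin and an unused, masked pin.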
Now if I call ioapic_dump_raw() from the debugger, which triggers some
Xen printfs:
db{0}> call ioapic_dump_raw^M
Register dump of ioapic0^M
[ 203.5489060] 00 08000000 00170011 08000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 3
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 4
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 5
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 6
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 7
 00000000^M
[ 203.5489060] 08(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 8
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 9
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel a
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel b
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel c
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel d
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel e
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel f
 00000000^M
[ 203.5489060] 10 00010000 00000000 00010000 00000000 00010000 00000000 
00010000 00000000^M
[...]
[ 203.5489060] Register dump of ioapic2^M
[ 203.5489060] 00 0a000000 00070011 0a000000(XEN) vioapic.c:124:d0v0 
apic_mem_readl:undefined ioregsel 3
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 4
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 5
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 6
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 7
 00000000^M
[ 203.5489060] 08(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 8
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel 9
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel a
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel b
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel c
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel d
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel e
 00000000(XEN) vioapic.c:124:d0v0 apic_mem_readl:undefined ioregsel f
 00000000^M
[ 203.5489060] 10 00010000 00000000 00010000 00000000 0000e067 00000000 
00010000 00000000^M

then the register switches to 0x0000e067, with the IOAPIC_REDLO_RIRR bit set.
From here, if I continue from ddb, the dom0 boots.

I can get the same effect by just typing ^A^A^A, so my guess is that it's
not the access to the ioapic's registers that changes the IOAPIC_REDLO_RIRR
bit, but the Xen printf. Also, from NetBSD, using a dump function which
doesn't access undefined registers, and so doesn't trigger Xen printfs,
doesn't change the IOAPIC_REDLO_RIRR bit either.
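
To illustrate what "doesn't access undefined registers" means here, the sketch
below lists the register indices such a dump would touch: ID, version and
arbitration (0x00 to 0x02, which the log above shows being read without a
warning), plus two words per pin starting at 0x10, with the pin count taken
from bits 16-23 of the version register. It is not the actual NetBSD dump
function; all names are hypothetical and the real register reads are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Indirect register indices (82093AA layout); names are illustrative. */
#define IOAPIC_REG_ID         0x00u
#define IOAPIC_REG_VER        0x01u
#define IOAPIC_REG_ARB        0x02u
#define IOAPIC_REG_REDIR_BASE 0x10u

/* Bits 16-23 of the version register: index of the last redirection entry. */
static inline unsigned ioapic_max_redir(uint32_t ver)
{
    return (ver >> 16) & 0xffu;
}

/* Register index of the low word of the redirection entry for a pin. */
static inline unsigned ioapic_redlo_reg(unsigned pin)
{
    return IOAPIC_REG_REDIR_BASE + 2 * pin;
}

/*
 * Fill regs[] (capacity cap) with the indices that are defined for an
 * I/O APIC reporting version register value ver. Returns the number of
 * indices stored, so a dump loop never selects indices 0x03 to 0x0f.
 */
static inline size_t ioapic_defined_regs(uint32_t ver, unsigned *regs,
    size_t cap)
{
    const unsigned fixed[] = {
        IOAPIC_REG_ID, IOAPIC_REG_VER, IOAPIC_REG_ARB
    };
    unsigned npins = ioapic_max_redir(ver) + 1;
    size_t n = 0;

    for (size_t i = 0; i < sizeof(fixed) / sizeof(fixed[0]) && n < cap; i++)
        regs[n++] = fixed[i];

    for (unsigned pin = 0; pin < npins; pin++) {
        /* Each pin has a low and a high 32-bit word. */
        for (unsigned half = 0; half < 2 && n < cap; half++)
            regs[n++] = ioapic_redlo_reg(pin) + half;
    }
    return n;
}
```

With the values from the dump above, ioapic2 (version 0x00070011) has 8 pins,
so indices 0x10 to 0x1f, and its pin 2 low word is at index 0x14, which is
where 0000e067 appears.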

-- 
Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>
     NetBSD: 26 ans d'experience feront toujours la difference