
Re: [Xen-devel] [xen-unstable test] 118078: regressions - FAIL



> -----Original Message-----
[snip]
> What I meant was that I'd expect the guest to have interrupts disabled whilst
> poking the MSR to enable APIC assist on that CPU, since enabling APIC assist
> is clearly going to modify the way in which interrupts are handled. If that's
> not the case though then I guess that is probably the cause of the issue; I
> never really considered protecting interrupt handling against APIC assist
> being enabled on the same CPU.
> 

I think I've spotted it...

If the guest is indeed setting up APIC assist but not actually using it (i.e.
it still EOIs every interrupt normally), then I think we get into this
situation:

- On return to guest, vlapic_has_pending_irq() finds a bit set in the IRR, then
vlapic_ack_pending_irq() calls viridian_start_apic_assist(), which latches the
vector, sets the bit in the ISR and clears it from the IRR.
- The guest then processes the interrupt but EOIs it normally, therefore
clearing the bit in the ISR.
- On the next return to guest, vlapic_has_pending_irq() calls
viridian_complete_apic_assist(), which discovers the assist bit still set in
the page and therefore leaves the latched vector in place, but finds another
bit set in the IRR.
- vlapic_ack_pending_irq() is then called. The ISR is clear, so another call is
made to viridian_start_apic_assist(), and this then calls domain_crash() because
it finds the previously latched vector.

I think the correct solution to this is to call viridian_abort_apic_assist() in 
vlapic_EOI_set() when the vector is cleared from the ISR.
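
To make the sequence above easier to follow, here is a rough, standalone toy
model of it. This is not the Xen code: struct toy_vlapic, return_to_guest(),
guest_eoi() and run_scenario() are hypothetical names invented for
illustration, the IRR/ISR are shrunk to 32-bit maps, and the assist page is
reduced to a single flag. It only assumes the behaviour described in this mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy state, NOT the real Xen structures. Vector 0 means "nothing latched". */
struct toy_vlapic {
    uint32_t irr;        /* pending vectors, one bit per vector (1..31) */
    uint32_t isr;        /* in-service vectors */
    int latched_vector;  /* vector latched when the assist was started */
    bool assist_bit;     /* stand-in for the assist bit in the guest page */
    bool crashed;        /* stand-in for domain_crash() */
};

static int highest_bit(uint32_t map)
{
    for (int v = 31; v >= 1; v--)
        if (map & (1u << v))
            return v;
    return -1;
}

/* Modelled on the described viridian_start_apic_assist() behaviour. */
static void start_assist(struct toy_vlapic *s, int vector)
{
    if (s->latched_vector != 0) {   /* stale latch from an earlier interrupt */
        s->crashed = true;
        return;
    }
    s->latched_vector = vector;
    s->assist_bit = true;
}

/* Modelled on viridian_complete_apic_assist(): only completes if the guest
 * cleared the assist bit, otherwise the latch is left in place. */
static int complete_assist(struct toy_vlapic *s)
{
    if (s->latched_vector == 0 || s->assist_bit)
        return 0;
    int v = s->latched_vector;
    s->latched_vector = 0;
    return v;
}

/* Modelled on viridian_abort_apic_assist(). */
static void abort_assist(struct toy_vlapic *s)
{
    s->latched_vector = 0;
    s->assist_bit = false;
}

/* Rough stand-in for has_pending_irq + ack_pending_irq on return to guest. */
static void return_to_guest(struct toy_vlapic *s)
{
    int done = complete_assist(s);
    if (done)
        s->isr &= ~(1u << done);

    int vector = highest_bit(s->irr);
    if (vector < 0)
        return;

    if (s->isr == 0)
        start_assist(s, vector);
    if (s->crashed)
        return;

    s->isr |= 1u << vector;
    s->irr &= ~(1u << vector);
}

/* Guest EOIs normally (never touches the assist bit). abort_on_eoi models
 * the proposed fix of aborting the assist when the ISR bit is cleared. */
static void guest_eoi(struct toy_vlapic *s, bool abort_on_eoi)
{
    int vector = highest_bit(s->isr);
    if (vector < 0)
        return;
    s->isr &= ~(1u << vector);
    if (abort_on_eoi)
        abort_assist(s);
}

/* Replays the four steps; returns true if the toy "domain" crashed. */
bool run_scenario(bool abort_on_eoi)
{
    struct toy_vlapic s = {0};

    s.irr = 1u << 20;              /* first interrupt pending */
    return_to_guest(&s);           /* latches vector 20 */
    guest_eoi(&s, abort_on_eoi);   /* normal EOI, assist bit left set */
    s.irr |= 1u << 21;             /* second interrupt pending */
    return_to_guest(&s);           /* second start_assist() call */
    return s.crashed;
}
```

In this model run_scenario(false) returns true (the stale latch trips the
crash path), while run_scenario(true) returns false, which is the effect I'd
expect from aborting the assist on EOI. The real code paths have more
conditions than this sketch, so treat it as an illustration only.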

  Paul
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

