[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1



On 22.01.2021 10:53, Roger Pau Monné wrote:
> On Fri, Jan 22, 2021 at 10:02:15AM +0100, Jan Beulich wrote:
>> On 15.01.2021 15:28, Roger Pau Monne wrote:
>>> When pins are cleared from either ISR or IRR as part of the
>>> initialization sequence forward the clearing of those pins to the dpci
>>> EOI handler, as it is equivalent to an EOI. Not doing so can bring the
>>> interrupt controller state out of sync with the dpci handling logic,
>>> that expects a notification when a pin has been EOI'ed.
>>
>> The question though is what this clearing of ISR and some of
>> IRR during ICW1 is based upon: Going through various manuals
>> (up-to-date from Nov 2020, intermediate, and all the way
>> through to an old hard copy from 1993) I can't find a single
>> mention of ICW1 having any effect on ISR or IRR, despite both
>> soft copy ones being detailed about other state getting
>> changed.
>>
>> There is "Following initialization, an interrupt request (IRQ)
>> input must make a low-to-high transition to generate an
>> interrupt", but I'm afraid this can be read to mean different
>> things. And if it was meant to describe an effect on ISR
>> and/or IRR, it would imo be in conflict with us keeping IRR
>> bits of level triggered interrupts.
> 
> I have an old copy of the 8259A spec, and it does mention the same
> quote that you have above. I also wondered while I was adjusting this
> code whether what we do is fine. I have to admit I haven't considered
> changing this logic much because I don't have an effective way to test
> it.
> 
> I've also taken a look at what others do, QEMU for example will do
> exactly the same logic as Xen during ICW1, bhyve OTOH will fully zero
> IRR and leave ISR as is.
> 
> Doing a bit of archaeology in QEMU I've found the following change:
> 
> commit aa24822bdc7c4e74afbc6fa1324b01cf067da7cb
> Author: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
> Date:   Tue Jan 24 16:29:29 2012 +0100
> 
>     i8259: Do not clear level-triggered lines in IRR on init
> 
>     When an input line is handled as level-triggered, it will immediately
>     raise an IRQ on the output of a PIC again that goes through an init
>     reset. So only clear the edge-triggered inputs from IRR in that
>     scenario.
> 
>     Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>     Signed-off-by: Anthony Liguori <aliguori@xxxxxxxxxx>
> 
> Which seems to point out there's a reasoning behind what it's
> currently implemented. This seems to be against the spec as there's no
> low-to-high transition?

Would seem so to me, yes. Knowing how hard it was back at the time
to find any doc on the 8259A at all, and knowing that the doc that
initially became available wasn't very complete, I could very well
imagine the present doc still being incomplete. Let me see if I
can get something out of Intel.

>>> @@ -217,6 +219,24 @@ static void vpic_ioport_write(
>>>              }
>>>  
>>>              vpic->init_state = ((val & 3) << 2) | 1;
>>> +            vpic_update_int_output(vpic);
>>> +            vpic_unlock(vpic);
>>> +
>>> +            /*
>>> +             * Forward the EOI of any pending or in service interrupt that 
>>> has
>>> +             * been cleared from IRR or ISR, or else the dpci logic will 
>>> get
>>> +             * out of sync with the state of the interrupt controller.
>>> +             */
>>> +            while ( pending )
>>> +            {
>>> +                unsigned int pin = __scanbit(pending, 8);
>>> +
>>> +                ASSERT(pin < 8);
>>> +                hvm_dpci_eoi(current->domain,
>>> +                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : 
>>> pin));
>>> +                __clear_bit(pin, &pending);
>>> +            }
>>> +            goto unmask;
>>
>> A similar consideration applies here (albeit just as an
>> observation, as being orthogonal to your change): A PIC
>> becomes ready for handling interrupts only at the end of the
>> ICWx sequence. Hence it would seem to me that invoking
>> pt_may_unmask_irq()
> 
> Right, it might be best to force unmask = 1 when init_state gets set
> to 0. A spurious call to pt_may_unmask_irq won't be harmful anyway,
> even if no pins have been actually unmasked.

Right, that's what I was considering too. (Really I was
thinking to propagate the "unmasked" value from ICW1
processing to when init_state gets cleared, but you're
right about a stray call to pt_may_unmask_irq() not doing
any harm.)

>> (maybe also vpic_update_int_output()) at
>> ICW1 is premature. To me this seems particularly relevant
>> considering that ICW1 clears IMR, and hence an interrupt
>> becoming pending between ICW1 and ICW2 wouldn't know which
>> vector to use.
>>
>> Or maybe on that side of things, vpic_update_int_output()
>> should really do
>>
>>     if ( vpic->int_output == (!vpic->init_state && irq >= 0) )
>>         return;
>>
>>     /* INT line transition L->H or H->L. */
>>     vpic->int_output = !vpic->init_state && !vpic->int_output;
>>
>> ?
> 
> So to force int_output = false when in init state. Seems reasonable, I
> can implement this either as a pre-patch or a followup, but again I'm
> not sure I have means to properly test it.

That testing consideration goes as well for the correctness
of what we have right now. It functions for the very limited
set of modes typical OSes use, and likely nobody really
knows whether it would also function correctly when used in
"exotic" ways.

As to pre-patch or follow-up: For this latter aspect it
probably doesn't matter much. If any adjustment is made
affecting the code further up, it may want to be a pre-patch
though, as it may change how you want to exit (e.g. I'd
expect the "goto" to possibly go away).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.