
Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads




> On Apr 7, 2020, at 5:50 PM, Julien Grall <julien@xxxxxxx> wrote:
> 
> 
> 
> On 07/04/2020 17:16, George Dunlap wrote:
>>> On Apr 6, 2020, at 7:47 PM, Julien Grall <julien@xxxxxxx> wrote:
>>> 
>>> On 06/04/2020 18:58, George Dunlap wrote:
>>>>> On Apr 3, 2020, at 9:27 PM, Julien Grall <julien.grall.oss@xxxxxxxxx> 
>>>>> wrote:
>>>>> 
>>>>> On Fri, 3 Apr 2020 at 20:41, Stefano Stabellini <sstabellini@xxxxxxxxxx> 
>>>>> wrote:
>>>>>> 
>>>>>> On Thu, 2 Apr 2020, Julien Grall wrote:
>>>>>>> As we discussed on Tuesday, the cost for other vCPUs is only going to 
>>>>>>> be a
>>>>>>> trap to the hypervisor and then back again. The cost is likely smaller 
>>>>>>> than
>>>>>>> receiving and forwarding an interrupt.
>>>>>>> 
>>>>>>> You actually agreed on this analysis. So can you enlighten me as to why
>>>>>>> receiving an interrupt is not a problem for latency but this is?
>>>>>> 
>>>>>> My answer was that the difference is that an operating system can
>>>>>> disable interrupts, but it cannot disable receiving this special IPI.
>>>>> 
>>>>> An OS can *only* disable its own interrupts, yet interrupts will still
>>>>> be received by Xen even if interrupts are masked at the processor
>>>>> (e.g. local_irq_disable()).
>>>>> 
>>>>> You would need to disable interrupts one by one at the GIC level (using
>>>>> ICENABLER) in order not to receive any interrupts. Yet Xen may still
>>>>> receive interrupts for operational purposes (e.g. serial, console,
>>>>> maintenance IRQ...). So traps will happen.
>>>> I think Stefano’s assertion is that the users he has in mind will be
>>>> configuring the system such that RT workloads receive as few
>>>> hypervisor-related interrupts as possible.  On a 4-core system, you could
>>>> have non-RT workloads running on cores 0-1, and RT workloads running with
>>>> the NULL scheduler on cores 2-3.  In such a system, you’d obviously
>>>> arrange that serial and maintenance IRQs are delivered to cores 0-1.
>>> Well, maintenance IRQs are per-pCPU, so you can't route them to another one...
>>> 
>>> But I think you missed my point that local_irq_disable() from the guest
>>> will not prevent the hypervisor from receiving interrupts, *even* the ones
>>> routed to the vCPU itself. They will just not be delivered to the guest
>>> context until local_irq_enable() is called.
>> My understanding from Stefano was that what his customers are concerned
>> about is the time between when a physical IRQ is delivered to the guest
>> and when the guest OS can respond appropriately.  The key thing here isn’t
>> necessarily speed, but predictability: system designers need to know that,
>> with high probability, their interrupt routines will complete within X
>> cycles.
>> Further interrupts delivered to a guest are not a problem in this scenario, 
>> if the guest can disable them until the critical IRQ has been handled.
> 
> You keep saying a guest can disable interrupts, but it can't do that via 
> local_irq_disable(). So what method are you thinking of? Disabling at the 
> GIC level? That involves traps and is most likely not going to help with 
> predictability...

So you’ll have to forgive me for making educated guesses here, as I’m trying to 
collect all the information.  On x86, if you use device pass-through on a 
system with a virtualized APIC and posted interrupts, then when the device 
generates interrupts, those are delivered directly to the guest without 
involving Xen; and when the guest disables interrupts in the vAPIC, those 
interrupts are held pending and delivered when the guest re-enables 
interrupts.  Given what Stefano said about disabling interrupts, I assumed 
that ARM had the same sort of functionality.  Is that not the case?
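
To make sure we’re talking about the same thing, here is my mental model of the 
two levels at which an arm64 guest can “disable interrupts”, as I understand 
Julien’s point.  This is only an illustrative sketch based on my reading of the 
GIC spec, not Xen code, and none of it is part of the patch under discussion:

    #include <stdint.h>

    /*
     * Level 1: masking at the CPU.  This is what the guest's
     * local_irq_disable() boils down to; it only sets PSTATE.I for that
     * vCPU.  A physical interrupt targeting the pCPU still traps to the
     * hypervisor at EL2; it is merely not delivered into the guest until
     * the guest unmasks again.
     */
    static inline void guest_cpu_mask_irqs(void)
    {
        asm volatile("msr daifset, #2" ::: "memory");
    }

    /*
     * Level 2: disabling an individual interrupt at the distributor via
     * GICD_ICENABLERn (byte offset 0x180 + 4*n).  With a trap-and-emulate
     * vGIC this MMIO write itself traps to the hypervisor, which I take
     * to be Julien's point about it not helping predictability either.
     */
    static inline void guest_gicd_disable_irq(volatile uint32_t *gicd,
                                              unsigned int irq)
    {
        gicd[(0x180 / 4) + (irq / 32)] = 1u << (irq % 32);
    }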

>> Xen-related IPIs, however, could potentially cause a problem if not
>> mitigated. Consider a guest where vcpu 1 loops over the register, while
>> vcpu 2 is handling a latency-critical IRQ.  A naive implementation might
>> send an IPI every time vcpu 1 does a read, spamming vcpu 2 with dozens of
>> IPIs.  Then an IRQ routine which normally finishes reliably, well within
>> the required time, suddenly overruns and causes an issue.
> 
> I never suggested the naive implementation would be perfect. That's why I 
> said it can be optimized...

It didn’t seem to me that you understood what Stefano’s concerns were, so I was 
trying to explain the situation he is trying to avoid (as well as making sure 
that I had a clear understanding myself).  The reason I said “a naive 
implementation” was to make clear that I knew that’s not what you were 
suggesting. :-)
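
To make that scenario concrete, the shape of implementation I had in mind when 
I wrote “naive” is roughly the following.  This is purely illustrative and 
written against my loose recollection of Xen’s vGIC; vcpu_kick_sync_lrs() and 
read_cached_active_bits() are invented names, not existing functions:

    /*
     * Illustrative only; not the proposed patch.  Assumes Xen's
     * <xen/sched.h> context (struct domain, struct vcpu, current,
     * for_each_vcpu).  On every guest read of GICD_ISACTIVER<n>,
     * interrupt every other vCPU of the domain so that it exits and
     * writes the active state of its list registers back into the
     * emulated vGIC state, then report that state to the reader.
     */
    static uint32_t vgic_isactiver_read_naive(struct domain *d, unsigned int n)
    {
        struct vcpu *v;

        for_each_vcpu ( d, v )
        {
            if ( v == current )
                continue;
            /*
             * This is the IPI that can land in the middle of a
             * latency-critical IRQ routine on another vCPU, and that a
             * tight loop on vcpu 1 can repeat dozens of times.
             */
            vcpu_kick_sync_lrs(v);
        }

        return read_cached_active_bits(d, n);
    }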

>> I don’t know what maintenance IRQs are, but if they only happen
>> intermittently, it’s possible that you’d never get more than a single one
>> in a latency-critical IRQ routine; and as such, the variability in
>> execution time (jitter) wouldn’t be an issue in practice.  But every time
>> you add a new unblockable IPI, you increase this jitter, particularly if
>> that unblockable IPI might be repeated an arbitrary number of times.
>> (Stefano, let me know if I’ve misunderstood something.)
>> So stepping back a moment, here’s all the possible ideas that I think have 
>> been discussed (or are there implicitly) so far.
>> 1. [Default] Do nothing; guests using this register continue crashing
>> 2. Make the I?ACTIVER registers RZWI (read-as-zero / write-ignored).
>> 3. Make I?ACTIVER return the most recent known value; i.e. KVM’s current 
>> behavior (as far as we understand it)
>> 4. Use a simple IPI with do_noop to update I?ACTIVER
>> 4a.  Use an IPI, but come up with clever tricks to avoid interrupting guests 
>> handling IRQs.
>> 5. Trap to Xen on guest EOI, so that we know when the active state changes
>> 6. Some clever paravirtualized option
> 
> 7. Use an IPI if we are confident the interrupts may be active.

I don’t understand this one.  How is it different from 4 or 4a?  And in 
particular, how does it evaluate on the “how much additional design work would 
it take” criterion?
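
For what it’s worth, my best guess at the difference is something like the 
sketch below, where the IPI is only sent to vCPUs whose cached state suggests 
the interrupt may actually be active there, rather than unconditionally as in 
4.  But that is a guess, and vcpu_may_have_active(), like the other helpers I 
sketched above, is an invented name:

    /*
     * My guess at (7), for discussion only: skip the IPI for vCPUs that
     * cannot have the interrupt active according to the emulated state,
     * and only then fall back to kicking them as in (4).
     */
    static uint32_t vgic_isactiver_read_option7(struct domain *d, unsigned int n)
    {
        struct vcpu *v;

        for_each_vcpu ( d, v )
        {
            if ( v == current || !vcpu_may_have_active(v, n) )
                continue;
            vcpu_kick_sync_lrs(v);
        }

        return read_cached_active_bits(d, n);
    }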

 -George