
Re: [Xen-devel] [PATCH 0/4] mitigate the per-pCPU blocking list may be too long



On Wed, Apr 26, 2017 at 05:39:57PM +0100, George Dunlap wrote:
>On 26/04/17 01:52, Chao Gao wrote:
>> VT-d PI introduces a per-pCPU blocking list to track the vCPUs that
>> blocked on that pCPU. Theoretically, there can be 32K domains on a
>> single host, with 128 vCPUs per domain. If all vCPUs are blocked on
>> the same pCPU, 4M vCPUs sit in the same list. Traversing such a list
>> consumes too much time. We have discussed this issue in [1,2,3].
>> 
>> To mitigate this issue, we proposed the following two methods [3]:
>> 1. Evenly distributing all the blocked vCPUs among all pCPUs.
>
>So you're not actually distributing the *vcpus* among the pcpus (which
>would imply some interaction with the scheduler); you're distributing
>the vcpu PI wake-up interrupt between pcpus.  Is that right?

Yes. I should have described things more clearly.
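
For reference, the list being traversed is the per-pCPU one in
xen/arch/x86/hvm/vmx/vmx.c. Roughly (quoting from memory, so the exact
field names may differ slightly):

    struct vmx_pi_blocking_vcpu {
        struct list_head list;
        spinlock_t       lock;
    };
    static DEFINE_PER_CPU(struct vmx_pi_blocking_vcpu, vmx_pi_blocking);

    /* pi_wakeup_interrupt() walks the whole list with the lock held,
     * which is what becomes expensive with millions of entries. */
    list_for_each_entry_safe ( vmx, tmp, blocked_vcpus, pi_blocking.list )
    {
        if ( pi_test_on(&vmx->pi_desc) )
        {
            list_del(&vmx->pi_blocking.list);
            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
        }
    }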

>
>Doesn't having a PI on a different pcpu than where the vcpu is running
>mean at least one IPI to wake up that vcpu?  If so, aren't we imposing a
>constant overhead on basically every single interrupt, as well as
>increasing the IPI traffic, in order to avoid a highly unlikely
>theoretical corner case?

If it incurs at least one more IPI, then I agree completely. I think it
depends on whether calling vcpu_unblock() to wake up a vCPU on a pCPU
other than the current one leads to one more IPI than waking a vCPU on
the current pCPU. Different schedulers may differ on this point.
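
To illustrate why I think it is scheduler-dependent: vcpu_unblock()
itself does not send any IPI; the IPI, if any, comes from the
scheduler's tickle logic. A simplified sketch of the path:

    /* xen/common/schedule.c (simplified) */
    void vcpu_unblock(struct vcpu *v)
    {
        if ( !test_and_clear_bit(_VPF_blocked, &v->pause_flags) )
            return;
        vcpu_wake(v);    /* ends up in SCHED_OP(wake, v) */
    }

e.g. with credit1, csched_vcpu_wake() -> __runq_tickle() may raise
SCHEDULE_SOFTIRQ on remote pCPUs via cpumask_raise_softirq(), and that
is where the extra IPI would or would not appear.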

>
>A general maxim in OS development is "Make the common case fast, and the
>uncommon case correct."  It seems like it would be better in the common
>case to have the PI vectors on the pcpu on which the vcpu is running,
>and only implement the balancing when the list starts to get too long.

Agreed. Distributing the PI wake-up interrupts among the pCPUs would
also increase spurious interrupts, I think. Anyhow, I will take your
advice and keep the wake-up vector local, only balancing when the list
grows too long, along the lines of the sketch below. Kevin gave similar
advice in the discussion that happened one year ago.
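
Something like this, then (a hypothetical sketch, not the actual patch;
the counter field, the threshold and the helper are all made up):

    /* Keep the PI wake-up vector on the local pCPU unless its
     * blocking list is already too long. */
    #define PI_LIST_LIMIT 128                 /* made-up threshold */

    static unsigned int pi_pick_dest_cpu(struct vcpu *v)
    {
        unsigned int cpu = v->processor;

        /* Common case: the local list is short; stay local. */
        if ( per_cpu(vmx_pi_blocking, cpu).counter < PI_LIST_LIMIT )
            return cpu;

        /* Uncommon case: spill over to a less loaded pCPU. */
        return pi_find_least_loaded_cpu();    /* hypothetical helper */
    }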

>
>Any chance you could trace how long the list traversal took?  It would
>be good for future reference to have an idea what kinds of timescales
>we're talking about.

Will do later.
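
Probably by wrapping the traversal in pi_wakeup_interrupt() with
something like this (a sketch; TRC_HVM_PI_LIST is a made-up trace
code):

    s_time_t t0 = NOW();
    unsigned int len = 0;

    list_for_each_entry_safe ( vmx, tmp, blocked_vcpus, pi_blocking.list )
    {
        len++;
        /* ... existing wake-up check ... */
    }

    /* Record the traversal time (in ns) and the list length. */
    TRACE_2D(TRC_HVM_PI_LIST, (uint32_t)(NOW() - t0), len);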

Thanks
Chao
