
Re: [Xen-devel] Ideas Re: [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling

On Tue, Mar 8, 2016 at 1:10 PM, Wu, Feng <feng.wu@xxxxxxxxx> wrote:
>> -----Original Message-----
>> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
>> It seems like there are a couple of ways we could approach this:
>> 1. Try to optimize the reverse look-up code so that it's not a linear
>> linked list (getting rid of the theoretical fear)
> Good point.
>> 2. Try to test engineered situations where we expect this to be a
>> problem, to see how big of a problem it is (proving the theory to be
>> accurate or inaccurate in this case)
> Maybe we can run a SMP guest with all the vcpus pinned to a dedicated
> pCPU, we can run some benchmark in the guest with VT-d PI and without
> VT-d PI, then see the performance difference between these two scenarios.

This would give us an idea what the worst-case scenario would be.  But
pinning all vcpus to a single pcpu isn't really a sensible use case we
want to support -- if you have to do something stupid to get a
performance regression, then as far as I'm concerned it's not a
regression we need to worry about.

Or to put it a different way: If we pin 10 vcpus to a single pcpu and
then pound them all with posted interrupts, and there is *no*
significant performance regression, then that will conclusively prove
that the theoretical performance regression is of no concern, and we
can enable PI by default.

On the other hand, if we pin 10 vcpus to a single pcpu, pound them all
with posted interrupts, and then there *is* a significant performance
regression, then it would still not convince me there is a real
problem to be solved.  There is only actually a problem if the "long
chain of vcpus" can happen in the course of a semi-realistic use-case.

Suppose we had a set of SRIOV NICs with 10-20 virtual functions total,
assigned to 10-20 VMs, and those VMs in a cpupool confined to a single
socket of about 4 cores; and then we do a really network-intensive
benchmark. That's a *bit* far-fetched, but it's something that might
conceivably happen in the real world without any deliberate stupidity.
If there are no significant performance issues in that case, I would
think we can say that posted interrupts are robust enough to be
enabled by default.

>> 3. Turn the feature on by default as soon as the 4.8 window opens up,
>> perhaps with some sort of a check that runs when in debug mode that
>> looks for the condition we're afraid of happening and BUG()s.  If we run
>> a full development cycle without anyone hitting the bug in testing, then
>> we just leave the feature on.
> Maybe we can pre-define a maximum acceptable length for the list; if it
> really reaches that number, print out a warning or something like that.
> However, how to decide the max length is a problem. May need more thinking.

I think we want to measure the amount of time spent in the interrupt
handler (or with interrupts disabled).  It doesn't matter if the list
is 100 items long, if it can be handled in 500us.  On the other hand,
if a list of 4 elements takes 20ms, there's a pretty massive problem.

I don't have a good idea what an unreasonably large number would be here -- Jan?
