Re: [Xen-devel] Ideas Re: [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling
>>> On 08.03.16 at 15:42, <George.Dunlap@xxxxxxxxxxxxx> wrote:
> On Tue, Mar 8, 2016 at 1:10 PM, Wu, Feng <feng.wu@xxxxxxxxx> wrote:
>>> -----Original Message-----
>>> From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx]
> [snip]
>>> It seems like there are a couple of ways we could approach this:
>>>
>>> 1. Try to optimize the reverse look-up code so that it's not a linear
>>> linked list (getting rid of the theoretical fear)
>>
>> Good point.
>>
>>>
>>> 2. Try to test engineered situations where we expect this to be a
>>> problem, to see how big of a problem it is (proving the theory to be
>>> accurate or inaccurate in this case)
>>
>> Maybe we can run an SMP guest with all the vcpus pinned to a dedicated
>> pCPU, run some benchmark in the guest with VT-d PI and without VT-d PI,
>> and then see the performance difference between these two scenarios.
>
> This would give us an idea what the worst-case scenario would be.

How would a single VM ever give us an idea about the worst case?
Something getting close to worst case is a ton of single-vCPU guests all
temporarily pinned to one and the same pCPU (they could be multi-vCPU
ones, but the more vCPU-s they have, the more artificial this pinning
would become) right before they go into blocked state (i.e. through one
of the two callers of arch_vcpu_block()), the pinning removed while
blocked, and then all getting woken at once.

> But
> pinning all vcpus to a single pcpu isn't really a sensible use case we
> want to support -- if you have to do something stupid to get a
> performance regression, then as far as I'm concerned it's not a
> problem.
>
> Or to put it a different way: If we pin 10 vcpus to a single pcpu and
> then pound them all with posted interrupts, and there is *no*
> significant performance regression, then that will conclusively prove
> that the theoretical performance regression is of no concern, and we
> can enable PI by default.

The point isn't the pinning. The point is what pCPU they're on when
going to sleep. And that could involve quite a few more than just 10
vCPU-s, provided they all sleep long enough. And the "theoretical
performance regression is of no concern" is also not a proper way of
looking at it, I would say: Even if such a situation would happen
extremely rarely, if it can happen at all, it would still be a security
issue.

> On the other hand, if we pin 10 vcpus to a single pcpu, pound them all
> with posted interrupts, and then there *is* a significant performance
> regression, then it would still not convince me there is a real
> problem to be solved. There is only actually a problem if the "long
> chain of vcpus" can happen in the course of a semi-realistic use-case.
>
> Suppose we had a set of SRIOV NICs with 10-20 virtual functions total,
> assigned to 10-20 VMs, and those VMs in a cpupool confined to a single
> socket of about 4 cores; and then we do a really network-intensive
> benchmark. That's a *bit* far-fetched, but it's something that might
> conceivably happen in the real world without any deliberate stupidity.
> If there are no significant performance issues in that case, I would
> think we can say that posted interrupts are robust enough to be
> enabled by default.
>
>>> 3. Turn the feature on by default as soon as the 4.8 window opens up,
>>> perhaps with some sort of a check that runs when in debug mode that
>>> looks for the condition we're afraid of happening and BUG()s. If we run
>>> a full development cycle without anyone hitting the bug in testing, then
>>> we just leave the feature on.
>>
>> Maybe we can pre-define a max acceptable length for the list; if it really
>> reaches that number, print out a warning or something like that. However,
>> how to decide the max length is a problem. May need more thinking.
>
> I think we want to measure the amount of time spent in the interrupt
> handler (or with interrupts disabled). It doesn't matter if the list
> is 100 items long, if it can be handled in 500us. On the other hand,
> if a list of 4 elements takes 20ms, there's a pretty massive problem.
> :-)

Spending on the order of 500us in an interrupt handler would already
seem pretty long to me, especially when the interrupt may get raised at
a high frequency. Even more so if, when in that state, _each_
invocation of the interrupt handler would take that long: With an (imo
not unrealistic) interrupt rate of 1kHz we would spend half of the
available CPU time in that handler.

> I don't have a good idea what an unreasonably large number would be here --
> Jan?

Neither do I, unfortunately.

Jan
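
To make George's question concrete ("does a 100-entry list take 500us,
or does a 4-entry list take 20ms?"), here is a minimal, self-contained
micro-benchmark sketch. It is ordinary user-space C, not Xen code and
not code from this patch series; the structure and field names
(struct blocked_vcpu, pi_on) are invented for illustration. It simply
walks a singly linked list of N mock blocked-vCPU entries, testing a
flag on each entry the way a PI wakeup handler would test a descriptor's
ON bit, and reports how long the walk takes.

    /* Build and time a linear walk over N mock "blocked vCPU" entries. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    struct blocked_vcpu {
        struct blocked_vcpu *next;
        volatile int pi_on;      /* stand-in for the PI descriptor's ON bit */
        int vcpu_id;
    };

    static double walk_ns(struct blocked_vcpu *head)
    {
        struct timespec t0, t1;
        int woken = 0;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for ( struct blocked_vcpu *v = head; v; v = v->next )
            if ( v->pi_on )      /* a real handler would unblock the vCPU here */
                woken++;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        if ( woken < 0 )         /* keep 'woken' live; never triggers */
            abort();

        return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    }

    int main(void)
    {
        for ( int n = 10; n <= 10000; n *= 10 )
        {
            struct blocked_vcpu *head = NULL;

            for ( int i = 0; i < n; i++ )
            {
                struct blocked_vcpu *v = calloc(1, sizeof(*v));
                v->vcpu_id = i;
                v->pi_on = !(i % 7);   /* pretend some vCPUs have pending interrupts */
                v->next = head;
                head = v;
            }

            printf("%5d entries: %.0f ns per walk\n", n, walk_ns(head));

            while ( head )
            {
                struct blocked_vcpu *v = head;
                head = head->next;
                free(v);
            }
        }

        return 0;
    }

Of course this only exercises the pure list traversal; in the real
handler each hit also pays for the wakeup work itself, so the numbers
are a lower bound, not a measurement of the actual code path.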
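Along the same lines, here is a rough sketch of the two monitoring ideas
from the thread: Feng's "warn when the list exceeds a pre-defined
maximum" and George's "measure the time spent in the handler". Again
this is only an illustration in plain C rather than code from the patch
series; the hook names and both limits are made up, and in particular
the 64-entry / 500us figures are placeholders, not tuned values.

    #include <stdint.h>
    #include <stdio.h>

    #define PI_LIST_WARN_LEN    64        /* example threshold, not a tuned value */
    #define PI_HANDLER_WARN_NS  500000    /* 500us, the figure from the thread */

    static unsigned int pi_blocked_count;
    static int warned_len, warned_time;

    /* Would be called where a vCPU is added to the per-pCPU blocked list. */
    static void pi_list_insert_hook(void)
    {
        if ( ++pi_blocked_count > PI_LIST_WARN_LEN && !warned_len )
        {
            warned_len = 1;
            fprintf(stderr, "PI: blocked-vCPU list length %u exceeds %u\n",
                    pi_blocked_count, PI_LIST_WARN_LEN);
        }
    }

    /* Would be called where a vCPU is removed from the list. */
    static void pi_list_remove_hook(void)
    {
        if ( pi_blocked_count )
            pi_blocked_count--;
    }

    /* Would be called with timestamps taken around one handler invocation. */
    static void pi_handler_check(uint64_t entry_ns, uint64_t exit_ns)
    {
        if ( exit_ns - entry_ns > PI_HANDLER_WARN_NS && !warned_time )
        {
            warned_time = 1;
            fprintf(stderr, "PI: wakeup handler took %llu ns (list length %u)\n",
                    (unsigned long long)(exit_ns - entry_ns), pi_blocked_count);
        }
    }

    int main(void)
    {
        /* Toy driver: pretend 100 vCPUs block and one handler run takes 750us. */
        for ( int i = 0; i < 100; i++ )
            pi_list_insert_hook();
        pi_handler_check(0, 750000);
        for ( int i = 0; i < 100; i++ )
            pi_list_remove_hook();

        return 0;
    }

The warn-once flags are just to keep a debug build from flooding the
log; how the limits would actually be chosen is exactly the open
question above.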