[Xen-devel] [RFC PATCH v1] Replace tasklets with per-cpu implementation.
Hey,

With the Xen 4.5 feature freeze right on the doorstep I am not expecting this to go in, as:
1) It touches core code.
2) It has never been tested on ARM.
3) It is an RFC for right now.

With those expectations out of the way, I am submitting for review an overhaul of the tasklet code.

We found on one large machine with a small number of guests (12) that the steal time for idle guests was excessively high. Further debugging revealed that the global tasklet lock was taken across all sockets at an excessively high rate - to the point that 1/10th of a guest's idle time was consumed (and actually accounted for as being in the RUNNING state!).

The ideal situation to reproduce this behavior is:
1) Allocate twelve guests with one to four SR-IOV VFs each.
2) Have half of them (six) heavily use the SR-IOV VF devices.
3) Monitor the rest (which are idle) and despair.

As I discovered under the hood, we have two tasklets that are scheduled and executed quite often: the VIRQ_TIMER one (assert_evtchn_irq_tasklet), and the one in charge of injecting a PCI interrupt into the guest (hvm_do_IRQ_dpci). The hvm_do_IRQ_dpci one is the most often scheduled and run. The performance bottleneck comes from the fact that we take the same spinlock three times: in tasklet_schedule, when we are about to execute the tasklet, and when we are done executing the tasklet.

This patchset throws away the global list and lock for all tasklets. Instead there are two per-CPU lists: one for softirq context, and one run when the scheduler decides it. There is also a global list and lock for cross-CPU tasklet scheduling - which thankfully rarely happens (microcode updates and hypercall continuations). Insertion into and removal from the per-CPU lists is done with interrupts disabled - which are short bursts of time. The behavior of the code to only execute one tasklet per iteration is also preserved (the Linux code would run through all of its tasklets).
The performance benefits of this patch were astounding and removed the issues we saw. It also decreased the IRQ latency of delivering an interrupt to a guest.

In terms of the patchset, I chose a path in which:
0) The first patch fixes the performance bug we saw, and it is easy to backport.
1) It is bisectable.
2) If something breaks, it should be fairly easy to figure out which patch broke it.
3) It is split up in a bit of a weird fashion: scaffolding code is added to keep things ticking (as at some point we have both the old and the new implementation existing and in use), and then later removed. This is how Greg KH added kref and kobjects to the kernel a long time ago, and it worked - so I figured I would borrow from that workflow. I would appreciate feedback from the maintainers if they would like this to be organized better.

 xen/common/tasklet.c      | 305 +++++++++++++++++++++++++++++++++-------------
 xen/include/xen/tasklet.h |  52 +++++++-
 2 files changed, 271 insertions(+), 86 deletions(-)

Konrad Rzeszutek Wilk (5):
  tasklet: Introduce per-cpu tasklet for softirq.
  tasklet: Add cross CPU feeding of per-cpu tasklets.
  tasklet: Remove the old-softirq implementation.
  tasklet: Introduce per-cpu tasklet for schedule tasklet.
  tasklet: Remove the scaffolding.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel