Xen project Mailing List

Re: [Xen-devel] Interesting observation with network event notification and batching

To: Andrew Bennieston <andrew.bennieston@xxxxxxxxxx>

From: Ian Campbell <Ian.Campbell@xxxxxxxxxx>

Date: Mon, 17 Jun 2013 12:08:17 +0100

Cc: annie.li@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx, Wei Liu <wei.liu2@xxxxxxxxxx>, stefano.stabellini@xxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

Delivery-date: Mon, 17 Jun 2013 11:08:50 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Mon, 2013-06-17 at 11:56 +0100, Andrew Bennieston wrote:
> On 17/06/13 11:46, Wei Liu wrote:
> > On Mon, Jun 17, 2013 at 10:56:12AM +0100, Andrew Bennieston wrote:
> >> On 17/06/13 10:38, Ian Campbell wrote:
> >>> On Sun, 2013-06-16 at 10:54 +0100, Wei Liu wrote:
> >>>>>> Konrad, IIRC you once mentioned you discovered something with event
> >>>>>> notification, what's that?
> >>>>>
> >>>>> They were bizarre. I naively expected some form of # of physical NIC
> >>>>> interrupts to be around the same as the VIF or less. And I figured
> >>>>> that the amount of interrupts would be constant regardless of the
> >>>>> size of the packets. In other words #packets == #interrupts.
> >>>>>
> >>>>
> >>>> It could be that the frontend notifies the backend for every packet it
> >>>> sends. This is not desirable and I don't expect the ring to behave that
> >>>> way.
> >>
> >> I have observed this kind of behaviour during network performance
> >> tests in which I periodically checked the ring state during an iperf
> >> session. It looked to me like the frontend was sending notifications
> >> far too often, but that the backend was sending them very
> >> infrequently, so the Tx (from guest) ring was mostly empty and the
> >> Rx (to guest) ring was mostly full. This has the effect of both
> >> front and backend having to block occasionally waiting for the other
> >> end to clear or fill a ring, even though there is more data
> >> available.
> >>
> >> My initial theory was that this was caused in part by the shared
> >> event channel, however I expect that Wei is testing on top of a
> >> kernel with his split event channel features?
> >>
> >
> > Yes, with split event channels.
> >
> > And during tests the interrupt counts, frontend TX has 6 figures
> > interrupt number while frontend RX has 2 figures number.
> >
> >>>
> >>> It is probably worth checking that things are working how we think they
> >>> should. i.e. that netback's calls to RING_FINAL_CHECK_FOR_.. and
> >>> netfront's calls to RING_PUSH_..._AND_CHECK_NOTIFY are placed at
> >>> suitable points to maximise batching.
> >>>
> >>> Is the RING_FINAL_CHECK_FOR_REQUESTS inside the xen_netbk_tx_build_gops
> >>> loop right? This would push the req_event pointer to just after the last
> >>> request, meaning the next request enqueued by the frontend would cause a
> >>> notification -- even though the backend is actually still continuing to
> >>> process requests and would have picked up that packet without further
> >>> notification. In this case there is a fair bit of work left in the
> >>> backend for this iteration i.e. plenty of opportunity for the frontend
> >>> to queue more requests.
> >>>
> >>> The comments in ring.h say:
> >>>  * These macros will set the req_event/rsp_event field to trigger a
> >>>  * notification on the very next message that is enqueued. If you want to
> >>>  * create batches of work (i.e., only receive a notification after several
> >>>  * messages have been enqueued) then you will need to create a customised
> >>>  * version of the FINAL_CHECK macro in your own code, which sets the event
> >>>  * field appropriately.
> >>>
> >>> Perhaps we want to just use RING_HAS_UNCONSUMED_REQUESTS in that loop
> >>> (and other similar loops) and add a FINAL check at the very end?
> >>>
> >>>>> But it was odd and I didn't go deeper in it to figure out what
> >>>>> was happening. And also to figure out if for the VIF we could
> >>>>> do something of #packets != #interrupts. And hopefully some
> >>>>> mechanism to adjust so that the amount of interrupts would
> >>>>> be lesser per packets (hand waving here).
> >>>>
> >>>> I'm trying to do this now.
> >>>
> >>> What scheme do you have in mind?
> >>
> >> As I mentioned above, filling a ring completely appears to be almost
> >> as bad as sending too many notifications. The ideal scheme may
> >> involve trying to balance the ring at some "half-full" state,
> >> depending on the capacity for the front- and backends to process
> >> requests and responses.
> >>
> >
> > I don't think filling the ring full causes any problem, that's just
> > conceptually the same as "half-full" state if you need to throttle the
> > ring.
> My understanding was that filling the ring will cause the producer to
> sleep until slots become available (i.e. until the consumer notifies
> it that it has removed something from the ring).
>
> I'm just concerned that overly aggressive batching may lead to a
> situation where the consumer is sitting idle, waiting for a notification
> that the producer hasn't yet sent because it can still fill more slots
> on the ring. When the ring is completely full, the producer would have
> to wait for the ring to partially empty. At this point, the consumer
> would hold off notifying because it can still batch more processing, so
> the producer is left waiting. (Repeat as required). It would be better
> to have both producer and consumer running concurrently.
>
> I mention this mainly so that we don't end up with a swing to the polar
> opposite of what we have now, which (to my mind) is just as bad. Clearly
> this is an edge case, but if there's a reason I'm missing that this
> can't happen (e.g. after a period of inactivity) then don't hesitate to
> point it out :)

Doesn't the separation between req_event and rsp_event help here? So if
the producer fills the ring, it will sleep, but set rsp_event
appropriately so that when the backend completes some (but not all) work
it will be woken up so that it can put extra stuff on the ring.

It shouldn't need to wait for the backend to process the whole batch for
this.

>
> (Perhaps "half-full" was misleading... the optimal state may be "just
> enough room for one more packet", or something along those lines...)
>
> Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
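
To illustrate the req_event behaviour discussed in this thread, here is a
minimal single-threaded sketch in C. It is a toy model, not the real ring.h
or netback code: the struct and every toy_* / notifications_* name are
hypothetical, and only the index arithmetic is modelled on the
RING_PUSH_..._AND_CHECK_NOTIFY and RING_FINAL_CHECK_FOR_REQUESTS macros as
described above. It assumes a best-case interleaving in which the frontend
always queues its next packet while the backend is still busy, and it counts
how many notifications the frontend would send when the backend re-arms
req_event inside its loop versus once at the end of a pass.

```c
#include <assert.h>

/* Toy stand-in for the shared ring indices (hypothetical, not ring.h). */
struct toy_ring {
    unsigned int req_prod;  /* producer index, advanced by the frontend */
    unsigned int req_event; /* frontend notifies when req_prod passes this */
    unsigned int req_cons;  /* consumer index, advanced by the backend */
};

/* Frontend: publish n requests; return 1 if a notification is needed. */
static int toy_push_and_check_notify(struct toy_ring *r, unsigned int n)
{
    unsigned int old_prod = r->req_prod;
    unsigned int new_prod = old_prod + n;

    r->req_prod = new_prod;
    /* Notify only if req_event lies in (old_prod, new_prod]; wrap-safe. */
    return (new_prod - r->req_event) < (new_prod - old_prod);
}

/* Backend: is there work queued? Does not touch req_event. */
static int toy_has_unconsumed(const struct toy_ring *r)
{
    return r->req_prod != r->req_cons;
}

/* Backend: if idle, arm req_event for the very next request, then re-check. */
static int toy_final_check(struct toy_ring *r)
{
    if (toy_has_unconsumed(r))
        return 1;
    r->req_event = r->req_cons + 1;
    return toy_has_unconsumed(r);
}

static void toy_consume_one(struct toy_ring *r)
{
    assert(toy_has_unconsumed(r));
    r->req_cons++;
}

/* Backend uses the final check as its loop condition, so it re-arms
 * req_event every time the ring looks empty. */
unsigned int notifications_arm_every_iteration(unsigned int packets)
{
    struct toy_ring r = { 0, 1, 0 }; /* req_event = 1: armed at start */
    unsigned int notes = 0;

    for (unsigned int i = 0; i < packets; i++) {
        notes += toy_push_and_check_notify(&r, 1);
        while (toy_final_check(&r))
            toy_consume_one(&r);
    }
    return notes;
}

/* Backend loops on the plain "has unconsumed" test and arms req_event
 * only once, at the very end of the pass. */
unsigned int notifications_arm_at_end(unsigned int packets)
{
    struct toy_ring r = { 0, 1, 0 };
    unsigned int notes = 0, sent = 0;

    if (packets == 0)
        return 0;
    notes += toy_push_and_check_notify(&r, 1); /* wakes the backend */
    sent = 1;
    do {
        while (toy_has_unconsumed(&r)) {
            toy_consume_one(&r);
            /* Frontend queues more while the backend is still busy. */
            if (sent < packets) {
                notes += toy_push_and_check_notify(&r, 1);
                sent++;
            }
        }
    } while (toy_final_check(&r));
    return notes;
}
```

In this model, notifications_arm_every_iteration(8) returns 8 (one
notification per packet), while notifications_arm_at_end(8) returns 1. Real
workloads interleave less favourably, so the gap would likely be smaller in
practice.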

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.