Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness
On Wed, 2014-03-12 at 21:10 +0000, Zoltan Kiss wrote:
> On 12/03/14 17:43, Ian Campbell wrote:
> > On Wed, 2014-03-12 at 17:14 +0000, Zoltan Kiss wrote:
> >> On 12/03/14 15:37, Ian Campbell wrote:
> >>> On Wed, 2014-03-12 at 15:14 +0000, Zoltan Kiss wrote:
> >>>> On 12/03/14 14:30, Ian Campbell wrote:
> >>>>> On Wed, 2014-03-12 at 14:27 +0000, Zoltan Kiss wrote:
> >>>>>> On 12/03/14 10:28, Ian Campbell wrote:
> >>>>>>> On Tue, 2014-03-11 at 23:24 +0000, Zoltan Kiss wrote:
> >>>>>>>> On 11/03/14 15:44, Ian Campbell wrote:
> >>>>>>>
> >>>>>>>>> Is it the case that this macro considers a request to be
> >>>>>>>>> unconsumed if the *response* to a request is outstanding, as
> >>>>>>>>> well as if the request itself is still on the ring?
> >>>>>>>> I don't think that would make sense. I think everywhere this
> >>>>>>>> macro is called the caller is not interested in pending
> >>>>>>>> requests (pending means consumed but not responded to).
> >>>>>>>
> >>>>>>> It might be interested in such pending requests in some of the
> >>>>>>> pathological cases I allude to in the next paragraph though?
> >>>>>>>
> >>>>>>> For example, if the ring has unconsumed requests but there are
> >>>>>>> no slots free for a response, it would be better to treat it as
> >>>>>>> having no unconsumed requests until space opens up for a
> >>>>>>> response; otherwise something else just has to abort the
> >>>>>>> processing of the request when it notices the lack of space.
> >>>>>>>
> >>>>>>> (I'm totally speculating here BTW, I don't have any concrete
> >>>>>>> idea why things are done this way...)
> >>>>>>>
> >>>>>>>>> I wonder if this apparently weird construction is due to
> >>>>>>>>> pathological cases when one or the other end is not picking
> >>>>>>>>> up requests/responses? i.e. trying to avoid deadlocking the
> >>>>>>>>> ring or generating an interrupt storm when the ring is full
> >>>>>>>>> of one or the other, or something along those lines?
> >>>>>>
> >>>>>> Also, let me quote again my example about when rsp makes sense:
> >>>>>>
> >>>>>> "To clarify what this does, let me show an example:
> >>>>>> req_prod = 253
> >>>>>> req_cons = 256
> >>>>>> rsp_prod_pvt = 0
> >>>>>
> >>>>> I think to make sense of this I need to see the sequence of reads
> >>>>> and writes from both parties, in a sensible ordering, which would
> >>>>> result in reads showing the above, i.e. a demonstration of the
> >>>>> race, not just an assertion that if the values are read as-is
> >>>>> things make sense.
> >>>>
> >>>> Let me extend it:
> >>>>
> >>>> - callback reads req_prod = 253
> >>>
> >>> callback == backend? Which context is this code running in? Which
> >>> part of the system is the callback logically part of?
> >> Yes, it is part of the backend, the function which handles when we
> >> can release a slot back. With grant copy we don't have such a thing,
> >> but with mapping xenvif_zerocopy_callback does this (or in the
> >> classic kernel it had a different name, but we called it the page
> >> destructor). It can run from any context; it depends on who calls
> >> kfree_skb.
> >
> > I think this is the root of the problem. The PV protocols really
> > assume one entity on either end is moving/updating the ring
> > pointers. If you
> There is only _one_ entity moving/updating the ring pointers. Everyone
> else is just reading it: the callback, xenvif_tx_interrupt,
> xenvif_check_rx_xenvif.
Perhaps I should have said "accessing" rather than "moving/updating".
Even if only one entity is updating the ring, if two entities are
reading it then at least one of them is going to see inconsistencies.

> > In any case, it seems like doing the poke from the callback is wrong
> > and we should revert the patches which DaveM already applied and
> > revisit this aspect of things, do you agree?
> I've just sent in a patch to fix that.

Thanks, I'll take a look.

> I think the reason why I haven't seen any issue is that in this
> situation there are plenty of outstanding packets, and all of their
> callbacks will schedule NAPI again. Chances are quite small that the
> dealloc thread couldn't release enough slots in the meantime.

Seems plausible enough. That means the issue would be seen rarely and
on quiet systems, which would be a nightmare to diagnose I think.

Ian.
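For reference, the arithmetic being argued about above effectively returns min(req, rsp): "req" is the number of requests the frontend has produced that the backend has not yet consumed, and "rsp" is the number of slots still free for responses (ring size minus the pending, i.e. consumed-but-unresponded, requests). Below is a minimal standalone sketch of that logic, not the actual io/ring.h macro; the 256-slot ring size and the helper name are assumptions made for the worked example. Applied to the quoted values (req_prod = 253, req_cons = 256, rsp_prod_pvt = 0), the unsigned req difference underflows, but rsp is 0, so the result is still 0.

#include <stdio.h>
#include <stdint.h>

#define MODEL_RING_SIZE 256u            /* assumed ring size for the example */

typedef uint32_t RING_IDX;              /* free-running, wrapping ring index */

/* Sketch of the min(req, rsp) computation discussed in the thread. */
static unsigned int model_has_unconsumed_requests(RING_IDX req_prod,
                                                  RING_IDX req_cons,
                                                  RING_IDX rsp_prod_pvt)
{
    /* Requests produced by the frontend but not yet consumed here. */
    unsigned int req = req_prod - req_cons;
    /* Slots still available for responses: ring size minus the number of
     * pending requests (consumed but not yet responded to). */
    unsigned int rsp = MODEL_RING_SIZE - (req_cons - rsp_prod_pvt);

    return req < rsp ? req : rsp;
}

int main(void)
{
    /* Stale req_prod (253) sampled together with a newer req_cons (256):
     * req underflows to a huge unsigned value, rsp is 0 because all 256
     * slots are pending, so the minimum is 0 and the caller sees nothing
     * to do. */
    printf("%u\n", model_has_unconsumed_requests(253, 256, 0));   /* 0 */
    return 0;
}

On this reading, the rsp term is what keeps a stale or racy snapshot (and a ring full of pending requests) from being reported as work to do, which seems to be the point of the quoted example.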