Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
On 28/09/2018 00:03, Sander Eikelenboom wrote:
> On 27/09/18 23:48, Boris Ostrovsky wrote:
>> On 9/27/18 5:37 PM, Jens Axboe wrote:
>>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>>>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>>>> 20-30 minutes.
>>>>>>>>>>
>>>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>>>> grant ref.
>>>>>>>>>>
>>>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
>>>>>>>>> Reviewed-by: Juergen Gross <jgross@xxxxxxxx>
>>>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>>>
>>>>>>> Hi Boris/Juergen.
>>>>>>>
>>>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch
>>>>>>> from Boris pulled on top.
>>>>>>> Unfortunately it made a VM hang (probably because it's rootFS is
>>>>>>> shuffled from under it's feet
>>>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>>>> Assumption that block-front getting borked and either a kernel crash or
>>>> rootfs becoming mounted readonly. Didn't (try) to check though.
>>>>
>>>>>>> and it gave these in dom0 dmesg:
>>>>>>>
>>>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>>>
>>>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>>
>>>>>>>
>>>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>>>> tried to fix.
>>>>>>>
>>>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>>>> 4.19 release.
>>>>>> At this late in the game, might make more sense to simply revert the
>>>>>> buggy commit. Especially since what is currently out there doesn't fix
>>>>>> the issue for you.
>>>> Don't know if Boris or Juergen have a hunch about the issue, if not
>>>> perhaps a revert is the best.
>>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.
>>
>> Juergen may have something to say by tomorrow, but from my perspective,
>> given that we are coming up on rc6 --- yes.
>>
>> I looked at the patches again and didn't see anything obvious.
>>
>> -boris
>
> Could also be that what i hit is a latent bug,
> that is not caused by these patches but merely got uncovered by them.
>
> xl dmesg also shows quite some:
> (XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
> (XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
> (and has done that for ages on my box not leading to any direct problems to
> my knowledge)
>
> I don't know if there could be related and something around the (persistent)
> grants for block devices could be leaking under some conditions?

I could reproduce the issue Boris has seen and I have found the fault
in his patch. Just testing a fix.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
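
For readers following the thread, the difference between the two purge behaviours described in the quoted commit message can be illustrated with a toy model. This is only a sketch and not the blkfront code: `toy_grant`, `toy_pool`, `pool_init`, `purge_remove`, `purge_keep`, `toy_get_free_grant`, `POOL_SIZE` and `INVALID_REF` are hypothetical names invented here, a fixed array stands in for the driver's grant buffer, and `assert` stands in for the `BUG_ON` mentioned in the commit message. It does not model the fault in the patch that is mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE   4
#define INVALID_REF (-1)   /* hypothetical marker: no grant reference held */

/* Toy stand-in for one grant entry; the real driver structure is not modelled. */
struct toy_grant {
    int gref;
};

/* Toy stand-in for the buffer of free grants. */
struct toy_pool {
    struct toy_grant *free_list[POOL_SIZE];
    size_t count;
};

/* Fill the pool with POOL_SIZE entries that each hold a made-up reference. */
void pool_init(struct toy_pool *p, struct toy_grant *backing)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        backing[i].gref = (int)i + 1;
        p->free_list[i] = &backing[i];
    }
    p->count = POOL_SIZE;
}

/* Behaviour the commit message describes as the problem:
 * the purge releases the references AND drops the entries from the buffer. */
void purge_remove(struct toy_pool *p)
{
    for (size_t i = 0; i < p->count; i++)
        p->free_list[i]->gref = INVALID_REF;
    p->count = 0;
}

/* Behaviour the commit message proposes:
 * release only the references and keep the entries in the buffer. */
void purge_keep(struct toy_pool *p)
{
    for (size_t i = 0; i < p->count; i++)
        p->free_list[i]->gref = INVALID_REF;
}

/* Take one entry from the buffer; the assert plays the role of BUG_ON
 * firing when the buffer is empty. */
struct toy_grant *toy_get_free_grant(struct toy_pool *p)
{
    assert(p->count > 0);
    return p->free_list[--p->count];
}
```

In this model, calling `toy_get_free_grant()` after `purge_remove()` trips the assertion because the buffer is empty, while after `purge_keep()` it returns an entry whose reference is `INVALID_REF` and would have to be granted again before use.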