Re: [Xen-devel] mpt3sas bug with Debian jessie kernel only under Xen - "swiotlb buffer is full"
Hi Andrew,

On Sun, Dec 04, 2016 at 03:59:20PM +0000, Andrew Cooper wrote:
> On 04/12/16 08:32, Andy Smith wrote:
> > Under the Debian jessie amd64 kernel (linux-image-3.16.0-4-amd64
> > 3.16.36-1+deb8u2) running under Xen, I cannot put the system's
> > storage under heavy load without receiving a bunch of "swiotlb
> > buffer is full" kernel error messages and severely degraded
> > performance. Sometimes the system panics and reboots itself.

[…]

> Can you try these two patches from the XenServer Patch queue?
> https://github.com/xenserver/linux-3.x.pg/blob/master/master/series#L613-L614

Looking good. Using those patches I'm ~20 minutes into this now:

Every 2.0s: cat /proc/mdstat                    Tue Dec 6 02:16:40 2016

Personalities : [raid1] [raid10]
md5 : active raid10 sdb[0] sda[1]
      1875243008 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
      [==>..................]  check = 11.5% (217058176/1875243008) finish=133.9min speed=206252K/sec
      bitmap: 0/14 pages [0KB], 65536KB chunk

md4 : active raid10 sdc[0] sdd[1]
      3906886656 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
      [>....................]  check = 2.6% (102650880/3906886656) finish=674.4min speed=94007K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk

…where previously it would have given kernel errors within 5
seconds, so I think that fixes it. I will have to perform some more
strenuous testing.

Those two patches did not apply cleanly to source of
linux-image-3.16.0-4-amd64 3.16.36-1+deb8u2. The last bit of each
patch was rejected, so I removed them and put them into a separate
patch file (0003-fixup.patch attached).

I have not done this process in a long time so just for the
archives, my process was as per:

    https://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official

# mkdir -p /data/debian
# chown andy: /data/debian
# apt-get install build-essential fakeroot
# apt-get build-dep linux
$ cd /data/debian
$ apt-get source linux
$ wget https://raw.githubusercontent.com/xenserver/linux-3.x.pg/master/master/0001-dma-add-dma_get_required_mask_from_max_pfn.patch
$ wget https://raw.githubusercontent.com/xenserver/linux-3.x.pg/master/master/0002-x86-xen-correct-dma_get_required_mask-for-Xen-PV-gue.patch
$ # remove last parts of each patch file, create 0003-fixup.patch that performs equivalent changes
$ cd linux-3.16.36
$ # applying these patches is going to change symbols so changing the abiname
$ # is necessary.
$ # See https://kernel-handbook.alioth.debian.org/ch-versions.html#s-abi-name
$ sed -i -e 's/^abiname: 4/abiname: 4bf/' debian/config/defines
$ fakeroot debian/rules debian/control-real
$ bash debian/bin/test-patches -f amd64 ../0001-dma-add-dma_get_required_mask_from_max_pfn.patch ../0002-x86-xen-correct-dma_get_required_mask-for-Xen-PV-gue.patch ../0003-fixup.patch
# dpkg -i ../linux-headers-3.16.0-4bf-amd64_3.16.36-1+deb8u2a~test_amd64.deb ../linux-image-3.16.0-4bf-amd64_3.16.36-1+deb8u2a~test_amd64.deb

boot into new kernel under Xen

$ uname -a
Linux elephant 3.16.0-4bf-amd64 #1 SMP Debian 3.16.36-1+deb8u2a~test (2016-12-05) x86_64 GNU/Linux

I think my next steps should be:

1. Do some more strenuous testing

2. Report bug against source package "linux" in Debian jessie with
   pointer to those two patches.

3. Check if those fixes are already applied in Debian backports
   and/or Debian testing linux package.

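For step 3, a rough sketch of one way to check whether an unpacked
kernel source tree already carries the first fix. It assumes the patch
introduces a function named dma_get_required_mask_from_max_pfn (inferred
from the patch file name only, not checked against the patch contents).
The helper name has_symbol and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if any file under the given
# source tree mentions the symbol as a whole word, fail otherwise.
has_symbol() {
    tree=$1
    symbol=$2
    # -r: recurse, -q: quiet, -w: whole-word match, -s: hide read errors
    grep -rqws -- "$symbol" "$tree"
}

# Example use (path and symbol name are assumptions, see above):
#   if has_symbol /data/debian/linux-3.16.36 dma_get_required_mask_from_max_pfn
#   then echo "fix appears to be present"
#   else echo "fix not found"
#   fi
```

A match only shows that the symbol name occurs somewhere in the tree; it
does not prove that both patches were applied in full, so comparing
against the patch hunks would still be needed.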
> > Dec 4 07:06:00 elephant kernel: [22019.373653] mpt3sas 0000:01:00.0:
> > swiotlb buffer is full (sz: 57344 bytes)
> > Dec 4 07:06:00 elephant kernel: [22019.374707] mpt3sas 0000:01:00.0:
> > swiotlb buffer is full
> > Dec 4 07:06:00 elephant kernel: [22019.375754] BUG: unable to handle
> > kernel NULL pointer dereference at 0000000000000010
> > Dec 4 07:06:00 elephant kernel: [22019.376430] IP: [<ffffffffa004e779>]
> > _base_build_sg_scmd_ieee+0x1f9/0x2d0 [mpt3sas]
> > Dec 4 07:06:00 elephant kernel: [22019.377122] PGD 0
>
> This alone is a clear error handling bug in the mpt3sas driver. It
> hasn't checked the DMA mapping call for a successful mapping before
> following the NULL pointer it got given back. It is collateral damage
> from the swiotlb buffer being full, but a bug none the less.

Does that require reporting as an upstream linux bug in mpt3sas
then?

Thanks for your help.

Cheers,
Andy

Attachment: 0003-fixup.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel