
Re: Troubles running Xen on Raspberry Pi 4 with 5.6.1 DomU



On Mon, May 4, 2020 at 8:52 PM Stefano Stabellini
<sstabellini@xxxxxxxxxx> wrote:
>
> On Mon, 4 May 2020, Roman Shaposhnik wrote:
> > Hi Julien,
> >
> > thanks for your patch -- just like Corey I tried it out and it seems to
> > work fine and gets me further. At this point, I'm pretty sure I'm past
> > the initial bootstrapping issues and into what can basically be
> > described as a Xen DMA issue of some kind (so I'm pretty sure I will
> > need Stefano's help to debug this further). I'm attaching verbose logs,
> > but the culprit seems to be:
> >
> > [    2.534292] Unable to handle kernel paging request at virtual address 000000000026c340
> > [    2.542373] Mem abort info:
> > [    2.545257]   ESR = 0x96000004
> > [    2.548421]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [    2.553877]   SET = 0, FnV = 0
> > [    2.557023]   EA = 0, S1PTW = 0
> > [    2.560297] Data abort info:
> > [    2.563258]   ISV = 0, ISS = 0x00000004
> > [    2.567208]   CM = 0, WnR = 0
> > [    2.570294] [000000000026c340] user address but active_mm is swapper
> > [    2.576783] Internal error: Oops: 96000004 [#1] SMP
> > [    2.581784] Modules linked in:
> > [    2.584950] CPU: 3 PID: 135 Comm: kworker/3:1 Not tainted 5.6.1-default #9
> > [    2.591970] Hardware name: Raspberry Pi 4 Model B (DT)
> > [    2.597256] Workqueue: events deferred_probe_work_func
> > [    2.602509] pstate: 60000005 (nZCv daif -PAN -UAO)
> > [    2.607431] pc : xen_swiotlb_free_coherent+0x198/0x1d8
> > [    2.612696] lr : dma_free_attrs+0x98/0xd0
> > [    2.616827] sp : ffff800011db3970
> > [    2.620242] x29: ffff800011db3970 x28: 00000000000f7b00
> > [    2.625695] x27: 0000000000001000 x26: ffff000037b68410
> > [    2.631129] x25: 0000000000000001 x24: 00000000f7b00000
> > [    2.636583] x23: 0000000000000000 x22: 0000000000000000
> > [    2.642017] x21: ffff800011b0d000 x20: ffff80001179b548
> > [    2.647461] x19: ffff000037b68410 x18: 0000000000000010
> > [    2.652905] x17: ffff000035d66a00 x16: 00000000deadbeef
> > [    2.658348] x15: ffffffffffffffff x14: ffff80001179b548
> > [    2.663792] x13: ffff800091db37b7 x12: ffff800011db37bf
> > [    2.669236] x11: ffff8000117c7000 x10: ffff800011db3740
> > [    2.674680] x9 : 00000000ffffffd0 x8 : ffff80001071e980
> > [    2.680124] x7 : 0000000000000132 x6 : ffff80001197a6ab
> > [    2.685568] x5 : 0000000000000000 x4 : 0000000000000000
> > [    2.691012] x3 : 00000000f7b00000 x2 : fffffdffffe00000
> > [    2.696465] x1 : 000000000026c340 x0 : 000002000046c340
> > [    2.701899] Call trace:
> > [    2.704452]  xen_swiotlb_free_coherent+0x198/0x1d8
> > [    2.709367]  dma_free_attrs+0x98/0xd0
> > [    2.713143]  rpi_firmware_property_list+0x1e4/0x240
> > [    2.718146]  rpi_firmware_property+0x6c/0xb0
> > [    2.722535]  rpi_firmware_probe+0xf0/0x1e0
> > [    2.726760]  platform_drv_probe+0x50/0xa0
> > [    2.730879]  really_probe+0xd8/0x438
> > [    2.734567]  driver_probe_device+0xdc/0x130
> > [    2.738870]  __device_attach_driver+0x88/0x108
> > [    2.743434]  bus_for_each_drv+0x78/0xc8
> > [    2.747386]  __device_attach+0xd4/0x158
> > [    2.751337]  device_initial_probe+0x10/0x18
> > [    2.755649]  bus_probe_device+0x90/0x98
> > [    2.759590]  deferred_probe_work_func+0x88/0xd8
> > [    2.764244]  process_one_work+0x1f0/0x3c0
> > [    2.768369]  worker_thread+0x138/0x570
> > [    2.772234]  kthread+0x118/0x120
> > [    2.775571]  ret_from_fork+0x10/0x18
> > [    2.779263] Code: d34cfc00 f2dfbfe2 d37ae400 8b020001 (f8626800)
> > [    2.785492] ---[ end trace 4c435212e349f45f ]---
> > [    2.793340] usb 1-1: New USB device found, idVendor=2109, idProduct=3431, bcdDevice= 4.20
> > [    2.801038] usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0
> > [    2.808297] usb 1-1: Product: USB2.0 Hub
> > [    2.813710] hub 1-1:1.0: USB hub found
> > [    2.817117] hub 1-1:1.0: 4 ports detected
> >
> > This is bailing out right here:
> >      
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/firmware/raspberrypi.c?h=v5.6.1#n125
> >
> > FWIW (since I modified the source to actually print what was returned
> > right before it bails) we get:
> >    buf[1] == 0x800000004
> >    buf[2] == 0x00000001
> >
> > Status 0x800000004 is of course suspicious since it is not even listed here:
> >     
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/soc/bcm2835/raspberrypi-firmware.h#n14
> >
> > So it appears that this DMA request path is somehow busted and it
> > would be really nice to figure out why.
>
> You have actually discovered a genuine bug in the recent xen dma rework
> in Linux. Congrats :-)

Nice! ;-)

> I am doing some guesswork here, but from what I read in the thread and
> the information in this email I think this patch might fix the issue.
> If it doesn't fix the issue please add a few printks in
> drivers/xen/swiotlb-xen.c:xen_swiotlb_free_coherent and please let me
> know where exactly it crashes.
>
>
> diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
> index b9cc11e887ed..ff4677ed9788 100644
> --- a/include/xen/arm/page-coherent.h
> +++ b/include/xen/arm/page-coherent.h
> @@ -8,12 +8,17 @@
>  static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
>                 dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
>  {
> +       void *cpu_addr;
> +       if (dma_alloc_from_dev_coherent(hwdev, size, dma_handle, &cpu_addr))
> +               return cpu_addr;
>         return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
>  }
>
>  static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
>                 void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
>  {
> +       if (dma_release_from_dev_coherent(hwdev, get_order(size), cpu_addr))
> +               return;
>         dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
>  }

I applied the patch, but it didn't help. After adding printk's, it turns
out that it crashes, surprisingly, right inside this (rather convoluted,
if you ask me) if statement:
    
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/xen/swiotlb-xen.c?h=v5.6.1#n349

So it makes sense that the patch didn't help -- we never reach the call
to xen_free_coherent_pages.

Thanks,
Roman.



 

