[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Xen Arm vpl011 UART will cause segmentation fault in Linux guest



On Thu, 10 Nov 2022 12:32:49 -0800 (PST)
Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:

Hi,

> On Wed, 9 Nov 2022, Michal Orzel wrote:
> > Hi Jiamei,
> > 
> > On 09/11/2022 09:25, Jiamei Xie wrote:  
> > > 
> > > 
> > > Hi Michal,
> > > 
> > > Below log can be got when stating the linux guest. It says 9c09 is sbsa. 
> > > And 9c09 is also output
> > >  in bootlogd error message:
> > > Serial: AMBA PL011 UART driver
> > > 9c0b0000.uart: ttyAMA0 at MMIO 0x9c0b0000 (irq = 12, base_baud = 0) is a 
> > > PL011 rev2
> > > printk: console [ttyAMA0] enabled
> > > 9c090000.sbsa-uart: ttyAMA1 at MMIO 0x9c090000 (irq = 15, base_baud = 0) 
> > > is a SBSA
> > >   
> > 
> > Xen behavior is correct and this would be Linux fault to try to write to 
> > DMACR for SBSA UART device.
> > DMACR is just an example. If you try to program e.g. the baudrate (through 
> > LCR) for VPL011 it will
> > also result in injecting abort into the guest. Should Xen support it? No. 
> > The reason why is that
> > it is not spec compliant operation. SBSA specification directly specifies 
> > what registers are exposed.
> > If Linux tries to write to some of the none-spec compliant registers - it 
> > is its fault.  
> 
> Yeah, we need to fix Linux.

Yes, it's a bug in Linux, and nobody noticed because most SBSA UARTs are
actual PL011s, just not with everything wired up and the clocks fixed.

But while you can take pride all day in Xen having a perfect
spec-compliant implementation - and you would be right - you have to face
the reality that existing Linux kernels will crash.
So we will add the one-liner in Linux that fixes that issue, and this will
probably be backported to stable kernels, but you will still encounter
kernels without the fix in the wild.
So I wonder if you should consider to just implement the other PL011
registers as RAZ/WI? That would be spec compliant as well (since an actual
PL011 is also a spec-compliant SBSA-UART), but would work either way. Of
course you don't need to implement the DMA or baudrate functionality, but
at least not be nasty and trap on those register accesses.

Cheers,
Andre

> FYI this is not the first bug in Linux affecting the sbsa-uart driver:
> the issue is that the pl011 driver and the sbsa-uart driver share the
> same code in Linux so it happens sometimes that a pl011-only feature
> creeps into the sbsa-uart driver by mistake.
> 
> 
> > >> -----Original Message-----
> > >> From: Michal Orzel <michal.orzel@xxxxxxx>
> > >> Sent: Wednesday, November 9, 2022 3:40 PM
> > >> To: Jiamei Xie <Jiamei.Xie@xxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> > >> Cc: Wei Chen <Wei.Chen@xxxxxxx>; Bertrand Marquis
> > >> <Bertrand.Marquis@xxxxxxx>; julien@xxxxxxx; sstabellini@xxxxxxxxxx
> > >> Subject: Re: Xen Arm vpl011 UART will cause segmentation fault in Linux
> > >> guest
> > >>
> > >> Hi Jiamei,
> > >>
> > >> On 09/11/2022 08:20, Jiamei Xie wrote:  
> > >>>
> > >>>
> > >>> Hi all,
> > >>>
> > >>> When the guest kernel enables DMA engine with  
> > >> "CONFIG_DMA_ENGINE=y", Linux AMBA PL011 driver will access PL011
> > >> DMACR register. But this register have not been supported by vpl011 of 
> > >> Xen.
> > >> Xen will inject a data abort into guest, this will cause segmentation 
> > >> fault of
> > >> guest with the below message:
> > >> I am quite confused.
> > >> VPL011 implements SBSA UART which only implements some subset of PL011
> > >> operations (SBSA UART is not PL011).
> > >> According to spec (SBSA ver. 6.0), the SBSA_UART does not support DMA
> > >> features so Xen code is fine.
> > >> When Xen exposes vpl011 device to a guest, this device has 
> > >> "arm,sbsa-uart"
> > >> compatible and not "uart-pl011".
> > >> Linux driver "amba-pl011.c" should see this compatible and assign proper
> > >> operations (sbsa_uart_pops instead of amba_pl011_pops) that do not enable
> > >> DMA.
> > >> Maybe the issue is with your configuration?
> > >>
> > >> ~Michal
> > >>  
> > >>> Unhandled fault at 0xffffffc00944d048
> > >>> Mem abort info:
> > >>> ESR = 0x96000000
> > >>> EC = 0x25: DABT (current EL), IL = 32 bits
> > >>> SET = 0, FnV = 0
> > >>> EA = 0, S1PTW = 0
> > >>> FSC = 0x00: ttbr address size fault
> > >>> Data abort info:
> > >>> ISV = 0, ISS = 0x00000000
> > >>> CM = 0, WnR = 0
> > >>> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
> > >>> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803,  
> > >> pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13  
> > >>> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
> > >>> Modules linked in:
> > >>> CPU: 0 PID: 132 Comm: bootlogd Not tainted 5.15.44-yocto-standard #1
> > >>> pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > >>> pc : pl011_stop_rx+0x70/0x80
> > >>> lr : uart_tty_port_shutdown+0x44/0x110
> > >>> sp : ffffffc00999bba0
> > >>> x29: ffffffc00999bba0 x28: ffffff80234ac380 x27: ffffff8022f5d000
> > >>> x26: 0000000000000000 x25: 0000000045585401 x24: 0000000000000000
> > >>> x23: ffffff8021ba4660 x22: 0000000000000001 x21: ffffff8021a0e2a0
> > >>> x20: ffffff802198f880 x19: ffffff8021a0e1a0 x18: 0000000000000000
> > >>> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > >>> x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> > >>> x11: 0000000000000000 x10: 0000000000000000 x9 : ffffffc00871ba14
> > >>> x8 : ffffffc0099de260 x7 : ffffff8021a0e318 x6 : 0000000000000003
> > >>> x5 : ffffffc009315f20 x4 : ffffffc00944d038 x3 : 0000000000000000
> > >>> x2 : ffffffc00944d048 x1 : 0000000000000000 x0 : 0000000000000048
> > >>> Call trace:
> > >>> pl011_stop_rx+0x70/0x80
> > >>> tty_port_shutdown+0x7c/0xb4
> > >>> tty_port_close+0x60/0xcc
> > >>> uart_close+0x34/0x8c
> > >>> tty_release+0x144/0x4c0
> > >>> __fput+0x78/0x220
> > >>> ____fput+0x1c/0x30
> > >>> task_work_run+0x88/0xc0
> > >>> do_notify_resume+0x8d0/0x123c
> > >>> el0_svc+0xa8/0xc0
> > >>> el0t_64_sync_handler+0xa4/0x130
> > >>> el0t_64_sync+0x1a0/0x1a4
> > >>> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
> > >>> ---[ end trace 83dd93df15c3216f ]---
> > >>> note: bootlogd[132] exited with preempt_count 1
> > >>> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-  
> > >> daemon  
> > >>> In Xen, vpl011_mmio_write doesn't handle DMACR . And kernel doesn't  
> > >> check if pl011_write executes sucessfully in pl011_dma_rx_stop . So such
> > >> segmentation fault occurs.  
> > >>> static inline void pl011_dma_rx_stop(struct uart_amba_port *uap)
> > >>> {
> > >>>         /* FIXME.  Just disable the DMA enable */
> > >>>         uap->dmacr &= ~UART011_RXDMAE;
> > >>>         pl011_write(uap->dmacr, uap, REG_DMACR);
> > >>> }
> > >>>
> > >>> I think we should prevent such segmentation fault. We have checked the  
> > >> PL011 spec, it seems there is not any register bit can indicate DMA 
> > >> support
> > >> status of PL011. We might have two options:  
> > >>> 1. Option#1 is to add DMA support for vpl011, but this is not trivial.
> > >>> 2. Option#2 is to ignore the write to DMACR, and return 0 for DMACR 
> > >>> read  
> > >> in vpl011. But this option need co-work with kernel, because current 
> > >> Linux
> > >> PL011 driver assume the write operation will never be failed, and will 
> > >> not
> > >> fallback to no-DMA mode, when Xen return 0 for DMA enabled bit in DMACR. 
> > >>  
> > >>>
> > >>> How do you think about it?  Any suggestion about it is welcome. Thanks.
> > >>>
> > >>> Best wishes
> > >>> Jiamei Xie
> > >>>  
> >   




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.