Re: [Xen-devel] [PATCH 26/27 v12] arm/xen: vpl011: Fix the slow early console SBSA UART output
On Tue, Oct 17, 2017 at 07:53:36AM -0500, Rob Herring wrote:
> On Tue, Oct 17, 2017 at 6:19 AM, Dave Martin <Dave.Martin@xxxxxxx> wrote:
> > On Tue, Oct 17, 2017 at 10:51:07AM +0100, Andre Przywara wrote:
> >> Hi Bhupinder,
> >>
> >> first thing: As the bulk of the series has been merged now, please
> >> restart your patch and version numbering, so a (potential) next post
> >> should be prefixed [PATCH v3 1/2]. And please have a cover letter giving
> >> a brief overview what this series fixes.
> >>
> >> On 13/10/17 11:40, Bhupinder Thakur wrote:
> >> > The early console output uses pl011_early_write() to write data. This
> >> > function waits for BUSY bit to get cleared before writing the next byte.
> >>
> >> ... which is questionable given the actual definition of the BUSY bit in
> >> the PL011 TRM:
> >>
> >> ============
> >> .... The BUSY signal goes HIGH as soon as data is written to the
> >> transmit FIFO (that is, the FIFO is non-empty) and remains asserted
> >> HIGH while data is being transmitted. BUSY is negated only when the
> >> transmit FIFO is empty, and the last character has been transmitted from
> >> the shift register, ....
> >> ============
> >>
> >> (I take it you are talking about the Linux driver in a guest here).
> >> I think the early_write routine tries to (deliberately?) ignore the
> >> FIFO, possibly to make sure characters really get pushed out before a
> >> system crashes, maybe.
> >>
> >> >
> >> > In the SBSA UART emulation logic, the BUSY bit was set as soon one
> >> > byte was written in the FIFO and it remained set until the FIFO was
> >> > emptied.
> >>
> >> Which is correct behaviour, as this matches the PL011 TRM as quoted above.
> >>
> >> > This meant that the output was delayed as each character needed
> >> > the BUSY to get cleared.
> >>
> >> But this is true as well!
> >>
> >> > Since the SBSA UART is getting emulated in Xen using ring buffers, it
> >> > ensures that once the data is enqueued in the FIFO, it will be received
> >> > by xenconsole so it is safe to set the BUSY bit only when FIFO becomes
> >> > full. This will ensure that pl011_early_write() is not delayed unduly
> >> > to write the data.
> >>
> >> So I can confirm that this patch fixes the very slow earlycon output
> >> observed with the current staging HEAD.
> >>
> >> So while this is somewhat deviating from the spec, I can see the benefit
> >> for an emulation scenario. I believe that emulations in general might
> >> choose implementing things a bit differently, to cope with the
> >> fundamental differences in their environment, like the virtually endless
> >> "FIFO" and the lack of any timing restrictions on the emulated "wire".
> >>
> >> So unless someone comes up with a better solution, I would support
> >> taking this patch, as this fixes a real problem.
> >
> > I think you get away with this, but it does violate the spec in order
> > to work around a feature of a correctly implemented driver.
> >
> > Software can now see this, for example:
> >
> >     uart_write(ch, UARTDR);
> >     busy = uart_read(UARTFR) & UARTFR_BUSY;
> >     BUG_ON(!(uart_read(UARTFR) & UARTFR_TXFE) && !busy);
> >
> > which violates the spec, though I can't currently think of a good reason
> > for software to rely on that.
> >
> >
> > [+Rob, who wrote the original earlycon code in the amba-pl011 driver:
> > 0d3c673e7881 ("tty/serial: pl011: add generic earlycon support")
> >
> > Is there any actualy reason why we poll for !BUSY after each char in
> > pl011_putc()? pl011_putc() is not exposed at all: it's only called by
> > pl011_console_write().
> >
> > This will result in stuttering output even on hardware, though this
> > doesn't typically matter.
> >
> > I think if the poll for !BUSY were moved to the end of
> > pl011_console_write(), the effect would be much less bad.]
>
> I just copied the code from the arm64 earlyprintk code... Maybe it was
> because on simulation (which was the main platform at the time) folks
> wanted the character "on the wire". It seems to be that you could just
> drop it.

Hmmm, the arm64 earlyprintk code 2475ff9d2c6e ("arm64: Add simple
earlyprintk support") looks to have been derived by Catalin from arm's
assembly printch/printascii implementation, which predates git AFAICT:

(Catalin, please put me right if I misunderstood the history.)

arch/arm/kernel/debug.S:

ENTRY(printascii)
                addruart_current r3, r1, r2
                b       2f
1:              waituart r2, r3
                senduart r1, r3
                busyuart r2, r3
                teq     r1, #'\n'
                moveq   r1, #'\r'
                beq     1b
2:              teq     r0, #0
                ldrneb  r1, [r0], #1
                teqne   r1, #0
                bne     1b
                ret     lr
ENDPROC(printascii)

ENTRY(printch)
                addruart_current r3, r1, r2
                mov     r1, r0
                mov     r0, #0
                b       1b
ENDPROC(printch)

Russell, do you know why we wait for the UART transmitter to go
completely idle before queueing a new char?

For an individual printch this can make sense, but it also introduces
delay for every char in printascii.

This seems to interact interestingly with virtualised UARTs, because we
may thrash between the guest and hypervisor per-char, though there may
be a way to reduce the impact of this on the emulation side.
(See above for some context)

In the pl011 earlycon code that was derived from arm64 earlycon (the
latter now deceased), pl011_putc() is not exposed at all and polling for
!BUSY other than at the end of pl011_early_write() seems unnecessary...

Crashing the platform so hard that the PL011 stops transmitting is
likely to be challenging -- e.g., turning off some clock or regulator,
making the bus lock up etc. None of these is likely to be triggered by
pl011_early_write() itself.

Cheers
---Dave

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
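
For illustration only, here is a toy C model of the point about where to
poll for !BUSY. It is a sketch under stated assumptions, not the
amba-pl011 driver and not the Xen vpl011 emulation: struct toy_uart,
toy_read_fr(), toy_write_dr(), the two write helpers and the FIFO depth
are all hypothetical. The flag bit positions follow the PL011 flag
register. Each flag register read is counted (in a guest, each one could
be a trap to the hypervisor) and lets one queued character drain.

```c
#include <assert.h>
#include <stdint.h>

/* PL011 flag register (UARTFR) bits. */
#define UARTFR_BUSY (1u << 3)
#define UARTFR_TXFF (1u << 5)
#define UARTFR_TXFE (1u << 7)

#define TOY_FIFO_DEPTH 16u      /* hypothetical depth for this model */

/* Toy UART state: an assumption for illustration, not real hardware. */
struct toy_uart {
    unsigned int fifo_level;    /* characters queued, not yet sent */
    unsigned int fr_reads;      /* flag register reads so far */
    unsigned int sent;          /* characters that have left the FIFO */
};

/* Report the flags for the current level, then let one character drain.
 * BUSY follows the TRM wording: set while the FIFO is non-empty. */
static uint32_t toy_read_fr(struct toy_uart *u)
{
    uint32_t fr = 0;

    u->fr_reads++;
    if (u->fifo_level == 0)
        fr |= UARTFR_TXFE;
    else
        fr |= UARTFR_BUSY;
    if (u->fifo_level == TOY_FIFO_DEPTH)
        fr |= UARTFR_TXFF;

    if (u->fifo_level > 0) {
        u->fifo_level--;
        u->sent++;
    }
    return fr;
}

/* Queue one character; the caller must have seen !TXFF first. */
static void toy_write_dr(struct toy_uart *u)
{
    assert(u->fifo_level < TOY_FIFO_DEPTH);
    u->fifo_level++;
}

static void toy_putc(struct toy_uart *u, int wait_idle)
{
    while (toy_read_fr(u) & UARTFR_TXFF)
        ;                       /* wait for FIFO space */
    toy_write_dr(u);
    if (wait_idle)
        while (toy_read_fr(u) & UARTFR_BUSY)
            ;                   /* wait for the transmitter to go idle */
}

/* Poll for !BUSY after every character. Returns total flag reads. */
static unsigned int write_poll_each_char(struct toy_uart *u, const char *s)
{
    for (; *s; s++)
        toy_putc(u, 1);
    return u->fr_reads;
}

/* Poll for !BUSY once, after the whole string. Returns total flag reads. */
static unsigned int write_poll_at_end(struct toy_uart *u, const char *s)
{
    for (; *s; s++)
        toy_putc(u, 0);
    while (toy_read_fr(u) & UARTFR_BUSY)
        ;
    return u->fr_reads;
}
```

In this model a six-character string costs 18 flag reads with the
per-character poll and 8 with a single poll at the end. The absolute
numbers come from the toy drain rate and say nothing about real hardware
or Xen; only the direction of the difference is the point.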