
Re: [Xen-devel] Problems after enabling rcv/xmit interrupts of ns16550 on OMAP5



On Jul 18, 2013, at 7:53 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:

> On Thu, 2013-07-18 at 00:05 +0800, Chen Baozi wrote:
>> On 2013-7-17, at 23:26, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>> 
>>>>>> "restoring CPSR" refers to the instruction "msr CPSR_c, <reg>" which
>>>>>> is from "local_irq_restore". And "cpsie i" is from the call to
>>>>>> "local_irq_enable".
>>>>> 
>>>>> Ah right. So in both cases you will immediately take any pending
>>>>> interrupt. I think I would continue instrumenting starting from
>>>>> gic_interrupt() and hopefully eventually into the ns16550 interrupt
>>>>> handler.
>>>> 
>>>> I went through gic_interrupt() and think I found the points that cause
>>>> it to get stuck.
>>> 
>>> Please can you clarify exactly what you mean by "stuck". Previously you
>>> thought it was stuck in ns16550_setup_postirq when in actual fact it was
>>> taking an interrupt.
>> 
>> I thought it was "stuck" because every time I pressed 'd' to dump the
>> registers, the PC always stayed at the same position while executing
>> ns16550_setup_postirq. So it really looked as though the system got
>> stuck at that point. Sorry if I described it wrongly.
> 
> No problem. In fact if 'd' works perhaps you are not blowing the stack
> at all with multiple interrupts.
> 
> Ah, you are probably never escaping the loop in gic_interrupt because
> the read of IAR always returns the UART interrupt.
> 
>> 
>>> Are you sure that you are taking multiple,
>>> potentially nested interrupts and eventually blowing the hypervisor
>>> stack? This seems like the most likely scenario to me.
>> 
>> Seems reasonable. Is there any way to prove that we are under this
>> situation? I didn't expect this possibility before. Thanks.
> 
> I was about to say that a printk in gic_interrupt ought to confirm, but
> since the UART IRQ is the problem perhaps that isn't so obvious, unless
> sync_console helps in some way. Worth a try.
> 
> If not, then since 'd' works perhaps you could keep a count of the
> number of serial IRQs in a global var and dump it?

Thanks. I'll have a try.
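
For reference, the counting idea could be sketched roughly as below. This
is a standalone illustration rather than Xen code: serial_irq_count,
count_serial_irq and dump_serial_irq_count are made-up names, and
snprintf stands in for whatever the debug-key path would print with.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debug counter; not part of the existing ns16550 driver. */
static volatile unsigned long serial_irq_count;

/* Intended to be called at the top of the UART interrupt handler. */
void count_serial_irq(void)
{
    serial_irq_count++;
}

/*
 * Intended to be called from the register-dump path ('d').
 * Formats the current count into buf and returns the value.
 */
unsigned long dump_serial_irq_count(char *buf, size_t len)
{
    unsigned long n = serial_irq_count;

    assert(buf != NULL && len > 0);
    snprintf(buf, len, "serial IRQs taken: %lu\n", n);
    return n;
}
```

If the count keeps climbing between two dumps while nothing is being
typed or printed, that would suggest the UART interrupt line is never
being deasserted.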

> 
>> 
>>> 
>>>> If I change the while(...) in ns16550_interrupt() into if(...) and comment
>>>> out either "GICC[GICC_EOIR] = irq;" or "GICC[GICC_DIR] = irq;" in
>>>> gic_host_irq_end(), it won't get stuck after enabling receive and transmit
>>>> interrupts in ns16550_setup_postirq().
>>> 
>>> By removing the writes to either EOIR or DIR you are in effect never
>>> unmasking the interrupt, so you avoid the nested interrupt problem.
>>> 
>>> If this is the case then the real issue is perhaps that for whatever
>>> reason ns16550_interrupt is not causing the hardware to deassert its
>>> interrupt line.
>>> 
>>> The UART on the sunxi is compatible (in DTS terms) with
>>> "snps,dw-apb-uart", which seems to be an 8250 variant, but one which
>>> differs enough to warrant its own compatibility string -- perhaps Xen's
>>> ns16550 driver isn't dealing with some quirk of this device?
>> 
>> I checked my OMAP5's data sheet. Generally, they look very similar.
>> But I will read the manual more carefully again tomorrow to make sure
>> of this point.
> 
> Good idea.
> 
>> 
>>> 
>>> It seems like the driver in Linux is drivers/tty/serial/8250/8250_dw.c.
>>> dw8250_handle_irq looks interesting...
>>> 
>>>       struct dw8250_data *d = p->private_data;
>>>       unsigned int iir = p->serial_in(p, UART_IIR);
>>> 
>>>       if (serial8250_handle_irq(p, iir)) {
>>>               return 1;
>>>       } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
>>>               /* Clear the USR and write the LCR again. */
>>>               (void)p->serial_in(p, DW_UART_USR);
>>>               p->serial_out(p, UART_LCR, d->last_lcr);
>>> 
>>>               return 1;
>>>       }
>>> 
>>>       return 0;
>>> 
>>> In particular the fallback code there when the common 8250 handler
>>> didn't deal with the issue...
>> 
>> I'll dig into the Linux driver tomorrow to see whether I can find the
>> relevant point.
> 
> Actually, the comment at the top is interesting:
>  * The Synopsys DesignWare 8250 has an extra feature whereby it detects if the
>  * LCR is written whilst busy.  If it is, then a busy detect interrupt is
>  * raised, the LCR needs to be rewritten and the uart status register read.
> 
> I'm not sure that "extra feature" doesn't mean "weird quirk" but there we
> go ;-)
> 
> The changelog of the patch which added it is interesting too:
> http://permalink.gmane.org/gmane.linux.serial/5855

I checked the Linux driver today. Since the UART of my OMAP5432 board is
compatible with "ti,omap4-uart", the driver in Linux should be
drivers/tty/serial/omap-serial.c rather than drivers/tty/serial/8250/8250_dw.c.

In serial_omap_irq of omap-serial.c, there is no fallback code like
DesignWare's. However, it does check the modem status register. I thought
this might be the key, because a "Modem Status" interrupt must be cleared by
reading the modem status register. However, it seems that reading this
register doesn't help :-(
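
To make the idea concrete, here is a rough sketch of a handler loop that
reads IIR and clears each source by reading the matching register, with a
bound so it cannot spin forever. It is an illustration only, not the Xen
or Linux code. The offsets and IIR encodings are the standard 16550 ones
and the OMAP UART may differ. uart_service_irq, the budget parameter, and
the fake_uart model are hypothetical names added for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 16550 register offsets (assumed; check the OMAP manual). */
#define UART_RBR 0x00
#define UART_IIR 0x02
#define UART_LSR 0x05
#define UART_MSR 0x06

/* Standard 16550 IIR encodings. */
#define UART_IIR_NOINT   0x01 /* set when no interrupt is pending */
#define UART_IIR_ID_MASK 0x0e
#define UART_IIR_MSI     0x00 /* modem status: cleared by reading MSR */
#define UART_IIR_THRI    0x02 /* THR empty: cleared by reading IIR */
#define UART_IIR_RDI     0x04 /* receive data: cleared by reading RBR */
#define UART_IIR_RLSI    0x06 /* line status: cleared by reading LSR */
#define UART_IIR_CTI     0x0c /* character timeout: cleared by reading RBR */

typedef uint8_t (*uart_read_fn)(void *ctx, unsigned int reg);

/*
 * Service pending sources until IIR reports none, or until 'budget'
 * passes are used up. Returns the number of sources serviced, or -1 if
 * the budget ran out before IIR reported "no interrupt pending".
 */
int uart_service_irq(uart_read_fn rd, void *ctx, int budget)
{
    int handled = 0;

    assert(budget > 0);
    while (budget-- > 0) {
        uint8_t iir = rd(ctx, UART_IIR);

        if (iir & UART_IIR_NOINT)
            return handled;

        switch (iir & UART_IIR_ID_MASK) {
        case UART_IIR_RLSI: (void)rd(ctx, UART_LSR); break;
        case UART_IIR_RDI:
        case UART_IIR_CTI:  (void)rd(ctx, UART_RBR); break;
        case UART_IIR_MSI:  (void)rd(ctx, UART_MSR); break;
        case UART_IIR_THRI: break; /* the IIR read above cleared it */
        default:            break; /* unknown source: nothing to clear */
        }
        handled++;
    }
    return -1;
}

/* Tiny fake device, only so the sketch can be exercised off-target. */
struct fake_uart {
    uint8_t iir;   /* value the IIR currently reports */
    int     stuck; /* nonzero: reading MSR never clears the source */
};

uint8_t fake_read(void *ctx, unsigned int reg)
{
    struct fake_uart *u = ctx;

    if (reg == UART_IIR)
        return u->iir;
    if (reg == UART_MSR && u->iir == UART_IIR_MSI && !u->stuck)
        u->iir = UART_IIR_NOINT;
    return 0;
}
```

With stuck set, the fake device behaves like what I am seeing: the
source never clears, so only the budget ends the loop. On real hardware
a -1 return would be the place to print the raw IIR value.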

Baozi



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

