
Re: [Xen-devel] [PATCH] PCI uart: fix boot hang, and second S3 resume inactive timer list corruption



>>> On 26.08.13 at 13:39, Tomasz Wroblewski <tomasz.wroblewski@xxxxxxxxxx> wrote:
> On 08/26/2013 01:17 PM, Jan Beulich wrote:
>>>>> On 26.08.13 at 11:17, Tomasz Wroblewski <tomasz.wroblewski@xxxxxxxxxx> wrote:
>>> - fix occasional xen boot hang whilst using PCI uart. Dom0 kernel disables
>>>   ioport responses during PCI system initialisation, causing xen hang if
>>>   __ns16550_poll() routine happens to be scheduled during that time. Detect
>>>   and exit. Amended ns16550_ioport_invalid function to only check IER
>>>   register, which contains three reserved (always 0) bits, therefore it's
>>>   sufficient for this test.
>> And this was observed with 4.4-unstable? I'm asking because I
>> would at a first glance have thought that taking care of this
>> ought to be a desirable side effect of calling pci_hide_device().
> This was observed with stable 4.3, which also seems to call
> pci_hide_device(), so I don't think that makes a difference here. Or was
> it fixed later? I'm not entirely sure how pci_hide_device() is supposed to
> work, though: in my dom0, on 4.3, I can see the PCI serial card used
> by the Xen console, so maybe it is buggy (or I misunderstand it).

Wait, yes, pci_ro_device() is what would be needed to drop
Dom0 writes to the device's config space. But we don't want
this if at all possible, as there may be other devices (more
serial ports and/or one or more parallel ports) on the same
card, and we want to allow Dom0 to drive those.

Nevertheless, the approach of your patch in simply giving up
the device (even if only temporarily) looks questionable to me.
We'd rather need to restore full access to it, I would think. But
yes, the hypervisor and Dom0 playing with the same device is
sort of a gray area.
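
To make the "giving up the device temporarily" behaviour concrete, here is a
minimal stand-alone sketch of a poll handler that skips its work while the
port reads back as all ones. It is not the code from the patch: struct
fake_uart, fake_poll() and the counters are hypothetical stand-ins, and
re-arming the timer on every pass is an assumption about how the real
handler would keep polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a UART; this is NOT Xen's struct ns16550. */
struct fake_uart {
    bool         io_decoded;     /* false while Dom0 has I/O decoding off */
    uint8_t      ier;            /* value the device itself holds in IER  */
    unsigned int polls_done;     /* polls that actually touched the UART  */
    unsigned int polls_skipped;  /* polls abandoned: port not responding  */
    unsigned int timer_rearmed;  /* how often the poll timer was re-armed */
};

/* A read from an I/O port that nothing decodes returns all ones. */
uint8_t fake_read_ier(const struct fake_uart *u)
{
    return u->io_decoded ? u->ier : 0xff;
}

bool fake_ioport_invalid(const struct fake_uart *u)
{
    return fake_read_ier(u) == 0xff;
}

/* "Detect and exit": skip the work, but keep the timer running
 * (assumption) so polling resumes once the port responds again. */
void fake_poll(struct fake_uart *u)
{
    if (fake_ioport_invalid(u))
        u->polls_skipped++;
    else
        u->polls_done++;        /* real code would service RX/TX here */

    u->timer_rearmed++;
}

/* Small self-check exercising both paths. */
void fake_poll_demo(void)
{
    struct fake_uart u = { .io_decoded = true, .ier = 0x05 };

    fake_poll(&u);              /* port responds: normal poll   */
    u.io_decoded = false;
    fake_poll(&u);              /* port disabled: poll skipped  */

    assert(u.polls_done == 1);
    assert(u.polls_skipped == 1);
    assert(u.timer_rearmed == 2);
}
```

Note that the sketch only avoids touching the device; it does nothing to
restore access to it, which is the concern raised above.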

>>> +static int ns16550_ioport_invalid(struct ns16550 *uart)
>>> +{
>>> +    return (((unsigned char)ns_read_reg(uart, UART_IER)) == 0xff);
>>> +}
>> Why checking just one register is sufficient when originally
>>
>>> -static int ns16550_ioport_invalid(struct ns16550 *uart)
>>> -{
>>> -    return ((((unsigned char)ns_read_reg(uart, UART_LSR)) == 0xff)&&
>>> -            (((unsigned char)ns_read_reg(uart, UART_MCR)) == 0xff)&&
>>> -            (((unsigned char)ns_read_reg(uart, UART_IER)) == 0xff)&&
>>> -            (((unsigned char)ns_read_reg(uart, UART_IIR)) == 0xff)&&
>>> -            (((unsigned char)ns_read_reg(uart, UART_LCR)) == 0xff));
>>> -}
>> we checked five also needs some better explanation.
> I believe it's enough to test IER register since it contains 3 reserved 
> bits which are always 0 during normal operation, therefore the condition 
> will never hit then. Made this as a mini optimisation since this 
> function would now be called more frequently.

I assumed it was something like this. But that needs to be said in
the patch description.
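
For the description, the argument could be backed by something like the
following stand-alone check. It is only a sketch: SKETCH_IER_RESERVED is a
hypothetical mask, and placing the three reserved bits in the top three
positions (0xE0) is an assumption made for illustration; the thread only
says that three bits always read as 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the three always-zero IER bits are the top three. */
#define SKETCH_IER_RESERVED 0xE0u

/* Same test as the amended ns16550_ioport_invalid(), on a plain byte. */
int sketch_ier_looks_floating(uint8_t ier)
{
    return ier == 0xff;
}

/* Count IER values that a working UART could return (reserved bits
 * clear) and that would nevertheless be mistaken for a dead port. */
int sketch_count_false_positives(void)
{
    int hits = 0;

    for (unsigned int v = 0; v <= 0xff; v++) {
        if ((v & SKETCH_IER_RESERVED) != 0)
            continue;           /* not a value a working UART returns */
        if (sketch_ier_looks_floating((uint8_t)v))
            hits++;
    }
    return hits;
}

void sketch_ier_demo(void)
{
    /* No legal IER value equals 0xff, so one register is enough. */
    assert(sketch_count_false_positives() == 0);
    /* A port that is not decoded reads back as all ones. */
    assert(sketch_ier_looks_floating(0xff));
}
```

The same reasoning holds for any register with at least one bit that always
reads as 0, which is why the other four reads add nothing to the test.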

Jan

