
Re: [Xen-devel] linux-3.9-rc0 regression from 3.8 SATA controller not detected under xen



Wednesday, February 27, 2013, 11:22:18 PM, you wrote:


> On 2/27/2013 3:41 PM, Sander Eikelenboom wrote:
>> Wednesday, February 27, 2013, 8:28:10 PM, you wrote:
>>
>>> On Wed, Feb 27, 2013 at 06:50:59PM +0100, Sander Eikelenboom wrote:
>>>> Wednesday, February 27, 2013, 1:54:31 PM, you wrote:
>>>>
>>>>>>>> On 27.02.13 at 12:46, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>>>>>>    [   89.338827] ahci: probe of 0000:00:11.0 failed with error -22
>>>>> Which is -EINVAL. With nothing else printed, I'm afraid you need to
>>>>> find the origin of this return value by instrumenting the involved
>>>>> call tree.
>>>> Just wondering, are multiple MSIs per device actually supported by Xen?
>>> That is a very good question. I know we support MSI-X b/c 1Gb and 10Gb NICs
>>> use them and they work great with Xen.
>>> BTW, this is the merge:
>>> commit 5800700f66678ea5c85e7d62b138416070bf7f60
>>> Merge: 266d7ad af8d102
>>> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>> Date:   Tue Feb 19 19:07:27 2013 -0800
>>>      Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>      
>>>      Pull x86/apic changes from Ingo Molnar:
>>>       "Main changes:
>>>      
>>>         - Multiple MSI support added to the APIC, PCI and AHCI code - acked
>>>           by all relevant maintainers, by Alexander Gordeev.
>>>      
>>>           The advantage is that multiple AHCI ports can have multiple MSI
>>>           irqs assigned, and can thus spread to multiple CPUs.
>>>      
>>>           [ Drivers can make use of this new facility via the
>>>             pci_enable_msi_block_auto() method ]
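
(For context, my rough reading of how ahci uses that new call in 3.9-rc, sketched
from the ahci_init_interrupts path rather than copied verbatim, with the fallback
handling as I remember it:)

    static int ahci_init_interrupts(struct pci_dev *pdev,
                                    struct ahci_host_priv *hpriv)
    {
            unsigned int maxvec;
            int rc;

            if (!(hpriv->flags & AHCI_HFLAG_NO_MSI)) {
                    /* ask for as many MSIs as the device advertises */
                    rc = pci_enable_msi_block_auto(pdev, &maxvec);
                    if (rc > 0) {
                            /* got everything, or just a single vector */
                            if ((rc == maxvec) || (rc == 1))
                                    return rc;
                            /* partial allocation: fall back to one MSI */
                            pci_disable_msi(pdev);
                            if (!pci_enable_msi(pdev))
                                    return 1;
                    }
            }

            /* no MSI at all: use legacy INTx */
            pci_intx(pdev, 1);
            return 0;
    }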
>>
>>
>>> With MSI per device, the hypercall that ends up happening is:
>>> PHYSDEVOP_map_pirq with:
>>>     map_irq.domid = domid;
>>>     map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;
>>>     map_irq.index = -1;
>>>     map_irq.pirq = -1;
>>>     map_irq.bus = dev->bus->number |
>>>                   (pci_domain_nr(dev->bus) << 16);
>>>     map_irq.devfn = dev->devfn;
>>> Which would imply that we are doing this call multiple times?
>>> (This is xen_initdom_setup_msi_irqs).
>>> It looks like pci_enable_msi_block_auto is the multiple MSI one
>>> and it should percolate down to xen_initdom_setup_msi_irqs.
>>> Granted the xen_init.. does not do anything with the 'nvec' argument.
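
(So per msi_desc the mapping boils down to roughly this, from memory and not
verbatim; note that nvec never enters the picture:)

    list_for_each_entry(msidesc, &dev->msi_list, list) {
            struct physdev_map_pirq map_irq;

            /* fields filled in as listed above; nvec is never consulted */
            ...
            ret = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
            ...
    }
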
>>> So could I ask you to try out your hunch by doing three things:
>>>   1). Instrument xen_initdom_setup_msi_irqs to see if the
>>>       nvec has anything but 1 and in its loop instrument to
>>>       see if it has more than one MSI attribute?
>>>   2). The ahci driver has ahci_init_interrupts which only does
>>>     the multiple MSI thing if AHCI_HFLAG_NO_MSI is not set.
>>>      If you edit drivers/ata/ahci.c's ahci_port_info for the SB600 (or 700?)
>>>      to have the AHCI_HFLAG_NO_MSI flag (you probably want to do this
>>>      separately from 1).
>>>   3). Checkout before merge 5800700f66678ea5c85e7d62b138416070bf7f60
>>>      and try 266d7ad7f4fe2f44b91561f5b812115c1b3018ab?
>>
>> So of interest are commits:
>> - 5ca72c4f7c412c2002363218901eba5516c476b1
>> - 08261d87f7d1b6253ab3223756625a5c74532293
>> - 51906e779f2b13b38f8153774c4c7163d412ffd9
>>
>> Hmmm reading the commit message of 51906e779f2b13b38f8153774c4c7163d412ffd9:
>>
>> x86/MSI: Support multiple MSIs in presence of IRQ remapping
>>
>> The MSI specification has several constraints in comparison with
>> MSI-X, most notable of them is the inability to configure MSIs
>> independently. As a result, it is impossible to dispatch
>> interrupts from different queues to different CPUs. This
>> largely devalues the support of multiple MSIs in SMP systems.
>>
>> Also, a necessity to allocate a contiguous block of vector
>> numbers for devices capable of multiple MSIs might cause a
>> considerable pressure on x86 interrupt vector allocator and
>> could lead to fragmentation of the interrupt vectors space.
>>
>> This patch overcomes both drawbacks in presence of IRQ remapping
>> and lets devices take advantage of multiple queues and per-IRQ
>> affinity assignments.
>>
>> At least that makes clear why baremetal does boot and Xen doesn't:
>>
>> Baremetal behaves differently and thus boots because interrupt remapping
>> gets disabled on boot by the kernel IOMMU code due to the buggy-BIOS IOMMU
>> errata, so according to the commit message above it doesn't even try the
>> multiple-MSIs-per-device scenario.
>>
>> So the question is whether it can be enabled under Xen (and whether it would
>> actually be beneficial, because the commit message above seems to indicate
>> that is questionable).
>> If not, the check in arch/x86/kernel/apic/io_apic.c:setup_msi_irqs should
>> make the request fail.
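
(The check I am referring to is essentially this, if I remember io_apic.c right:)

    /* without IRQ remapping, a multiple-MSI request is simply refused */
    if (type == PCI_CAP_ID_MSI && nvec > 1)
            return 1;
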
> Except that function is not run under Xen. That is b/c
> x86_msi_ops.setup_msi_irqs ends up pointing to xen_initdom_setup_msi_irqs,
> while if the IOMMU is enabled it gets set to irq_remapping_setup_msi_irqs.
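
(Right, i.e. the override happens when dom0 boots, roughly like this in
arch/x86/pci/xen.c, from memory:)

    int __init pci_xen_initial_domain(void)
    {
    #ifdef CONFIG_PCI_MSI
            /* route all MSI setup through the Xen PHYSDEVOP path */
            x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
            x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
    #endif
            ...
    }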

> So a fix like this:
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index 56ab749..47f8cca 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -263,6 +263,9 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>         int ret = 0;
>         struct msi_desc *msidesc;
>
> +       if (type == PCI_CAP_ID_MSI && nvec > 1)
> +               return 1;
> +
>         list_for_each_entry(msidesc, &dev->msi_list, list) {
>                 struct physdev_map_pirq map_irq;
>                 domid_t domid;


> (sorry about the paste getting messed up here) - that ought to do it? For
> example, on one of my AMD machines there is no IOMMU, and that is where AHCI
> does work under baremetal but not under Xen.

Yes it boots again :-)
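
(The 'SE |' prefixes and the nvec/type lines below come from throw-away debug
instrumentation, roughly a dev_info at the top of xen_initdom_setup_msi_irqs
like this, reconstructed rather than the exact hunk:)

    dev_info(&dev->dev, "xen_initdom_setup_msi_irqs nvec: %d type:%d\n",
             nvec, type);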

[   37.742109] SE | bus: 'pci': really_probe: probing driver ahci with device 0000:00:11.0
[   37.773491] really_probe: pinctrl_bind_pins(0000:00:11.0) ret: 0
[   37.798862] ahci 0000:00:11.0: SE | ahci_init_one start
[   37.822040] ahci 0000:00:11.0: version 3.0
[   37.841606] xen: registering gsi 19 triggering 0 polarity 1
[   37.865577] xen: --> pirq=19 -> irq=19 (gsi=19)
[   37.913087] ahci 0000:00:11.0: SE | pcim_enable_device(pdev) rc:0
[   37.938519] ahci 0000:00:11.0: SE pcim_iomap_regions_request_all(pdev, 1 << ahci_pci_bar, DRV_NAME)  rc:0
[   37.974447] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 4 type:5
[   38.001806] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 1 type:5
[   38.029026] ahci 0000:00:11.0: SE pci_enable_msi_block_auto(pdev, &maxvec) rc:1
[   38.057960] ahci 0000:00:11.0: SE | n_msis: 1
[   38.078065] ahci 0000:00:11.0: SE | ahci_configure_dma_masks(pdev, hpriv->cap & HOST_CAP_64)  rc:0
[   38.112045] ahci 0000:00:11.0: SE | ahci_pci_reset_controller(host)  rc:0
[   38.139426] ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf impl SATA mode
[   38.170664] ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
[   38.201684] ahci 0000:00:11.0: SE | me here 1
[   38.221977] ahci 0000:00:11.0: SE | me here 2
[   38.244756] scsi0 : ahci
[   38.259700] scsi1 : ahci
[   38.274411] scsi2 : ahci
[   38.289278] scsi3 : ahci
[   38.303718] ata1: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff100 irq 121
[   38.332566] ata2: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff180 irq 121
[   38.361366] ata3: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff200 irq 121
[   38.390080] ata4: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff280 irq 121
[   38.418787] really_probe: dev->bus->probe(0000:00:11.0) ret: 0
[   38.442420] really_probe: 0000:00:11.0 done ret: 1
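
(The two xen_initdom_setup_msi_irqs calls, first with nvec 4 and then with nvec 1,
plus the rc:1 / n_msis: 1 outcome, are the retry loop inside pci_enable_msi_block_auto
at work; roughly, as I read commit 08261d87:)

    /* a positive return from the arch setup code means "could not allocate
     * that many, retry with fewer", so with the patch above it steps down
     * from the 4 MSIs the controller advertises to a single one */
    do {
            nvec = ret;
            ret = pci_enable_msi_block(dev, nvec);
    } while (ret > 0);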


> We can implement a better version of this in the future to deal with
> multiple MSIs, but let's make sure to first get it booting.
>> --
>> Sander
>>
>>
>>
>>
>>>> --
>>>> Sander
>>>>
>>>>> Jan
>>>>
>>>>
>>>>
>>
>>




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel