
Re: [Xen-devel] linux-3.9-rc0 regression from 3.8 SATA controller not detected under xen



Thursday, February 28, 2013, 2:52:20 PM, you wrote:

> On Thu, Feb 28, 2013 at 12:57:24AM +0100, Sander Eikelenboom wrote:
>> 
>> Wednesday, February 27, 2013, 11:22:18 PM, you wrote:
>> 
>> 
>> > On 2/27/2013 3:41 PM, Sander Eikelenboom wrote:
>> >> Wednesday, February 27, 2013, 8:28:10 PM, you wrote:
>> >>
>> >>> On Wed, Feb 27, 2013 at 06:50:59PM +0100, Sander Eikelenboom wrote:
>> >>>> Wednesday, February 27, 2013, 1:54:31 PM, you wrote:
>> >>>>
>> >>>>>>>> On 27.02.13 at 12:46, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> wrote:
>> >>>>>>    [   89.338827] ahci: probe of 0000:00:11.0 failed with error -22
>> >>>>> Which is -EINVAL. With nothing else printed, I'm afraid you need to
>> >>>>> find the origin of this return value by instrumenting the involved
>> >>>>> call tree.
>> >>>> Just wondering, are multiple MSIs per device actually supported by Xen?
>> >>> That is a very good question. I know we support MSI-X b/c 1Gb or 10Gb NICs
>> >>> use them and they work great with Xen.
>> >>> BTW, this is the merge:
>> >>> commit 5800700f66678ea5c85e7d62b138416070bf7f60
>> >>> Merge: 266d7ad af8d102
>> >>> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> >>> Date:   Tue Feb 19 19:07:27 2013 -0800
>> >>>      Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> >>>      
>> >>>      Pull x86/apic changes from Ingo Molnar:
>> >>>       "Main changes:
>> >>>      
>> >>>         - Multiple MSI support added to the APIC, PCI and AHCI code - acked
>> >>>           by all relevant maintainers, by Alexander Gordeev.
>> >>>      
>> >>>           The advantage is that multiple AHCI ports can have multiple MSI
>> >>>           irqs assigned, and can thus spread to multiple CPUs.
>> >>>      
>> >>>           [ Drivers can make use of this new facility via the
>> >>>             pci_enable_msi_block_auto() method ]
>> >>
>> >>
>> >>> With MSI per device, the hypercall that ends up happening is:
>> >>> PHYSDEVOP_map_pirq with:
>> >>>     map_irq.domid = domid;
>> >>>     map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;
>> >>>     map_irq.index = -1;
>> >>>     map_irq.pirq = -1;
>> >>>     map_irq.bus = dev->bus->number |
>> >>>                   (pci_domain_nr(dev->bus) << 16);
>> >>>     map_irq.devfn = dev->devfn;
>> >>> Which would imply that we are doing this call multiple times?
>> >>> (This is xen_initdom_setup_msi_irqs).
>> >>> It looks like pci_enable_msi_block_auto is the multiple MSI one
>> >>> and it should percolate down to xen_initdom_setup_msi_irqs.
>> >>> Granted, xen_initdom_setup_msi_irqs does not do anything with the
>> >>> 'nvec' argument.
>> >>> So could I ask you to try out your hunch by doing three things:
>> >>>   1). Instrument xen_initdom_setup_msi_irqs to see if
>> >>>       nvec is anything but 1, and instrument its loop to
>> >>>       see if it has more than one MSI attribute.
>> >>>   2). The ahci driver has ahci_init_interrupts, which only does
>> >>>       the multiple MSI thing if AHCI_HFLAG_NO_MSI is not set.
>> >>>       Edit drivers/ata/ahci ahci_port_info for the SB600 (or 700?)
>> >>>       to have the AHCI_HFLAG_NO_MSI flag (you probably want to do
>> >>>       this separately from step 1).
>> >>>   3). Check out the tree from before merge
>> >>>       5800700f66678ea5c85e7d62b138416070bf7f60
>> >>>       and try 266d7ad7f4fe2f44b91561f5b812115c1b3018ab?
>> >>
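
For reference, the map_irq.bus and map_irq.devfn fields quoted above pack
several PCI coordinates into single integers. Below is a minimal standalone
sketch of that packing. The helper names (pack_seg_bus, unpack_segment,
unpack_bus, make_devfn) are made up for illustration and are not kernel or
Xen functions; only the shift-by-16 layout and the usual slot/function
devfn encoding are taken from the code quoted above and from PCI convention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors "bus->number | (pci_domain_nr(bus) << 16)". */
static uint32_t pack_seg_bus(uint16_t segment, uint8_t bus)
{
    return (uint32_t)bus | ((uint32_t)segment << 16);
}

/* Hypothetical helpers: recover the two fields from the packed value. */
static uint16_t unpack_segment(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

static uint8_t unpack_bus(uint32_t packed)
{
    return (uint8_t)(packed & 0xff);
}

/* Hypothetical helper: conventional PCI devfn encoding,
 * 5 bits of slot followed by 3 bits of function. */
static uint8_t make_devfn(uint8_t slot, uint8_t func)
{
    assert(slot < 32 && func < 8);
    return (uint8_t)((slot << 3) | func);
}
```

For the controller in this thread, 0000:00:11.0, segment and bus are both 0,
so the packed bus value is 0 and only devfn distinguishes the device.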
>> >> So of interest are commits:
>> >> - 5ca72c4f7c412c2002363218901eba5516c476b1
>> >> - 08261d87f7d1b6253ab3223756625a5c74532293
>> >> - 51906e779f2b13b38f8153774c4c7163d412ffd9
>> >>
>> >> Hmmm reading the commit message of 51906e779f2b13b38f8153774c4c7163d412ffd9:
>> >>
>> >> x86/MSI: Support multiple MSIs in presense of IRQ remapping
>> >>
>> >> The MSI specification has several constraints in comparison with
>> >> MSI-X, most notable of them is the inability to configure MSIs
>> >> independently. As a result, it is impossible to dispatch
>> >> interrupts from different queues to different CPUs. This is
>> >> largely devalues the support of multiple MSIs in SMP systems.
>> >>
>> >> Also, a necessity to allocate a contiguous block of vector
>> >> numbers for devices capable of multiple MSIs might cause a
>> >> considerable pressure on x86 interrupt vector allocator and
>> >> could lead to fragmentation of the interrupt vectors space.
>> >>
>> >> This patch overcomes both drawbacks in presense of IRQ remapping
>> >> and lets devices take advantage of multiple queues and per-IRQ
>> >> affinity assignments.
>> >>
>> >> At least that makes clear why baremetal boots and Xen doesn't:
>> >>
>> >> Baremetal behaves differently, and thus boots, because interrupt remapping 
>> >> gets disabled on boot by the kernel iommu code due to the buggy bios 
>> >> iommu errata, so according to the commit message above it doesn't even 
>> >> try the multiple MSI per device scenario.
>> >>
>> >> So the question is whether it can be enabled in Xen (and whether it would 
>> >> actually be beneficial, since the commit message seems to indicate that 
>> >> is questionable).
>> >> If not, the check in arch/x86/kernel/apic/io_apic.c:setup_msi_irqs should 
>> >> fail.
>> > Except that function is not run under Xen. That is b/c 
>> > x86_msi_ops.setup_msi_irqs ends up pointing to xen_initdom_setup_msi_irqs, 
>> > while if the IOMMU is enabled it gets set to irq_remapping_setup_msi_irqs.
>> 
>> > So a fix like this:
>> > diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
>> > index 56ab749..47f8cca 100644
>> > --- a/arch/x86/pci/xen.c
>> > +++ b/arch/x86/pci/xen.c
>> > @@ -263,6 +263,9 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>> >          int ret = 0;
>> >          struct msi_desc *msidesc;
>> > 
>> > +       if (type == PCI_CAP_ID_MSI && nvec > 1)
>> > +               return 1;
>> > +
>> >          list_for_each_entry(msidesc, &dev->msi_list, list) {
>> >                  struct physdev_map_pirq map_irq;
>> >                  domid_t domid;
>> 
>> 
>> > (sorry about the paste getting messed up here) - ought to do it? For 
>> > example, one of my AMD machines has no IOMMU, and there AHCI 
>> > works under baremetal but not under Xen.
>> 
>> Yes it boots again :-)

> Great! Are you OK if I put a 'Reported-and-Tested-by:' tag on the patch with 
> your name for the above quick fix?

Sure !

> Thanks!
>> 
>> [   37.742109] SE | bus: 'pci': really_probe: probing driver ahci with device 0000:00:11.0
>> [   37.773491] really_probe: pinctrl_bind_pins(0000:00:11.0) ret: 0
>> [   37.798862] ahci 0000:00:11.0: SE | ahci_init_one start
>> [   37.822040] ahci 0000:00:11.0: version 3.0
>> [   37.841606] xen: registering gsi 19 triggering 0 polarity 1
>> [   37.865577] xen: --> pirq=19 -> irq=19 (gsi=19)
>> [   37.913087] ahci 0000:00:11.0: SE | pcim_enable_device(pdev) rc:0
>> [   37.938519] ahci 0000:00:11.0: SE pcim_iomap_regions_request_all(pdev, 1 << ahci_pci_bar, DRV_NAME)  rc:0
>> [   37.974447] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 4 type:5
>> [   38.001806] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 1 type:5
>> [   38.029026] ahci 0000:00:11.0: SE pci_enable_msi_block_auto(pdev, &maxvec) rc:1
>> [   38.057960] ahci 0000:00:11.0: SE | n_msis: 1
>> [   38.078065] ahci 0000:00:11.0: SE | ahci_configure_dma_masks(pdev, hpriv->cap & HOST_CAP_64)  rc:0
>> [   38.112045] ahci 0000:00:11.0: SE | ahci_pci_reset_controller(host)  rc:0
>> [   38.139426] ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf impl SATA mode
>> [   38.170664] ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
>> [   38.201684] ahci 0000:00:11.0: SE | me here 1
>> [   38.221977] ahci 0000:00:11.0: SE | me here 2
>> [   38.244756] scsi0 : ahci
>> [   38.259700] scsi1 : ahci
>> [   38.274411] scsi2 : ahci
>> [   38.289278] scsi3 : ahci
>> [   38.303718] ata1: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff100 irq 121
>> [   38.332566] ata2: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff180 irq 121
>> [   38.361366] ata3: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff200 irq 121
>> [   38.390080] ata4: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff280 irq 121
>> [   38.418787] really_probe: dev->bus->probe(0000:00:11.0) ret: 0
>> [   38.442420] really_probe: 0000:00:11.0 done ret: 1
>> 
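
The two xen_initdom_setup_msi_irqs lines in the log (nvec: 4, then nvec: 1)
look like the caller retrying after the patched function returned 1. Here is
a rough userspace model of that negotiation, only to illustrate the idea. It
is not the kernel code: model_setup_msi_irqs and model_enable_msi_auto are
made-up names, and the assumption that a positive return means "retry with
at most this many vectors" is inferred from the log and the patch above. The
capability IDs 0x05 (MSI, matching "type:5" in the log) and 0x11 (MSI-X) are
the standard PCI values.

```c
#include <assert.h>

#define CAP_ID_MSI  0x05  /* same value as PCI_CAP_ID_MSI */
#define CAP_ID_MSIX 0x11  /* same value as PCI_CAP_ID_MSIX */

/* Hypothetical stand-in for the patched setup hook:
 * 0 means success, >0 means "retry with at most this many vectors". */
static int model_setup_msi_irqs(int nvec, int type)
{
    assert(nvec >= 1);
    if (type == CAP_ID_MSI && nvec > 1)
        return 1;   /* the quick fix: refuse multi-vector MSI */
    return 0;
}

/* Hypothetical stand-in for the caller's retry loop: start at the
 * device's maximum and shrink until the hook accepts the request.
 * Returns the number of vectors enabled, or a negative error. */
static int model_enable_msi_auto(int maxvec, int type)
{
    int nvec = maxvec;
    int ret;

    do {
        ret = model_setup_msi_irqs(nvec, type);
        if (ret < 0)
            return ret;     /* hard failure, give up */
        if (ret > 0)
            nvec = ret;     /* shrink the request and try again */
    } while (ret > 0);

    return nvec;
}
```

With maxvec 4 and type CAP_ID_MSI the model calls the hook twice (4, then 1)
and ends up with a single vector, which matches "rc:1" and "n_msis: 1" in
the log above.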
>> 
>> > In the future we can implement a better version of this that deals with 
>> > multiple MSIs, but let's make sure to first get it booting.
>> >> --
>> >> Sander
>> >>
>> >>
>> >>
>> >>
>> >>>> --
>> >>>> Sander
>> >>>>
>> >>>>> Jan
>> >>>>
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> Xen-devel mailing list
>> >>>> Xen-devel@xxxxxxxxxxxxx
>> >>>> http://lists.xen.org/xen-devel
>> >>>>
>> >>
>> >>
