
Re: [Xen-devel] linux-3.9-rc0 regression from 3.8 SATA controller not detected under xen



On Thu, Feb 28, 2013 at 12:57:24AM +0100, Sander Eikelenboom wrote:
> 
> Wednesday, February 27, 2013, 11:22:18 PM, you wrote:
> 
> 
> > On 2/27/2013 3:41 PM, Sander Eikelenboom wrote:
> >> Wednesday, February 27, 2013, 8:28:10 PM, you wrote:
> >>
> >>> On Wed, Feb 27, 2013 at 06:50:59PM +0100, Sander Eikelenboom wrote:
> >>>> Wednesday, February 27, 2013, 1:54:31 PM, you wrote:
> >>>>
> >>>>>>>> On 27.02.13 at 12:46, Sander Eikelenboom <linux@xxxxxxxxxxxxxx> 
> >>>>>>>> wrote:
> >>>>>>    [   89.338827] ahci: probe of 0000:00:11.0 failed with error -22
> >>>>> Which is -EINVAL. With nothing else printed, I'm afraid you need to
> >>>>> find the origin of this return value by instrumenting the involved
> >>>>> call tree.
> >>>> Just wondering, are multiple MSIs per device actually supported by Xen?
> >>> That is a very good question. I know we support MSI-X b/c 1Gb or 10Gb NICs
> >>> use them and they work great with Xen.
> >>> BTW, this is the merge:
> >>> commit 5800700f66678ea5c85e7d62b138416070bf7f60
> >>> Merge: 266d7ad af8d102
> >>> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> >>> Date:   Tue Feb 19 19:07:27 2013 -0800
> >>>      Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> >>>      
> >>>      Pull x86/apic changes from Ingo Molnar:
> >>>       "Main changes:
> >>>      
> >>>         - Multiple MSI support added to the APIC, PCI and AHCI code -
> >>>           acked by all relevant maintainers, by Alexander Gordeev.
> >>>      
> >>>           The advantage is that multiple AHCI ports can have multiple MSI
> >>>           irqs assigned, and can thus spread to multiple CPUs.
> >>>      
> >>>           [ Drivers can make use of this new facility via the
> >>>             pci_enable_msi_block_auto() method ]
> >>
> >>
> >>> With MSI per device, the hypercall that ends up happening is:
> >>> PHYSDEVOP_map_pirq with:
> >>>     map_irq.domid = domid;
> >>>     map_irq.type = MAP_PIRQ_TYPE_MSI_SEG;
> >>>     map_irq.index = -1;
> >>>     map_irq.pirq = -1;
> >>>     map_irq.bus = dev->bus->number |
> >>>                   (pci_domain_nr(dev->bus) << 16);
> >>>     map_irq.devfn = dev->devfn;
> >>> Which would imply that we are doing this call multiple times?
> >>> (This is xen_initdom_setup_msi_irqs).
> >>> It looks like pci_enable_msi_block_auto is the multiple MSI one
> >>> and it should percolate down to xen_initdom_setup_msi_irqs.
> >>> Granted, xen_initdom_setup_msi_irqs does not do anything with the
> >>> 'nvec' argument.
> >>> So could I ask you to try out your hunch by doing three things:
> >>>   1). Instrument xen_initdom_setup_msi_irqs to see if
> >>>       nvec is anything but 1, and instrument its loop to
> >>>       see if it has more than one MSI attribute?
> >>>   2). The ahci driver has ahci_init_interrupts which only does
> >>>      the multiple MSI thing if AHCI_HFLAG_NO_MSI is not set.
> >>>      Try editing drivers/ata/ahci ahci_port_info for the SB600 (or 700?)
> >>>      to have the AHCI_HFLAG_NO_MSI flag (you probably want to do this
> >>>      separately from 1).
> >>>   3). Checkout before merge 5800700f66678ea5c85e7d62b138416070bf7f60
> >>>      and try 266d7ad7f4fe2f44b91561f5b812115c1b3018ab?
> >>
> >> So of interest are commits:
> >> - 5ca72c4f7c412c2002363218901eba5516c476b1
> >> - 08261d87f7d1b6253ab3223756625a5c74532293
> >> - 51906e779f2b13b38f8153774c4c7163d412ffd9
> >>
> >> Hmmm reading the commit message of 
> >> 51906e779f2b13b38f8153774c4c7163d412ffd9:
> >>
> >> x86/MSI: Support multiple MSIs in presense of IRQ remapping
> >>
> >> The MSI specification has several constraints in comparison with
> >> MSI-X, most notable of them is the inability to configure MSIs
> >> independently. As a result, it is impossible to dispatch
> >> interrupts from different queues to different CPUs. This is
> >> largely devalues the support of multiple MSIs in SMP systems.
> >>
> >> Also, a necessity to allocate a contiguous block of vector
> >> numbers for devices capable of multiple MSIs might cause a
> >> considerable pressure on x86 interrupt vector allocator and
> >> could lead to fragmentation of the interrupt vectors space.
> >>
> >> This patch overcomes both drawbacks in presense of IRQ remapping
> >> and lets devices take advantage of multiple queues and per-IRQ
> >> affinity assignments.
> >>
> >> At least this makes clear why baremetal does boot and Xen doesn't:
> >>
> >> Baremetal behaves differently and thus boots because interrupt remapping 
> >> gets disabled on boot by the kernel iommu code due to the buggy bios iommu 
> >> errata, so according to the commit message above it doesn't even try the 
> >> multiple MSI per device scenario.
> >>
> >> So the question is if it can be enabled in Xen (and if it actually could 
> >> be beneficial, because the commit message seems to indicate that could be 
> >> questionable).
> >> If not, the check in arch/x86/kernel/apic/io_apic.c:setup_msi_irqs should 
> >> fail.
> > Except that function is not run under Xen. That is b/c 
> > x86_msi_ops.setup_msi_irqs ends up pointing to xen_initdom_setup_msi_irqs, 
> > while if the IOMMU is enabled it gets set to irq_remapping_setup_msi_irqs.
> 
> > So a fix like this:
> > diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> > index 56ab749..47f8cca 100644
> > --- a/arch/x86/pci/xen.c
> > +++ b/arch/x86/pci/xen.c
> > @@ -263,6 +263,9 @@ static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> >          int ret = 0;
> >          struct msi_desc *msidesc;
> 
> > +       if (type == PCI_CAP_ID_MSI && nvec > 1)
> > +               return 1;
> > +
> >          list_for_each_entry(msidesc, &dev->msi_list, list) {
> >                  struct physdev_map_pirq map_irq;
> >                  domid_t domid;
> 
> 
> > (sorry about the paste getting messed up here) - ought to do it? As for 
> > example on one of my AMD machines there is no IOMMU, and this is where 
> > AHCI does work under baremetal but not under Xen.
> 
> Yes it boots again :-)

Great! Are you OK if I put a 'Reported-and-Tested-by:' tag on the patch with your
name for the above quick fix?

Thanks!
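
For anyone reading along and wondering why returning 1 is enough: as far as
I understand the 3.9-era MSI code, a positive return from the setup hook is
treated as "retry with this many vectors", and pci_enable_msi_block_auto()
keeps retrying until it gets 0 or an error. That matches the quoted log,
where the hook is called with nvec: 4 and then nvec: 1, and the driver ends
up with n_msis: 1. Below is a minimal user-space sketch of that retry
behaviour. The sketch_* names and the setup_calls counter are hypothetical
and only model the control flow; this is not the kernel code.

```c
#include <assert.h>

/* PCI capability IDs; "type:5" in the log is plain MSI. */
#define CAP_ID_MSI  0x05
#define CAP_ID_MSIX 0x11

/* Hypothetical counter: how many times the setup hook was called. */
static int setup_calls;

/*
 * Hypothetical stand-in for xen_initdom_setup_msi_irqs() with the quick
 * fix applied. Returns 0 on success, or a positive number meaning
 * "retry with this many vectors".
 */
static int sketch_setup_msi_irqs(int nvec, int type)
{
	assert(nvec >= 1);
	setup_calls++;

	/* The quick fix: refuse multiple plain-MSI vectors, offer one. */
	if (type == CAP_ID_MSI && nvec > 1)
		return 1;

	return 0;
}

/*
 * Hypothetical model of the caller's retry loop (an assumption about how
 * pci_enable_msi_block_auto() behaves). Returns the number of vectors
 * finally enabled, or a negative error.
 */
static int sketch_enable_msi_block_auto(int maxvec, int type)
{
	int nvec;
	int ret = maxvec;

	do {
		nvec = ret;
		ret = sketch_setup_msi_irqs(nvec, type);
	} while (ret > 0);

	if (ret < 0)
		return ret;

	return nvec;
}
```

Calling sketch_enable_msi_block_auto(4, CAP_ID_MSI) invokes the hook twice
(first with 4, then with 1) and settles on a single vector, while MSI-X
requests pass through the check untouched.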
> 
> [   37.742109] SE | bus: 'pci': really_probe: probing driver ahci with device 
> 0000:00:11.0
> [   37.773491] really_probe: pinctrl_bind_pins(0000:00:11.0) ret: 0
> [   37.798862] ahci 0000:00:11.0: SE | ahci_init_one start
> [   37.822040] ahci 0000:00:11.0: version 3.0
> [   37.841606] xen: registering gsi 19 triggering 0 polarity 1
> [   37.865577] xen: --> pirq=19 -> irq=19 (gsi=19)
> [   37.913087] ahci 0000:00:11.0: SE | pcim_enable_device(pdev) rc:0
> [   37.938519] ahci 0000:00:11.0: SE pcim_iomap_regions_request_all(pdev, 1 
> << ahci_pci_bar, DRV_NAME)  rc:0
> [   37.974447] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 4 type:5
> [   38.001806] ahci 0000:00:11.0: xen_initdom_setup_msi_irqs nvec: 1 type:5
> [   38.029026] ahci 0000:00:11.0: SE pci_enable_msi_block_auto(pdev, &maxvec) 
> rc:1
> [   38.057960] ahci 0000:00:11.0: SE | n_msis: 1
> [   38.078065] ahci 0000:00:11.0: SE | ahci_configure_dma_masks(pdev, 
> hpriv->cap & HOST_CAP_64)  rc:0
> [   38.112045] ahci 0000:00:11.0: SE | ahci_pci_reset_controller(host)  rc:0
> [   38.139426] ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf 
> impl SATA mode
> [   38.170664] ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp 
> pio slum part
> [   38.201684] ahci 0000:00:11.0: SE | me here 1
> [   38.221977] ahci 0000:00:11.0: SE | me here 2
> [   38.244756] scsi0 : ahci
> [   38.259700] scsi1 : ahci
> [   38.274411] scsi2 : ahci
> [   38.289278] scsi3 : ahci
> [   38.303718] ata1: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff100 
> irq 121
> [   38.332566] ata2: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff180 
> irq 121
> [   38.361366] ata3: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff200 
> irq 121
> [   38.390080] ata4: SATA max UDMA/133 abar m1024@0xf96ff000 port 0xf96ff280 
> irq 121
> [   38.418787] really_probe: dev->bus->probe(0000:00:11.0) ret: 0
> [   38.442420] really_probe: 0000:00:11.0 done ret: 1
> 
> 
> > In the future we can implement a better version of this to deal with 
> > multiple MSIs, but let's make sure to get it booting first.
> >> --
> >> Sander
> >>
> >>
> >>
> >>
> >>>> --
> >>>> Sander
> >>>>
> >>>>> Jan
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Xen-devel mailing list
> >>>> Xen-devel@xxxxxxxxxxxxx
> >>>> http://lists.xen.org/xen-devel
> >>>>
> >>
> >>
> 
> 
> 
> 
> 
