
Re: [Xen-devel] MSI badness in xen-unstable


  • To: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
  • From: Bruce Edge <bruce.edge@xxxxxxxxx>
  • Date: Sat, 16 Oct 2010 10:14:11 -0700
  • Cc: Xen Devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <Keir.Fraser@xxxxxxxxxxxxx>, Gianni Tedesco <gianni.tedesco@xxxxxxxxxx>, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
  • Delivery-date: Sat, 16 Oct 2010 10:15:17 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Sat, Oct 16, 2010 at 9:29 AM, Sander Eikelenboom
<linux@xxxxxxxxxxxxxx> wrote:
> Hi Bruce,
>
> I tripped over the same warning trying to solve my freezes.
> Jan Beulich has posted a patch which is not in xen-unstable yet:
> [Xen-devel] [PATCH] x86/msi: fix inverted masks in c/s 22182:68cc3c514a0a
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>
>
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -549,14 +549,14 @@ static u64 read_pci_mem_bar(u8 bus, u8 s
>         return 0;
>     if ( (addr & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64 )
>     {
> -        addr &= ~PCI_BASE_ADDRESS_MEM_MASK;
> +        addr &= PCI_BASE_ADDRESS_MEM_MASK;
>         if ( ++bir >= limit )
>             return 0;
>         return addr |
>                ((u64)pci_conf_read32(bus, slot, func,
>                                      PCI_BASE_ADDRESS_0 + bir * 4) << 32);
>     }
> -    return addr & ~PCI_BASE_ADDRESS_MEM_MASK;
> +    return addr & PCI_BASE_ADDRESS_MEM_MASK;
>  }
>
>  /**
>
>
>
> That fixes the warn, but my machine still keeps freezing nonetheless.
> (It also freezes with pci=nomsi, so it's not MSI-specific in my case.)
>
> --
>
> Sander

Hi Sander,

Thank you.  I tried it against 4.1.0-22240 with no effect.
I confirmed I had the right patch:

0 %> hg diff  xen/arch/x86/msi.c

diff -r 38ad3633ecaf xen/arch/x86/msi.c
--- a/xen/arch/x86/msi.c        Wed Oct 13 12:01:30 2010 +0100
+++ b/xen/arch/x86/msi.c        Sat Oct 16 10:12:31 2010 -0700
@@ -549,14 +549,14 @@
         return 0;
     if ( (addr & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64 )
     {
-        addr &= ~PCI_BASE_ADDRESS_MEM_MASK;
+        addr &= PCI_BASE_ADDRESS_MEM_MASK;
         if ( ++bir >= limit )
             return 0;
         return addr |
                ((u64)pci_conf_read32(bus, slot, func,
                                      PCI_BASE_ADDRESS_0 + bir * 4) << 32);
     }
-    return addr & ~PCI_BASE_ADDRESS_MEM_MASK;
+    return addr & PCI_BASE_ADDRESS_MEM_MASK;
 }

 /**

The boot-time MSI WARN messages were unchanged.

-Bruce

>
> Saturday, October 16, 2010, 6:14:17 PM, you wrote:
>
>> On Mon, Oct 11, 2010 at 2:05 PM, Bruce Edge <bruce.edge@xxxxxxxxx> wrote:
>>> On Mon, Oct 11, 2010 at 10:12 AM, Gianni Tedesco
>>> <gianni.tedesco@xxxxxxxxxx> wrote:
>>>> On Fri, 2010-10-08 at 10:33 +0100, Gianni Tedesco wrote:
>>>>> Hi,
>>>>>
>>>>> I've been trying to boot stefano's minimal dom0 kernel from
>>>>> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
>>>>> 2.6.36-rc1-initial-domain-v2+pat
>>>>>
>>>>> On xen-unstable, I get the following WARN_ON()s from Xen when bringing
>>>>> up the NICs, then the machine hangs forever when trying to log in either
>>>>> over serial or NIC.
>>>>>
>>>>> (XEN) Xen WARN at msi.c:649
>>>
>>> I get the same Xen WARN messages using the current pvops/xen-next with
>>> xen-unstable, here's the complete list for one boot, grep'd for WARN:
>>>
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:656
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:656
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:656
>>> (XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN)    0000000080287db8 0(XEN) Xen WARN at msi.c:636
>>> (XEN) Xen WARN at msi.c:649
>>> (XEN) Xen WARN at msi.c:656
>>>
>>> The complete boot seq is attached.
>>>
>>> I do get a login at the end of the boot seq though.
>>> My situation goes pear shaped when I try to start a pv domU. The dom0
>>> locks up after printing this on the console:
>>>
>>> (XEN) tmem: all pools frozen for all domains
>>> (XEN) tmem: all pools thawed for all domains
>>> (XEN) tmem: all pools frozen for all domains
>>> (XEN) tmem: all pools thawed for all domains
>>> mapping kernel into physical memory
>>> about to get started...
>>>
>>> then prints these once a minute:
>>> [  589.490894] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
>>>
>>> The xen console is still active and I can generate a diag dump, also attached.
>>>
>>> This dom0 lockup behavior started with pv-ops 2.6.32.21, all the way
>>> to .24, rendering the later pvops kernels unusable for dom0.
>>> The 2.6.32.18 kernel is the last one that functioned as a dom0.
>>>
>>> This behavior is consistent across platforms: HP ProLiant 380DL G6 and
>>> G7, as well as i7 Supermicros.
>>>
>>> -Bruce
>>>
>>>>
>>>> Hmm so this appears not to be an issue with XCP kernel, in that case I
>>>> get the warnings but everything still works fine.
>>>>
>>>> I will investigate further when I have some time.
>>>>
>>>> Gianni
>>>>
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.xensource.com/xen-devel
>>>>
>>>
>
>> The latest xen-unstable, 22240 has the same "  (XEN) Xen WARN at
>> msi.c:636 " messages with associated stack traces.
>
>> I spent a little more time working with this version, and except for
>> these disconcerting messages, which do look like they are initiated by
>> the ethernet card discovery, the system appears functional.
>> In all cases the first occurrence is immediately after the NIC discovery:
>
>>  e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
>> | e1000e: Copyright (c) 1999-2008 Intel Corporation.
>> | xen: registering gsi 16 triggering 0 polarity 1
>> | xen_allocate_pirq: returning irq 16 for gsi 16
>>   xen: --> irq=16
>>   Already setup the GSI :16
>>   e1000e 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
>>   e1000e 0000:06:00.0: setting latency timer to 64
>>     alloc irq_desc for 493 on node 0
>>     alloc kstat_irqs on node 0
>>   (XEN) Xen WARN at msi.c:636
>>   (XEN) ----[ Xen-4.1-unstable  x86_64  debug=y  Not tainted ]----
>> ....
>
>> In case it's a NIC specific issue, I'm seeing it with both
>>     06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
>> and
>>     02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>> NICs
>
>> -Bruce
>
>
>
>
>
> --
> Best regards,
>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

