
[Xen-devel] RE: [PATCH] Fix xen hang on intel westmere-EP



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> Sent: Wednesday, August 24, 2011 3:28 PM
> >>> On 24.08.11 at 04:16, "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >> Sent: Tuesday, August 23, 2011 4:02 PM
> >> >> Also, are you certain this problem exists only with this single
> >> >> ICH variant?
> >> > So far we have only observed it on ICH10-based chipsets. Have you
> >> > seen the problem on another platform?
> >>
> >> No, I haven't observed it on other platforms but
> >> - the problematic bits exist in earlier ICH versions too, and
> >> - there are ICH10 based systems that aren't showing any such bad
> >>   behavior.
> >
> > It depends on the BIOS. Some already disable the legacy USB emulation
> > and obviously you cannot see this problem on those platforms.
> 
> With the help of the debug printk()-s in the patch I was able to see that
> on at least one unaffected platform, at least one of the USB emulation
> bits was set nevertheless.
> 
> >>
> >> The keys I'm currently using (successfully tested on customer's and
> >> my affected machines) are BIOS and board vendor being "Intel". Only on
> >> those I'm then looking for the ICH10 (and then all known variants).
> >> See below for the patch (still containing some debugging bits).
> >>
> >> > Actually, the best way to solve it is to enable ACPI mode in Xen
> >> > instead of in dom0. To enable ACPI, we need to write the value of
> >> > FADT.ACPI_ENABLE to SMI_CMD. After that write, the ACPI hardware
> >> > gives up SMI ownership and also disables some of the logic that is
> >> > able to cause SMIs. For example, the legacy USB circuit will be
> >> > masked too, because at that point there is no longer any need for
> >> > legacy USB emulation. This is also what upstream Linux does. But I
> >> > think it is too complicated to port this logic to Xen. Anyway, if
> >> > you are interested, you can add this logic to Xen, and then there
> >> > is no need for this patch any more.
> >>
> >> Was this a pretty recent change in Linux? Otherwise, how do you explain
> >> that, with apparently lower probability, I'm observing the very same
> >> hang with native Linux? My assumption is that ACPI mode enabling is
> >> happening just too late for masking this problem.
> >
> > Our BIOS team and I also tried native Linux, but we could not
> > reproduce it. So this may be a different issue from the one we are
> > discussing now.
> 
> Pretty unlikely. Remember that for months you (as a company) claimed not to
> be able to reproduce the issue on Xen. As it's even more sporadic on native
> Linux, I'm not really surprised there's a problem reproducing it there.

To check whether native Linux has the same problem, you can disable SMIs
and see whether the hang still happens.
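
As a rough sketch of what I mean (not tested on hardware): I am assuming
here that GBL_SMI_EN is bit 0 of the same SMI_EN register the patch below
reads at ACPI base + 0x30. Please check that against the ICH10 datasheet,
and note the BIOS may have locked the bit, in which case the write has no
effect. The helper name is made up for illustration.

```c
#include <stdint.h>

/* ASSUMPTION: global SMI enable is bit 0 of SMI_EN (ACPI base + 0x30). */
#define GBL_SMI_EN 0x00000001u

/*
 * Hypothetical helper: given the current SMI_EN value, return the value
 * to write back so that SMI generation is globally disabled. All other
 * bits are left as they were.
 */
static uint32_t smi_en_global_off(uint32_t smi_en)
{
    return smi_en & ~GBL_SMI_EN;
}
```

The result would be written back with outl() to the same port, just like
the patch does for the legacy USB bits. This is for a debugging experiment
only, not something to leave in a production kernel.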

> >> --- a/xen/arch/x86/dmi_scan.c
> >> +++ b/xen/arch/x86/dmi_scan.c
> >> @@ -10,6 +10,8 @@
> >>  #include <asm/system.h>
> >>  #include <xen/dmi.h>
> >>  #include <xen/efi.h>
> >> +#include <xen/pci.h>
> >> +#include <xen/pci_regs.h>
> >>
> >>  #define bt_ioremap(b,l)  ((void *)__acpi_map_table(b,l))
> >>  #define bt_iounmap(b,l)  ((void)0)
> >> @@ -278,6 +280,31 @@ static __init int broken_toshiba_keyboar
> >>    return 0;
> >>  }
> >>
> >> +static int __init ich10_bios_quirk(struct dmi_system_id *d) {
> >> +    u32 port, smictl;
> >> +
> >> +    if ( pci_conf_read16(0, 0x1f, 0, PCI_VENDOR_ID) != 0x8086 )
> >> +        return 0;
> >> +
> >> +    switch ( pci_conf_read16(0, 0x1f, 0, PCI_DEVICE_ID) ) {
> >> +    case 0x3a14:
> >> +    case 0x3a16:
> >> +    case 0x3a18:
> >> +    case 0x3a1a:
> >> +printk("ACPI base=%04x\n", port = pci_conf_read16(0, 0x1f, 0, 0x40));//temp
> >> +        port = (port & 0xff80) + 0x30;
> >> +        smictl = inl(port);
> >> +printk("smictl=%08x\n", smictl);//temp
> >> +        /* turn off LEGACY_USB{,2}_EN if enabled */
> >> +        if ( smictl & 0x20008 )
> >> +            outl(smictl & ~0x20008, port);
> >> +printk("smictl:%08x\n", inl(port));//temp
> >> +        break;
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >>
> >>  #ifdef CONFIG_ACPI_SLEEP
> >>  static __init int reset_videomode_after_s3(struct dmi_blacklist *d)
> >> @@ -342,6 +369,18 @@ static __initdata struct dmi_blacklist d
> >>                    } },
> >>  #endif
> >>
> >> +  { ich10_bios_quirk, "Intel board & BIOS",
> >> +          /*
> >> +           * BIOS leaves legacy USB emulation enabled while
> >> +           * SMM can't properly handle it.
> >> +           */
> >> +          {
> >> +                  MATCH(DMI_BOARD_VENDOR, "Intel Corp"),
> >> +                  MATCH(DMI_BIOS_VENDOR, "Intel Corp"),
> >> +                  NO_MATCH, NO_MATCH
> >> +          }
> >> +  },
> >> +
> >>  #ifdef    CONFIG_ACPI_BOOT
> >>    /*
> >>     * If your system is blacklisted here, but you find that acpi=force
> >
> > This patch is better. But do we really need to disable it on other
> > ICH variants? If you have seen this issue with other ICHs, then go
> > ahead. Otherwise it may cause problems. Have you tested your patch on
> > those platforms?
> 
> Obviously not, given the very limited set of systems on which we observe
> the problem. But I really just listed the various ICH10 PCI IDs
> - are you saying that the problematic BIOSes are reasonably certain to
> have got used with only one of them? After all, not affecting other
> platforms is what is being aimed at with the DMI match strings.

I am not sure whether other BIOSes have the same issue. But it is OK to turn
it off even on a good platform.
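
To make it easier to review what the quirk actually writes, here is the
bit arithmetic from the patch pulled out into plain C that can be checked
without touching the hardware. The constants come from the patch (mask
0x20008, register at ACPI base + 0x30); the function and macro names are
only made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* The two bits the patch clears together as 0x20008. */
#define LEGACY_USB_EN    0x00000008u
#define LEGACY_USB2_EN   0x00020000u
#define LEGACY_USB_MASK  (LEGACY_USB_EN | LEGACY_USB2_EN)

/*
 * I/O port of SMI_EN, derived from the 16-bit value read at config
 * offset 0x40 of device 0:1f.0, the same way the patch does it.
 */
static uint16_t smi_en_port(uint16_t acpi_base_reg)
{
    return (uint16_t)((acpi_base_reg & 0xff80u) + 0x30u);
}

/*
 * Returns 1 and stores the value to write back if either legacy USB
 * bit is set; returns 0 if nothing needs to be written.
 */
static int smi_en_legacy_usb_off(uint32_t smi_en, uint32_t *new_val)
{
    if ( !(smi_en & LEGACY_USB_MASK) )
        return 0;
    *new_val = smi_en & ~LEGACY_USB_MASK;
    return 1;
}
```

Since the write only happens when one of the two bits is set, a platform
whose BIOS already turned the emulation off sees no register write at all.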

best regards
yang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel