
Re: [Xen-devel] test report for Xen 4.3 RC1



> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> Sent: Wednesday, June 05, 2013 12:36 AM
> To: Ren, Yongjie
> Cc: george.dunlap@xxxxxxxxxxxxx; Xu, YongweiX; Liu, SongtaoX; Tian,
> Yongxue; xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] test report for Xen 4.3 RC1
> 
> On Tue, Jun 04, 2013 at 03:59:33PM +0000, Ren, Yongjie wrote:
> > Sorry for replying late. :-)
> >
> > > -----Original Message-----
> > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> > > Sent: Tuesday, May 28, 2013 11:16 PM
> > > To: Ren, Yongjie; george.dunlap@xxxxxxxxxxxxx
> > > Cc: xen-devel@xxxxxxxxxxxxx; Xu, YongweiX; Liu, SongtaoX; Tian, Yongxue
> > > Subject: Re: [Xen-devel] test report for Xen 4.3 RC1
> > >
> > > On Mon, May 27, 2013 at 03:49:27AM +0000, Ren, Yongjie wrote:
> > > > Hi All,
> > > > This is a report based on our testing of Xen 4.3.0 RC1 on Intel platforms.
> > > > (Sorry it's a little late. :-)  If the status changes, I'll send an
> > > > update later.)
> > >
> > > OK, I have some updates and ideas that can help narrow some of these
> > > issues down. Thank you for doing this.
> > >
> > > >
> > > > Test environment:
> > > > Xen: Xen 4.3 RC1 with qemu-upstream-unstable.git
> > > > Dom0: Linux kernel 3.9.3
> > >
> > > Could you please test v3.10-rc3? There have been some changes
> > > for VCPU hotplug added in v3.10 that I am not sure made it
> > > into v3.9.
> > I didn't re-try every bug with v3.10-rc3, but most of them still exist.
> >
> > > > Hardware: Intel Sandy Bridge, Ivy Bridge, Haswell systems
> > > >
> > > > Below are the features we tested.
> > > > - PV and HVM guest booting (HVM: Ubuntu, Fedora, RHEL, Windows)
> > > > - Save/Restore and live migration
> > > > - PCI device assignment and SR-IOV
> > > > - power management: C-state/P-state, Dom0 S3, HVM S3
> > > > - AVX and XSAVE instruction set
> > > > - MCE
> > > > - CPU online/offline for Dom0
> > > > - vCPU hot-plug
> > > > - Nested Virtualization  (Please look at my report in the following link.)
> > > >   http://lists.xen.org/archives/html/xen-devel/2013-05/msg01145.html
> > > >
> > > > New bugs (4): (some of which are not regressions)
> > > > 1. Sometimes failing to online a CPU in Dom0
> > > >    http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1851
> > >
> > > That looks like you are hitting the udev race.
> > >
> > > Could you verify that these patches:
> > > https://lkml.org/lkml/2013/5/13/520
> > >
> > > fix the issue? (They are destined for v3.11.)
> > >
> > Not tried yet. I'll get back to you later.
> 
> Thanks!
> >
We tested kernel 3.9.3 with the two patches you mentioned, and found that this
bug still exists. For example, we ran CPU online/offline in Dom0 100 times,
and 2 of the 100 iterations failed.
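(For reference, a minimal sketch of the sort of loop used for this test, via
the standard CPU hotplug sysfs interface; the CPU number is just an example:

  for i in $(seq 1 100); do
      echo 0 > /sys/devices/system/cpu/cpu1/online   # offline CPU 1
      echo 1 > /sys/devices/system/cpu/cpu1/online   # bring it back online
  done
)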

> > > > 2. Dom0 call trace when running an SR-IOV HVM guest with igbvf
> > > >    http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1852
> > > >    -- a regression in the Linux kernel (Dom0).
> > >
> > > Hm, the call trace you refer to:
> > >
> > > [   68.404440] Already setup the GSI :37
> > > [   68.405105] igb 0000:04:00.0: Enabling SR-IOV VFs using the module parameter is deprecated - please use the pci sysfs interface.
> > > [   68.506230] ------------[ cut here ]------------
> > > [   68.506265] WARNING: at /home/www/builds_xen_unstable/xen-src-27009-20130509/linux-2.6-pvops.git/fs/sysfs/dir.c:536 sysfs_add_one+0xcc/0xf0()
> > > [   68.506279] Hardware name: S2600CP
> > >
> > > is a deprecation warning. Did you use the 'pci sysfs' interface instead?
> > >
> > > Looking at da36b64736cf2552e7fb5109c0255d4af804f5e7
> > >     ixgbe: Implement PCI SR-IOV sysfs callback operation
> > > it says it is using this:
> > >
> > > commit 1789382a72a537447d65ea4131d8bcc1ad85ce7b
> > > Author: Donald Dutile <ddutile@xxxxxxxxxx>
> > > Date:   Mon Nov 5 15:20:36 2012 -0500
> > >
> > >     PCI: SRIOV control and status via sysfs
> > >
> > >     Provide files under sysfs to determine the maximum number of VFs
> > >     an SR-IOV-capable PCIe device supports, and methods to enable and
> > >     disable the VFs on a per-device basis.
> > >
> > >     Currently, VF enablement by SR-IOV-capable PCIe devices is done
> > >     via driver-specific module parameters.  If not setup in modprobe files,
> > >     it requires admin to unload & reload PF drivers with number of desired
> > >     VFs to enable.  Additionally, the enablement is system wide: all
> > >     devices controlled by the same driver have the same number of VFs
> > >     enabled.  Although the latter is probably desired, there are PCI
> > >     configurations setup by system BIOS that may not enable that to occur.
> > >
> > >     Two files are created for the PF of PCIe devices with SR-IOV support:
> > >
> > >         sriov_totalvfs  Contains the maximum number of VFs the device
> > >                         could support as reported by the TotalVFs register
> > >                         in the SR-IOV extended capability.
> > >
> > >         sriov_numvfs    Contains the number of VFs currently enabled on
> > >                         this device as reported by the NumVFs register in
> > >                         the SR-IOV extended capability.
> > >
> > >                         Writing zero to this file disables all VFs.
> > >
> > >                         Writing a positive number to this file enables that
> > >                         number of VFs.
> > >
> > >     These files are readable for all SR-IOV PF devices.  Writes to the
> > >     sriov_numvfs file are effective only if a driver that supports the
> > >     sriov_configure() method is attached.
> > >
> > >     Signed-off-by: Donald Dutile <ddutile@xxxxxxxxxx>
> > >
> > >
> > > Can you try that please?
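> > >
> > > For example, something along these lines (an untested sketch, using the
> > > PF from your log, 0000:04:00.0; adjust the device path and VF count as
> > > needed):
> > >
> > >   # cat /sys/bus/pci/devices/0000:04:00.0/sriov_totalvfs     # max VFs supported
> > >   # echo 2 > /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs  # enable 2 VFs
> > >   # echo 0 > /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs  # disable all VFs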
> > >
> > Recently, one of my colleagues posted a fix for this:
> > https://lkml.org/lkml/2013/5/30/20
> > It also seems to have been fixed independently by someone else:
> > https://patchwork.kernel.org/patch/2613481/
> >
> 
> Great! Care to update the bug with said relevant information?
Yes, updated in bugzilla.

> > >
> > > > 3. Booting multiple guests leads to a Dom0 call trace
> > > >    http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1853
> > >
> > > That one worries me. Did you do a git bisect to figure out which
> > > commit is causing this?
> > >
> > I only found this bug on some Intel EX-class servers.
> > I don't know which version of Xen/Dom0 works fine.
> > If anyone wants to reproduce or debug it, that would be welcome.
> > Our team is trying to debug it internally first.
> 
> Ah, OK. Then please continue on debugging it. Thanks!
> >
> > > > 4. After live migration, the guest console continuously prints
> > > >    "Clocksource tsc unstable"
> > > >    http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1854
> > >
> > > This looks like a current bug with QEMU unstable missing an ACPI table?
> > >
> > > Did you try booting the guest with the old QEMU?
> > >
> > > device_model_version = 'qemu-xen-traditional'
> > >
> > This issue still exists with traditional qemu-xen.
> > After more testing, this bug can't be reproduced with some other guests.
> > A RHEL6.4 guest hits this issue after live migration, while RHEL6.3,
> > Fedora 17 and Ubuntu 12.10 guests work fine.
> 
> There is a recent thread on this where the culprit was the PV timeclock
> not being updated correctly. But that would seem to be at odds with
> your report - where you are using Fedora 17 and it works fine.
> 
> Hm, I am at a loss on this one.
>
Hm, but my test result is as I described.
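(In case it helps narrow this down: inside the guest, the clocksource in use
can be checked through the standard sysfs interface, e.g.:

  # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
)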

> >
> > > >
> > > > Old bugs: (11)
> > > > 1. [ACPI] Dom0 can't resume from S3 sleep
> > > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1707
> > >
> > > That should be fixed in v3.11 (now that we have the fixes).
> > > Could you try v3.10 with Rafael's ACPI tree merged in?
> > > (That is, the patches he wants to submit for v3.11.)
> > >
> > I re-tested with Rafael's linux-pm.git tree (master and acpi-hotplug
> > branches), and found that Dom0 S3 sleep/resume doesn't work there, either.
> 
> The patches he has to submit for v3.11 are in the linux-next branch.
> You need to use that branch.
> 
Dom0 S3 sleep/resume doesn't work with the linux-next branch, either.
I've attached the log.
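(For reference, Dom0 S3 is triggered through the standard Linux interface;
a minimal sketch:

  # echo mem > /sys/power/state   # enter S3; resume via a wake event
)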

> >
> > > > 2. [XL]"xl vcpu-set" causes dom0 crash or panic
> > > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1730
> > >
> > > That I think is fixed in v3.10. Could you please check v3.10-rc3?
> > >
> > Still exists on v3.10-rc3.
> > The following command lines can reproduce it:
> > # xl vcpu-set 0 1
> > # xl vcpu-set 0 20
> 
> Ugh, the exact same stack trace? Can you attach the full dmesg or serial
> output (so that I can see what's there at bootup)?
>
Yes, the same. It's also attached to this mail.

> >
> > > > 3. Sometimes Xen panics on ia32pae Sandy Bridge when restoring a guest
> > > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1747
> > >
> > > That looks to be with v2.6.32. Is the issue present with v3.9
> > > or v3.10-rc3?
> > >
> > We haven't tested ia32pae Xen for a long time.
> > Now we only cover ia32e Xen/Dom0, so this bug is only a legacy issue.
> > If we find the bandwidth to verify it, we'll update it in the bugzilla.
> 
> How about just closing that bug as 'WONTFIX'?
> 
Agreed. I'll close it as "WONTFIX".

> >
> > > > 4. 'xl vcpu-set' can't decrease the vCPU number of an HVM guest
> > > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1822
> > >
> > > That I believe was a QEMU bug:
> > > http://lists.xen.org/archives/html/xen-devel/2013-05/msg01054.html
> > >
> > > The fix should be in QEMU traditional now (it went into the tree
> > > on 05-21).
> > >
> > This bug has existed throughout this year and the past year (at least in
> > our testing): 'xl vcpu-set' can't decrease the vCPU number of an HVM guest.
> 
> Could you retry with Xen 4.3 please?
>
With Xen 4.3 and Linux 3.10.0-rc3, I still can't decrease the vCPU number of a guest.
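(For reference, a sketch of the check we do; the domain name "rhel64" is just
a placeholder, and the guest config sets e.g. vcpus=4:

  # xl vcpu-set rhel64 2   # try to go from 4 to 2 vCPUs; no effect
  # xl vcpu-list rhel64    # still shows 4 online vCPUs
)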

Attachment: dom0-s3.log
Description: dom0-s3.log

Attachment: xl-vcpu-set-dom0-trace.log
Description: xl-vcpu-set-dom0-trace.log

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel