Re: [Xen-devel] [xen-4.9-testing test] 126201: regressions - FAIL
On Fri, Aug 24, 2018 at 09:58:02AM +0100, Wei Liu wrote:
> On Wed, Aug 22, 2018 at 04:52:27PM -0600, Jim Fehlig wrote:
> > On 08/21/2018 05:14 AM, Jan Beulich wrote:
> > > > > > On 21.08.18 at 03:11, <osstest-admin@xxxxxxxxxxxxxx> wrote:
> > > > flight 126201 xen-4.9-testing real [real]
> > > > http://logs.test-lab.xenproject.org/osstest/logs/126201/
> > > >
> > > > Regressions :-(
> > > >
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > > test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail
> > > > REGR. vs. 124328
> > >
> > > Something needs to be done about this, as this continued failure is
> > > blocking the 4.9.3 release. I did mail about this on Aug 2nd already
> > > for flight 125710, I've got back from Wei:
> > >
> > > > This is libvirtd's error message.
> > > >
> > > > The remote host can't obtain the state change log due to it is already
> > > > held by another task/thread. It could be a libvirt / libxl bug.
> > > >
> > > > 2018-08-01 16:12:13.433+0000: 3491: warning :
> > > > libxlDomainObjBeginJob:151 :
> > > > Cannot start job (modify) for domain debian.guest.osstest; current job
> > > > is (modify) owned by (24975)
> >
> > I took a closer look at the logs and it appears the finish phase of
> > migration fails to acquire the domain job lock since it is already held by
> > the perform phase. In the perform phase, after the vm has been transferred
> > to the dst, the qemu process associated with the vm is started. For whatever
> > reason that takes a long time on this host:
> >
> > 2018-08-19 17:05:19.182+0000: libxl: libxl_dm.c:2235:libxl__spawn_local_dm:
> > Domain 1:Spawning device-model /usr/local/lib/xen/bin/qemu-system-i386 with
> > arguments: ...
> > 2018-08-19 17:05:19.188+0000: libxl: libxl_exec.c:398:spawn_watch_event:
> > domain 1 device model: spawn watch p=(null)
>
> This is a spurious event after the watch has been set up.
>
> > ...
> > 2018-08-19 17:05:51.529+0000: libxl: libxl_event.c:573:watchfd_callback:
> > watch w=0x7f84a0047ee8 wpath=/local/domain/0/device-model/1/state token=2/1:
> > event epath=/local/domain/0/device-model/1/state
> > 2018-08-19 17:05:51.529+0000: libxl: libxl_exec.c:398:spawn_watch_event:
> > domain 1 device model: spawn watch p=running
>
> So it has taken 32s for QEMU to write "running" in xenstore. This,
> however, is still within the timeout limit set by libxl (60s).
>

I haven't been able to reliably reproduce the timeout. One thing I observe
is that libvirt picks qdisk backend while xl picks phys backend.

Wei.
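
For anyone wanting to re-check the 32s figure against other flights, here is a minimal sketch of the arithmetic. It assumes the libxl timestamp format seen in the log lines quoted above. The constant and function names are made up for illustration and are not libxl or libvirt identifiers; the 60s value is simply the limit mentioned in the quoted text.

```python
from datetime import datetime

# Timestamps copied from the libxl log lines quoted in this thread.
SPAWN = "2018-08-19 17:05:19.182+0000"    # libxl__spawn_local_dm: device model spawned
RUNNING = "2018-08-19 17:05:51.529+0000"  # spawn_watch_event: p=running

# Limit as stated in the thread; the name is hypothetical, not a libxl symbol.
DM_SPAWN_TIMEOUT_S = 60.0

# Format of the timestamp prefix on libxl log lines, as seen above.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TS_FORMAT)
    t1 = datetime.strptime(end, LOG_TS_FORMAT)
    return (t1 - t0).total_seconds()


startup = elapsed_seconds(SPAWN, RUNNING)
print(f"device model startup took {startup:.3f}s")
if startup < DM_SPAWN_TIMEOUT_S:
    print("within the libxl timeout")
else:
    print("exceeds the libxl timeout")
```

Feeding it the spawn and "running" timestamps from another flight's log would show whether the slow QEMU startup is consistent across runs.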