
Re: [Xen-devel] Xen-unstable: problems with qmp socket error in libxl_pci teardown path for HVM guest with PCI passthrough



On Fri, Nov 28, 2014 at 04:54:19PM +0000, Stefano Stabellini wrote:
> On Fri, 28 Nov 2014, Wei Liu wrote:
> > On Fri, Nov 28, 2014 at 03:08:51PM +0000, Stefano Stabellini wrote:
> > > create ^
> > > title it QMP connection problems prevent libxl from calling 
> > > libxl__device_pci_reset on domain shutdown
> > > thanks
> > > 
> > > On Wed, 26 Nov 2014, Sander Eikelenboom wrote:
> > > > Hi,
> > > > 
> > > > While testing a patch for Konrad I was wondering why
> > > > "libxl_pci.c: libxl__device_pci_reset()" doesn't get called on guest
> > > > shutdown of an HVM guest (qemu-xen) with PCI passthrough.
> > > > xl didn't show any problems on the command line, so I never paid much
> > > > attention to it, but /var/log/xen/xl-guest.log shows:
> > > > 
> > > >     Waiting for domain xbmc13sid (domid 19) to die [pid 20450]
> > > >     Domain 19 has shut down, reason code 0 0x0
> > > >     Action for shutdown reason code 0 is destroy
> > > >     Domain 19 needs to be cleaned up: destroying the domain
> > > >     libxl: error: libxl_qmp.c:443:qmp_next: Socket read error: Connection reset by peer
> > > >     libxl: error: libxl_qmp.c:701:libxl__qmp_initialize: Failed to connect to QMP
> > > >     libxl: error: libxl_pci.c:1242:do_pci_remove: libxl__qmp_pci_del: Connection reset by peer
> > > >     libxl: error: libxl_dm.c:1575:kill_device_model: Device Model already exited
> > > >     Done. Exiting now
> > > > 
> > > > So it doesn't even get to calling
> > > > "libxl_pci.c: libxl__device_pci_reset()".
> > > > 
> > > > --
> > 
> > It's worth checking whether the device model exits too early, i.e., did
> > it crash? What's in the DM log?
> 
> Firstly, I was thinking that if force=1 it makes sense to continue even
> when QEMU returns an error, see:
> 
> http://marc.info/?i=alpine.DEB.2.02.1411281648500.14135%40kaball.uk.xensource.com

In the normal case the DM is not destroyed until the PCI device has been
removed, so I think the real issue is that the DM crashed.

libxl.c:
    if (libxl__device_pci_destroy_all(gc, domid) < 0)
        LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "pci shutdown failed for domid %d", domid);
    rc = xc_domain_pause(ctx->xch, domid);
    if (rc < 0) {
        LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_pause failed for %d", domid);
    }
    if (dm_present) {
        if (libxl__destroy_device_model(gc, domid) < 0)
            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "libxl__destroy_device_model failed for %d", domid);

        libxl__qmp_cleanup(gc, domid);
    }

The patch you posted covers up the real issue.
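
To make the ordering argument concrete, here is a rough toy model of that
teardown sequence. It is not libxl code: every name in it (toy_dm, toy_trace,
toy_pci_remove, toy_domain_destroy) is hypothetical and only mirrors the order
of the calls in the excerpt above. It assumes that the PCI removal step can
only succeed while the device model is still alive, which is what the "Connection
reset by peer" errors in the log suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device model (QEMU) process; not a libxl type. */
struct toy_dm {
    bool alive;        /* false models a DM that crashed or exited early */
    int  pci_attached; /* passthrough devices still plugged into the guest */
};

/* Steps are recorded so the order of the teardown can be inspected. */
enum toy_step { STEP_PCI_REMOVE, STEP_PCI_RESET, STEP_PAUSE, STEP_DM_DESTROY };

struct toy_trace {
    enum toy_step steps[8];
    size_t n;
};

static void trace(struct toy_trace *t, enum toy_step s)
{
    assert(t->n < sizeof(t->steps) / sizeof(t->steps[0]));
    t->steps[t->n++] = s;
}

/* Models the QMP device removal: it needs a live DM to talk to. */
static int toy_pci_remove(struct toy_dm *dm, struct toy_trace *t)
{
    trace(t, STEP_PCI_REMOVE);
    if (!dm->alive)
        return -1;      /* analogous to "Connection reset by peer" */
    dm->pci_attached = 0;
    return 0;
}

/*
 * Same order as the excerpt: PCI teardown first, then pause, then DM
 * destruction. Returns 0 on a clean teardown, -1 if the PCI step failed.
 */
int toy_domain_destroy(struct toy_dm *dm, struct toy_trace *t)
{
    int rc = 0;

    if (toy_pci_remove(dm, t) < 0)
        rc = -1;        /* the reset is never reached, as in the report */
    else
        trace(t, STEP_PCI_RESET);

    trace(t, STEP_PAUSE);
    trace(t, STEP_DM_DESTROY);
    dm->alive = false;
    return rc;
}
```

With a live DM the trace is remove, reset, pause, destroy. With a DM that is
already dead the reset step is missing and the device stays attached, so the
failure originates before the destroy path is even entered.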

Wei.



 

