
[Xen-devel] XENBUS: Timeout connecting to device errors


  • To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Graham, Simon" <Simon.Graham@xxxxxxxxxxx>
  • Date: Mon, 4 Dec 2006 14:18:37 -0500
  • Delivery-date: Mon, 04 Dec 2006 11:18:58 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AccX2P+RAWqsSQbWQWeVn+J9Wy6xLw==
  • Thread-topic: XENBUS: Timeout connecting to device errors

We've been seeing a lot of these errors when booting VMs since we moved
to Xen 3.0.3. I've traced this to the hotplug scripts in Dom0 taking
more than 10s to run to completion, and specifically to the vif-bridge
script occasionally taking 9s or more to plug the vif into the software
bridge. Does anyone have any insight into why it might take this long?

I added some instrumentation to the scripts to log entry to and exit
from xen-backend.agent, and also lock contention (output attached at the
end of this message), and have the following observations:

1. Currently, the various script invocations are issued in parallel but
   are serialized by a single global lock -- is it really necessary, for
   example, to serialize vif and vbd hot plug processing in Dom0?

2. In most cases we've seen, this problem happens when the first VM is
   started after re-installing a box. In the example below, the
   'vif online' processing started at 2:21:53 and did not finish until
   2:22:04.

3. Clearly a hard coded timeout of 10s is less than perfect -- is there
   no better way of knowing when the hotplug processing is done?
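
As a rough illustration of observation 1, here is a small Python model of
what one global lock does to invocations that are issued at the same
moment, compared with a hypothetical lock per device class. It is only a
sketch: finish_times and the per-class scheme are made up for
illustration, the 4s figure for 'vbd add' is an assumption, and the
model says nothing about whether vif and vbd processing are actually
safe to run concurrently.

```python
from collections import defaultdict


def finish_times(jobs, lock_key):
    """Return {job name: finish time in seconds}.

    All jobs are issued at t=0; jobs that map to the same lock run one
    after another, in list order.
    """
    lock_free_at = defaultdict(float)  # lock name -> time it is next free
    done = {}
    for name, dev_class, duration in jobs:
        lock = lock_key(dev_class)
        start = lock_free_at[lock]      # wait until the lock is released
        done[name] = start + duration
        lock_free_at[lock] = done[name]
    return done


# 11s for 'vif online' is the figure from observation 2;
# 4s for 'vbd add' is an assumed, illustrative duration.
jobs = [("vif online", "vif", 11.0), ("vbd add", "vbd", 4.0)]

# One lock for everything (the current behaviour described above).
global_lock = finish_times(jobs, lambda dev_class: "xenbus_hotplug_global")
# Hypothetical alternative: one lock per device class.
per_class = finish_times(jobs, lambda dev_class: dev_class)

print("global lock:", global_lock)
print("per class:  ", per_class)
```

With these numbers, 'vbd add' completes at 15s under the global lock
(past the 10s timeout) and at 4s with per-class locks.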

Thanks,
Simon

<dom0 /var/log/messages>:

Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20234]: Start vif: add
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20234]: End vif: add
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20240]: Start vif: online
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20252]: Start vbd: add
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: Lock /var/run/xen-hotplug/xenbus_hotplug_global by 20252 - currently owned by 20240: /etc/hotplug/xen-backend.agent
Dec  4 02:21:54 gromit lvm[12123]: XenDom wallace1: state changed stopped => paused
Dec  4 02:21:54 gromit sn2spine: start RESULT <?xml version="1.0" ?> <result status='ok' code='200'> <guest id="3f879d14-8c70-48af-ae02-88df3afad3cb"><name>wallace1</name><id>3f879d14-8c70-48af-ae02-88df3afad3cb</id><system>gromit.sn.stratus.com</system><state>starting</state><availability>failover</availability><mode>duplex</mode><memory>256</memory><cpus>1</cpus><storage><volume device="hda1" mountpoint="/" name="drbd0"/></storage></guest> </result>
Dec  4 02:21:54 gromit xen-hotplug: /etc/xen/scripts/vif-bridge: online XENBUS_PATH=backend/vif/1/0
Dec  4 02:21:54 gromit kernel: device vif1.0 entered promiscuous mode
Dec  4 02:21:54 gromit xen-hotplug: /etc/xen/scripts/vif-bridge: iptables -A FORWARD -m physdev --physdev-in vif1.0  -j ACCEPT failed. If you are using iptables, this may affect networking for guest domains.
Dec  4 02:21:55 gromit kernel: xenbr0: port 3(vif1.0) entering learning state
Dec  4 02:21:59 gromit kernel: xenbr0: topology change detected, propagating
Dec  4 02:22:03 gromit kernel: xenbr0: port 3(vif1.0) entering forwarding state
Dec  4 02:22:04 gromit lvm[12123]: XenDom wallace1: state changed paused => running
Dec  4 02:22:04 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20240]: End vif: online
Dec  4 02:22:18 gromit kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Dec  4 02:22:22 gromit xen-hotplug: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/1/769
Dec  4 02:22:26 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20252]: End vbd: add
Dec  4 02:22:29 gromit lvm[12123]: XenDom wallace1: state changed running => crashed
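
The per-invocation timings can be pulled out of the instrumented log
mechanically. The following Python sketch pairs the "Start" and "End"
lines logged by xen-backend.agent for each PID and prints the elapsed
time. The helper names (parse_stamp, agent_durations) are made up for
illustration, and the year is assumed to be 2006 because syslog
timestamps do not carry one.

```python
import re
from datetime import datetime

# The six xen-backend.agent Start/End lines from the dom0 log above.
LOG = """\
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20234]: Start vif: add
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20234]: End vif: add
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20240]: Start vif: online
Dec  4 02:21:53 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20252]: Start vbd: add
Dec  4 02:22:04 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20240]: End vif: online
Dec  4 02:22:26 gromit xen-hotplug: /etc/hotplug/xen-backend.agent: xen-backend[20252]: End vbd: add
"""

LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}) .*"
    r"xen-backend\[(\d+)\]: (Start|End) (\w+): (\w+)$"
)


def parse_stamp(stamp, year=2006):
    """Parse a syslog timestamp; the year is an assumption (not in the log)."""
    normalized = " ".join(stamp.split())  # collapse the padded day field
    return datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")


def agent_durations(log_text):
    """Map 'dev action [pid]' to seconds between its Start and End lines."""
    started, elapsed = {}, {}
    for line in log_text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # not a Start/End line from xen-backend.agent
        stamp, pid, phase, dev, action = m.groups()
        key = f"{dev} {action} [{pid}]"
        when = parse_stamp(stamp)
        if phase == "Start":
            started[key] = when
        elif key in started:
            elapsed[key] = (when - started.pop(key)).total_seconds()
    return elapsed


for key, seconds in agent_durations(LOG).items():
    print(f"{key}: {seconds:.0f}s")
```

For the excerpt above this reports 0s for 'vif add', 11s for
'vif online' and 33s for 'vbd add'. Note that the 33s for 'vbd add'
includes the time PID 20252 spent waiting for the global lock held by
PID 20240.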


<guest console>:
Dec  4 02:21:54 Linux version 2.6.16.29-xenU (sntriage@xxxxxxxxxxxxxxxxxxx) (gcc version 3.4.4 20050721 (Red Hat 3.4.4-2)) #1 SMP Mon Dec 4 01:33:25 EST 2006
...
Dec  4 02:22:04 XENBUS: Timeout connecting to device: device/vbd/769 (state 3)
Dec  4 02:22:04 Root-NFS: No NFS server available, giving up.
Dec  4 02:22:04 VFS: Unable to mount root fs via NFS, trying floppy.
Dec  4 02:22:04 VFS: Cannot open root device "hda1" or unknown-block(2,0)
Dec  4 02:22:04 Please append a correct "root=" boot option
Dec  4 02:22:04 Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
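
Putting the two logs side by side, the arithmetic below shows how far
the backend missed the guest's deadline. It is a sketch that assumes the
guest console timestamps and the dom0 syslog timestamps come from the
same clock, and that the kernel banner line approximates the time the
guest started waiting; neither is confirmed by the logs themselves.

```python
from datetime import datetime


def at(hms):
    """Parse an HH:MM:SS timestamp (all events are on the same day)."""
    return datetime.strptime(hms, "%H:%M:%S")


guest_banner = at("02:21:54")    # guest console: "Linux version ..."
xenbus_timeout = at("02:22:04")  # guest console: "XENBUS: Timeout ..."
vbd_add_end = at("02:22:26")     # dom0 log: "End vbd: add"

# How long the guest waited before giving up on device/vbd/769.
waited = (xenbus_timeout - guest_banner).total_seconds()
# How long after the timeout the vbd hotplug processing finished.
late_by = (vbd_add_end - xenbus_timeout).total_seconds()

print(f"guest gave up after {waited:.0f}s")
print(f"vbd hotplug finished {late_by:.0f}s after the timeout")
```

On these timestamps the guest gave up after 10s, and the vbd hotplug
processing finished 22s after that.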



 

