
Re: [Xen-users] reboot driver domains with auto-reconnect?

On Thu, Apr 19, 2018 at 02:19:25PM +0000, Jason Cooper wrote:
> On Mon, Apr 16, 2018 at 03:58:59PM +0000, Jason Cooper wrote:
> > My current dilemma is that I would like to be able to reboot these mini
> > VMs for kernel updates, etc.  When this happens, I'd like the previously
> > connected VMs to reconnect to the new instance.
> Here's the script, to be executed on Dom0, that attempts to do the
> reboot of a driver domain, and then re-attach the previously attached
> vifs.  I'm only concerned about vifs for now.  vbds will come later.

v2 attached.

> The reboot portion sucks, I despise random sleeps.  I'd like to trigger
> on a xenbus variable to indicate the VM is ready to receive attach
> events.

Still haven't tackled this yet.
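
One way to replace the sleep might be to poll xenstore until the driver domain announces itself.  This is only a sketch: the helper name wait_until and the key data/ready are hypothetical, and the driver domain would have to write that key itself (for example with xenstore-write from its init scripts) once it is ready for attach events.

```shell
#!/bin/bash
# wait_until TRIES CMD [ARGS...]
# Run CMD once per second until it succeeds or TRIES attempts are used up.
# Returns 0 on success, 1 on timeout.
wait_until() {
        local tries="$1"
        shift
        while [ "$tries" -gt 0 ]; do
                "$@" >/dev/null 2>&1 && return 0
                tries=$((tries - 1))
                sleep 1
        done
        return 1
}

# Possible use on Dom0 after 'xl create' (the key name is an assumption):
#   DOMID="$(xl domid "$DOM")"
#   wait_until 30 xenstore-exists "/local/domain/$DOMID/data/ready" || exit 3
```

Polling is still a loop around a sleep, but it is bounded and it stops as soon as the condition holds, which is arguably less arbitrary than a fixed delay.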

> Also, the client VM to the driver-domain needs hotplug enabled, which
> makes sense.
> Occasionally, I'll get a stale vif in the client VM.  Usually this is
> when I was debugging something else, and just needed to reboot the
> driver-domain.  I'm hoping those won't occur once I nail down a proper
> reboot sequence.  I'll see if I can catch the 'timeout waiting for vif
> to detach' error message, which may be related.

A couple of notes:

 - Upon calling 'xl reboot ...' or 'xl shutdown ...' the guest VM kernel
   attempts to execute /sbin/reboot or /sbin/poweroff.  If you did the
   same dumb thing I did, and put all the busybox symlinks in /bin/, you
   need to add symlinks for /sbin/{reboot,poweroff}.

 - During develop/test/debug-repeat cycles, I accidentally created extra
   vifs.  So instead of just vif234.0 <-> eth0, I'd also have vif234.1 <->
   eth1.  It's *not* sufficient to do

        xenstore-rm /local/domain/234/device/vif/1

   You must also do

        xenstore-rm /local/domain/418/backend/vif/234/1

   Where 418 is the current driver domain and 234 is the guest.

 - For some reason 'xl shutdown -w DOM', while it does wait, doesn't
   wait long enough.  You'll see the sleep following the command because
   otherwise 'xl create ...' would error out.  This probably needs some
   more digging.

 - Configuring hotplug in the guest is critical if you'd like the
   re-connected interface to be automatically set up as it was before:

        echo '/bin/mdev' >/proc/sys/kernel/hotplug

   Also edit /etc/mdev.conf so that the network interfaces trigger the
   setup.
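
The stale-vif cleanup from the second note can be wrapped so that both xenstore nodes are always removed together.  A minimal sketch, assuming the path layout shown above; the function names vif_nodes and rm_stale_vif and the RM override are made up for illustration.

```shell
#!/bin/bash
# vif_nodes GUEST BACKEND DEVID
# Print the frontend and backend xenstore nodes describing one vif.
vif_nodes() {
        local guest="$1" backend="$2" devid="$3"
        printf '%s\n' "/local/domain/$guest/device/vif/$devid"
        printf '%s\n' "/local/domain/$backend/backend/vif/$guest/$devid"
}

# rm_stale_vif GUEST BACKEND DEVID
# Remove both nodes.  Set RM=echo for a dry run that only prints the paths.
rm_stale_vif() {
        local node
        vif_nodes "$@" | while read -r node; do
                ${RM:-xenstore-rm} "$node"
        done
}

# Dry run for guest 234, driver domain 418, device 1:
#   RM=echo rm_stale_vif 234 418 1
```

Doing a dry run first seems prudent, since removing the wrong node would cut off a vif that is still in use.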


#!/bin/bash

if [ $# -ne 1 ]; then
        echo >&2 "Usage: ${0##*/} domain"
        exit 1
fi

# the driver domain to reboot
DOM="$1"

# get the domain id
DOMID="$(xl domid "$DOM")"
[[ "$DOMID" =~ ^[0-9]+$ ]] || exit 1

# scratch file for the vifs to re-attach (assumed: created with mktemp)
tmp="$(mktemp)" || exit 1


# loop through frontends
while read frontend <&4; do
        while read vif <&5; do
                if [ "x$vif" = "x" ]; then
                        # stale frontend
                        echo >&2 "WARN: stale frontend ($frontend), removing"
                        xenstore-rm /local/domain/$DOMID/backend/vif/$frontend
                        continue
                fi

                # store info for afterwards
                front="`xl domname $frontend`"
                # assumed source of $bridge: the backend's 'bridge' key
                bridge="`xenstore-read /local/domain/$DOMID/backend/vif/$frontend/$vif/bridge`"
                if [ "x$front" != "x" ] && [[ "$bridge" =~ (br[0-9][0-9]*) ]]; then
                        echo "$front bridge=$bridge backend=$DOM" >>"$tmp"

                        # remove the vif
                        echo >&2 "Removing $vif from $front"
                        xl -f network-detach $front $vif
                fi
        done 5< <(xenstore-list /local/domain/$DOMID/backend/vif/$frontend)
done 4< <(xenstore-list /local/domain/$DOMID/backend/vif)

# reboot the domain
xl shutdown -w $DOM || exit 2
sleep 1
xl create -c $DOM || exit 3

# nothing was attached, so there is nothing to re-attach
if [ ! -s "$tmp" ]; then
        rm -f "$tmp"
        exit 0
fi

# reattach everything
while read ln <&4; do
        echo >&2 "re-attach [$ln]"
        xl network-attach $ln || exit 4
done 4< <(cat $tmp)

rm -f $tmp

exit 0
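
A possible replacement for the 'sleep 1' after 'xl shutdown -w' is to wait until the domain no longer resolves to a domid.  This is an untested sketch: the helper name wait_gone is made up, and whether "xl domid fails" is the condition that 'xl create' actually needs has not been verified here.

```shell
#!/bin/bash
# wait_gone DOM [TRIES]
# Check once per second whether 'xl domid DOM' still succeeds.
# Returns 0 as soon as it fails (domain gone), 1 after TRIES checks.
wait_gone() {
        local dom="$1" tries="${2:-30}"
        while xl domid "$dom" >/dev/null 2>&1; do
                tries=$((tries - 1))
                [ "$tries" -gt 0 ] || return 1
                sleep 1
        done
        return 0
}

# Possible use in the reboot step:
#   xl shutdown -w "$DOM" || exit 2
#   wait_gone "$DOM" 30 || exit 2
#   xl create -c "$DOM" || exit 3
```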


