
Re: [Xen-users] Dom0 crashed when rebooting whilst DomU are running

On Tue, 2012-09-11 at 23:46 +0100, Maik Brauer wrote:
> >>>> I found out that it hangs during re-boot of dom0 when having more
> >>>> Network interfaces involved, like:
> >>>>     vif = [ 'mac=06:46:AB:CC:11:01, ip=<myIPadress>', '', '',
> >>>> 'mac=06:04:AB:BB:11:03, bridge=VLAN20, script=vif-bridge', '',
> >>>> 'mac=06:04:AB:BB:11:05, bridge=VLAN40, script=vif-bridge' ]
> >>> 
> >>> 6 interfaces total, 3 of which have a random mac on each reboot and all
> >>> get put on the default bridge?
> >> 
> >> No, not really. The bridge is different for each interface.
> > 
> > You have three lots of '' which will all go onto the same bridge AFAICT
> > (whichever one is determined to be the default)
> That is right. As long as I put nothing inside that it should be a
> different script to execute, it will use default for ''

The default is "vif-bridge". Have you changed the default?

If not then your configuration as shown will put three interfaces on the
*same* bridge. Is this really what you want?

You claim above that the bridge is different for each interface, but
unless you have changed something somewhere then this is not the case.
Since you are having problems it is important to identify everything
which you have changed from the defaults.
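To make the point concrete, here is a minimal sketch (not the actual hotplug logic) that parses a vif list and reports which bridge each entry would land on when nothing names one. The default bridge name "xenbr0" and the helper names are assumptions for illustration only; the real default depends on your toolstack and local setup. Note that under this assumption the first entry, which sets only mac and ip, falls back to the default as well.

```python
# Hypothetical default bridge name; the real one depends on the toolstack
# and on how networking was set up in dom0.
DEFAULT_BRIDGE = "xenbr0"


def parse_vif(spec):
    """Split one vif string such as 'mac=..., bridge=VLAN20' into a dict."""
    options = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # an empty '' entry carries no options at all
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options


def bridges_for(vifs, default=DEFAULT_BRIDGE):
    """Return the bridge each vif entry would use if unset ones take the default."""
    return [parse_vif(spec).get("bridge", default) for spec in vifs]


# The list from the guest config quoted above.
vif = [
    "mac=06:46:AB:CC:11:01, ip=<myIPadress>",
    "",
    "",
    "mac=06:04:AB:BB:11:03, bridge=VLAN20, script=vif-bridge",
    "",
    "mac=06:04:AB:BB:11:05, bridge=VLAN40, script=vif-bridge",
]

for index, bridge in enumerate(bridges_for(vif)):
    print(f"vif {index}: {bridge}")
```

Only the entries with an explicit bridge= end up somewhere other than the default.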

> >> List is empty. SysRQ -w and SysRQ-t shows nothing at all.
> > 
> > You might need to increase the log verbosity with SysRQ-9 first?
> I did and now I got more Information. But due to the amount of data which 
> slips over the console screen I am not able
> to record properly. Can you advice what to do here?

Like I said, "that list can be quite long so it is useless
without a serial console". See http://wiki.xen.org/wiki/Xen_Serial_Console

Depending on your distro you might also find this info in the logs
under /var/log somewhere.

> > 
> >> There is nothing running anymore.
> >> It shows periodically:  INFO: task xenwatch:12 blocked for more than 120 
> >> seconds
> > 
> > What is the very last thing printed before this?
> There is nothing before.

So the output is silent from boot until this message comes up? That
seems unlikely, since there should be plenty of messages from the
shutdown process itself if nothing else.

What is the last message on the screen before this one? In fact, what is
the entire last screenful of output?
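If you do manage to capture the console output to a file (serial console or a log under /var/log), something along these lines would pull out the lines leading up to the first "blocked for more than" message. This is only a sketch: the function name, the regular expression, and the number of context lines are my own choices, and the regular expression is modelled on the one message quoted above.

```python
import re

# Modelled on the message quoted above:
#   INFO: task xenwatch:12 blocked for more than 120 seconds
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def context_before_first_hang(log_text, before=25):
    """Return up to `before` lines preceding the first hung-task message,
    followed by that message itself. Returns [] if no such message exists."""
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if HUNG_TASK.search(line):
            start = max(0, index - before)
            return lines[start:index + 1]
    return []
```

You would feed it the contents of whatever file holds the capture, for example open(path).read(), where path is wherever your distro or serial capture puts it, and then post the returned lines.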

> > Really the initscript ought to wait, the default at least with the
> > script shipped with xen is to do so, by using shutdown --wait. can you
> > confirm whether or not this is happening for you?
> At least I can see that the shutdown --wait is in the scripts. So it seems 
> that the init script is waiting.
> But independent from that, something must be still in use. Which block the 
> reboot process.
> > 
> > Possibly someone is trying to talk to xenstore after xenstored has
> > exited -- I expect that would cause the sorts of blocked for 120
> > messages you are seeing.
> > 
> Could be, but we need to find out what is blocking the shutdown. I do not 
> know what else I can do in order to measure and collect
> data for investigation.

Did you add debugging to the hotplug scripts like I suggested a couple
of mails back?

If you run the xendomains script by hand and then *immediately* after it
exits run "xl list", have the domains actually gone? You could even stick
some calls to xl list into the script itself and verify that the domains
are indeed shutting down as expected.
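As a rough sketch of that check, assuming you are on xl and that the init script lives at /etc/init.d/xendomains (that path and the function names are assumptions, adjust for your distro):

```python
import subprocess


def remaining_guests(xl_list_output):
    """Return the names of domains other than Domain-0 in `xl list` output."""
    names = []
    for line in xl_list_output.splitlines()[1:]:  # skip the header line
        # Columns are Name, ID, Mem, VCPUs, State, Time(s); split from the
        # right so that the name is whatever is left over.
        fields = line.rsplit(None, 5)
        if len(fields) == 6 and fields[0] != "Domain-0":
            names.append(fields[0])
    return names


def check_after_xendomains(script="/etc/init.d/xendomains"):
    """Run the xendomains stop action, then list guests straight afterwards."""
    # The script path is an assumption; needs to run as root in dom0.
    subprocess.run([script, "stop"], check=False)
    listing = subprocess.run(
        ["xl", "list"], capture_output=True, text=True, check=True
    ).stdout
    return remaining_guests(listing)
```

If check_after_xendomains() returns anything other than an empty list, the script returned before the guests had actually gone away.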

BTW, are you using xl or xend?

