Re: [Xen-devel] Detecting deadlocks with hypervisor..
From: Ewan Mellor <ewan@xxxxxxxxxxxxx>
To: Thileepan Subramaniam <thileepan_@xxxxxxxxxxx>
CC: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Detecting deadlocks with hypervisor..
Date: Sun, 19 Mar 2006 13:17:35 +0000

On Sat, Mar 18, 2006 at 06:14:09PM -0800, Thileepan Subramaniam wrote:

> Hello,
>
> I am trying to see if the hypervisor can be used to detect deadlocks in the
> guest VMs. My goal is to detect if a guest OS is deadlocked, and if it is,
> then create a clone of the deadlocked OS without the locking condition, and
> letting the clone run. While the clone runs I am hoping to generate some
> hints that could tell me what caused the deadlock.
>
> I simulated a deadlock/hang situation in a guest OS (by loading a badly
> written module to the kernel) and when the guestOS kernel was hanging, I
> ran "xm save" from Dom-0. But this command waits forever.
>
> I tried to follow the flow of the .py files (XendCheckpoint.py etc.). These
> seem to be called when I run 'xm save'. But beyond a point I am not sure
> what the python scripts do. I also see some libxc files such as
> xc_linux_save.c, but I am not sure who is using it (Dom-0 or Xen or the
> XenU). Can someone help me by explaining me what happens behind the scene
> when "xm save" is called ? Is there any good documentation explaining which
> actions are done by which layers (eg: python layer, C layer etc).

xc_save, the executable, calls xc_linux_save, the libxc function. Depending
upon whether this is a live or non-live save, some stuff is done (see
xc_linux_save for details). The Python layer is then called back, requesting
that the domain is suspended. This request is passed through to the guest by
writing /local/domain/<domid>/control/shutdown = suspend in the store. This
is seen by the guest (a watch fires inside reboot.c) and then the guest
suspends itself. This is probably where you are falling down -- if the guest
kernel is completely deadlocked, it's going to struggle to suspend itself
correctly.
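To make the handshake described above a bit more concrete, here is a toy Python model of it. This is only a sketch under loud assumptions: ToyStore, ToyGuest and request_suspend are hypothetical names invented for illustration and are not the xenstore or Xend API, and the watch callbacks here fire synchronously in-process, whereas real watches are delivered to the guest asynchronously. The only details taken from the message are the path /local/domain/<domid>/control/shutdown and the value "suspend". The timeout is also an addition of this sketch.

```python
import threading


class ToyStore:
    """Hypothetical in-memory stand-in for the store: keys, values, watches."""

    def __init__(self):
        self._data = {}
        self._watches = {}  # path -> list of callbacks

    def watch(self, path, callback):
        self._watches.setdefault(path, []).append(callback)

    def write(self, path, value):
        self._data[path] = value
        # Fire every watch registered on this exact path.
        for callback in self._watches.get(path, []):
            callback(path, value)

    def read(self, path):
        return self._data.get(path)


class ToyGuest:
    """Hypothetical guest that watches its control/shutdown node."""

    def __init__(self, store, domid, deadlocked=False):
        self.suspended = threading.Event()
        self.deadlocked = deadlocked
        store.watch(f"/local/domain/{domid}/control/shutdown", self._on_shutdown)

    def _on_shutdown(self, path, value):
        if self.deadlocked:
            return  # assumption: a hung kernel never gets to run the handler
        if value == "suspend":
            self.suspended.set()  # the guest "suspends itself"


def request_suspend(store, guest, domid, timeout=0.1):
    """Write the suspend request, then wait for the guest to act on it."""
    store.write(f"/local/domain/{domid}/control/shutdown", "suspend")
    # Event.wait returns True if the event was set, False on timeout.
    return guest.suspended.wait(timeout)
```

With a healthy ToyGuest, request_suspend returns True. With deadlocked=True the request is still written to the store, but nothing ever acknowledges it and the call returns False once the timeout expires. That roughly mirrors the reported symptom, except that "xm save" was observed to wait forever rather than time out.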
This may sound a silly question (pardon me because I am relatively new to the
Linux kernel): will it be possible to continue running reboot.c (or for that
matter any kernel thread) when the kernel is deadlocked? In Linux, is the
kernel a single process or a bunch of entities executing in parallel? If the
latter, then during a kernel deadlock (e.g. one caused by loading a faulty
module that disables interrupts and does something silly) some other
processes/threads can still run, right?

thanks
TS

If a suspend completes correctly, Xend will see it (another watch will fire),
and xc_linux_save will be free to complete the save.

> Also, does it seem viable to clone a copy of a deadlocked guest OS in the
> first place?

If you have a byte-for-byte copy of a deadlocked guest, even if you could
suspend it, surely it will be deadlocked when it is resumed. How do you
intend to break the deadlock, and how is it easier to do that from outside
than it is to perform deadlock detection in the guest?

Ewan.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel