[Xen-users] Re: [Xen-devel] Dom0 reboot when several VM reboot at the same time


  • To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
  • From: tsk <aixt2006@xxxxxxxxx>
  • Date: Thu, 1 Jul 2010 20:53:39 +0800
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>, Daniel Stodden <daniel.stodden@xxxxxxxxxx>
  • Delivery-date: Thu, 01 Jul 2010 05:55:30 -0700
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

The brittleness of the tapdisk2 process is a very serious problem. Daniel Stodden said that patches will be sent soon.
I hope Xen 4.1 will fix these problems, but when will it be released?
I need a fix urgently; otherwise I will write some patches myself.
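
For reference, the workaround quoted below (a lock around _restart plus a 20 s pause after the domain is destroyed) can be sketched roughly as follows. This is only an illustration of the idea, not the actual change to XendDomainInfo.py: serialized_restart, destroy, create and settle_seconds are made-up names, and the real xend code is structured differently.

```python
import threading
import time

# One process-wide lock so that only one domain restart runs at a time.
# (Hypothetical stand-in for the lock added to XendDomainInfo._restart.)
_restart_lock = threading.Lock()


def serialized_restart(destroy, create, settle_seconds=20.0, sleep=time.sleep):
    """Destroy a domain, wait, then recreate it, one domain at a time.

    destroy and create are caller-supplied callables (assumptions for this
    sketch). The pause after destroy() is meant to give the backend time to
    tear down before the domain is started again.
    """
    with _restart_lock:
        destroy()
        sleep(settle_seconds)
        return create()
```

With this shape, six guests that reboot at the same moment are restarted one after another instead of concurrently, at the cost of roughly 20 s of extra downtime per queued guest.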

tsk

On July 1, 2010 at 6:30 PM, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
On 07/01/2010 11:36 AM, tsk wrote:
> I modified XendDomainInfo.py to add a lock in the _restart
> function and to sleep for 20s after the domain is destroyed.
>
> With that change, 6 VMs rebooted almost 100 times, and Dom0 and the
> VMs were all OK.
>
> I guess it is a blktap2 problem.

Yes, that's what I was suspecting. It seems to have had a series of
problems with shutdown, and a brittleness where the death of the tapdisk
process can bring down the system.

J

>
> The test case has not been run on 2.6.32.x yet.
>
>
> tsk
>
> 2010/6/29 Jeremy Fitzhardinge <jeremy@xxxxxxxx>
>
>     On 06/28/2010 05:44 AM, tsk wrote:
>     > xm info:
>     >
>     > release : 2.6.31.13
>
>     Can you reproduce this with a xen/stable-2.6.32.x - based kernel?
>
>     J
>
>     > version : #3 SMP Fri Apr 30 15:10:24 CST 2010
>     > machine : x86_64
>     > nr_cpus : 16
>     > nr_nodes : 2
>     > cores_per_socket : 4
>     > threads_per_core : 2
>     > cpu_mhz : 2266
>     > hw_caps : bfebfbff:28100800:00000000:00001b40:009ce3bd:00000000:00000001:00000000
>     > virt_caps : hvm
>     > total_memory : 24544
>     > free_memory : 19693
>     > node_to_cpu : node0:0,2,4,6,8,10,12,14
>     > node1:1,3,5,7,9,11,13,15
>     > node_to_memory : node0:7633
>     > node1:12059
>     > node_to_dma32_mem : node0:2996
>     > node1:0
>     > max_node_id : 1
>     > xen_major : 4
>     > xen_minor : 0
>     > xen_extra : .0
>     > xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32
>     > hvm-3.0-x86_32p hvm-3.0-x86_64
>     > xen_scheduler : credit
>     > xen_pagesize : 4096
>     > platform_params : virt_start=0xffff800000000000
>     > xen_changeset : unavailable
>     > xen_commandline : console=com1,vga com1=115200,8n1 msi=1
>     > dom0_mem=6144M dom0_max_vcpus=4 dom0_vcpus_pin iommu=off x2apic=off
>     > hap=0
>     > cc_compiler : gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
>     > cc_compile_by : root
>     > cc_compile_date : Wed May 12 19:09:47 CST 2010
>     > xend_config_format : 4
>     >
>     >
>     >
>     > tsk
>     >
>     > 2010/6/26 Jeremy Fitzhardinge <jeremy@xxxxxxxx>
>     >
>     > On 06/25/2010 04:40 AM, tsk wrote:
>     > > Hi folks,
>     > >
>     > > I ran into a problem: when 6 VMs reboot at the same time, at 3:00
>     > > in the morning, Dom0 reboots itself.
>     > > The Xen version is 4.0.0. The VMs are Windows 2003 with Red Hat PV
>     > > drivers; they update and reboot themselves every day at 3:00 AM.
>     > >
>     > > # last
>     > > ... ...
>     > > admin pts/0 10.247.1.1 Fri Jun 25 03:40 - 04:30 (00:50)
>     > > reboot system boot 2.6.31.13 Fri Jun 25 11:16 (00:-3)
>     > > admin pts/0 10.247.1.1 Fri Jun 18 03:30 - 03:31 (00:01)
>     > > ... ...
>     > >
>     > > /var/log/xen/xend.log:
>     > > ... ...
>     > > [2010-06-25 03:10:41 4409] DEBUG (DevController:139) Waiting for devices vif2.
>     > > [2010-06-25 03:10:41 4409] DEBUG (DevController:139) Waiting for devices vif.
>     > > [2010-06-25 03:10:41 4409] DEBUG (DevController:144) Waiting for 0.
>     > > [2010-06-25 03:10:41 4409] INFO (XendDomainInfo:2150) Domain has shutdown: name=VM-4836078C.1515.21 id=8 reason=reboot.
>     > > [2010-06-25 03:10:41 4409] DEBUG (XendDomainInfo:3115) XendDomainInfo.destroy: domid=8
>     > > [2010-06-25 03:10:41 4409] INFO (XendDomainInfo:2150) Domain has shutdown: name=VM-4836078C.1515.21 id=8 reason=reboot.
>     > > [2010-06-25 03:10:41 4409] DEBUG (XendDomainInfo:1953) XendDomainInfo.handleShutdownWatch
>     > > [2010-06-25 03:10:41 4409] DEBUG (DevController:628) hotplugStatusCallback /local/domain/0/backend/vif/26/0/hotplug-status.
>     > > [2010-06-25 03:13:36 4401] INFO (SrvDaemon:332) Xend Daemon started
>     > > [2010-06-25 03:13:36 4401] INFO (SrvDaemon:336) Xend changeset: unavailable.
>     > > ... ...
>     > >
>     > >
>     > >
>     > > /var/log/messages:
>     > > ... ...
>     > > Jun 25 03:10:41 r21b02004 tapdisk2[16340]: /guest/VM-420A07DA/disk10369/image.vhd: 4
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_ring_open: opening device blktap3
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_ring_open: opened device 3
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_ring_mmap: blktap: mapping pid is 16340
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_validate_params: vhd:/guest/VM-420A07DA/disk10369/image.vhd: capacity: 419430400, sector-size: 512
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_validate_params: vhd:/guest/VM-420A07DA/disk10369/image.vhd: capacity: 419430400, sector-size: 512
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_device_create: minor 3 sectors 419430400 sector-size 512
>     > > Jun 25 03:10:41 r21b02004 kernel: blktap_device_create: creation of 252:3: 0
>     > > Jun 25 03:10:41 r21b02004 sshd[16414]: Did not receive identification string from 10.247.10.51
>     > > Jun 25 03:10:41 r21b02004 kernel: device 001107 entered promiscuous mode
>     > > Jun 25 03:10:41 r21b02004 kernel: eth0: port 3(001107) entering forwarding state
>     > > Jun 25 11:16:10 r21b02004 syslogd 1.4.1: restart.
>     > > ... ...
>     > >
>     > >
>     > > Can anyone give me some tips? Thanks!
>     >
>     > Which dom0 kernel are you using?
>     >
>     > J
>     >
>     >
>
>


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

 

