
Re: [Xen-devel] Test results on Unisys ES7000 32x 128gb usingxen-unstable (c/s 15470) - 2 old issues; 2 new issue


  • To: "Krysan, Susan" <KRYSANS@xxxxxxxxxx>, Ian Pratt <Ian.Pratt@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: Keir Fraser <keir@xxxxxxxxxxxxx>
  • Date: Mon, 09 Jul 2007 17:21:25 +0100
  • Delivery-date: Mon, 09 Jul 2007 09:19:25 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-topic: [Xen-devel] Test results on Unisys ES7000 32x 128gb usingxen-unstable (c/s 15470) - 2 old issues; 2 new issue

This is pretty simple. The domain's memory map is torn down synchronously
when it is killed by domain0, via XEN_DOMCTL_destroydomain. This can take a
long time, and during that time that domain0 vcpu is not interruptible.

The possible fixes are:
 1. Find out which bit of domain_kill() takes the longest time and optimise
it so it takes much less time. Unfortunately it is still going to be
proportional to the memory size of the domain.
 2. Make domain destruction asynchronous (probably by introducing hypervisor
threads, analogous to kernel threads).

Possibly we need both (1) and (2). I don't think we can avoid doing (2) in
the long term, really.
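
To illustrate the difference between the synchronous teardown described above and a teardown that gives control back periodically, here is a minimal toy model in Python. It is not Xen code: the names (ToyDomain, destroy_in_batches, run_with_heartbeat), the 4 KiB page size, and the reading of "126000mb" as MiB are all assumptions made for illustration. It also uses cooperative batching rather than the hypervisor threads suggested in (2); the point is only that the total work stays proportional to memory size while the longest uninterruptible stretch is bounded by the batch size.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_for(mem_mib):
    """Number of pages backing a guest of mem_mib MiB (assumed units)."""
    return mem_mib * 1024 * 1024 // PAGE_SIZE


class ToyDomain:
    """Hypothetical stand-in for a guest: just a count of pages still mapped."""

    def __init__(self, num_pages):
        self.pages_left = num_pages


def destroy_synchronously(domain):
    """Model of the current behaviour: all pages are released in one
    uninterruptible call, so nothing else runs until it returns."""
    freed = domain.pages_left
    domain.pages_left = 0
    return freed


def destroy_in_batches(domain, batch_size):
    """Release at most batch_size pages, then yield so the caller can do
    other work (for example, answer a heartbeat) before continuing."""
    while domain.pages_left > 0:
        n = min(batch_size, domain.pages_left)
        domain.pages_left -= n
        yield n


def run_with_heartbeat(domain, batch_size):
    """Drive the batched teardown, counting the points at which the
    caller regained control between batches."""
    freed = 0
    heartbeats = 0
    for n in destroy_in_batches(domain, batch_size):
        freed += n
        heartbeats += 1  # stand-in for dom0 servicing its heartbeat
    return freed, heartbeats


if __name__ == "__main__":
    total = pages_for(126000)
    print("pages to release:", total)
    print(run_with_heartbeat(ToyDomain(total), batch_size=1_000_000))
```

In this model a smaller batch_size means more frequent opportunities to run other work at the cost of more round trips; the total number of pages released is unchanged.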

 -- Keir

On 9/7/07 16:01, "Krysan, Susan" <KRYSANS@xxxxxxxxxx> wrote:

> Also, the serial port and logs give no indication of the problem.  I
> will rebuild with debug and try again.
> 
> Thanks,
> Sue Krysan
> Linux Systems Group
> Unisys Corporation
>  
> 
> -----Original Message-----
> From: Krysan, Susan
> Sent: Monday, July 09, 2007 10:55 AM
> To: 'Ian Pratt'; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] Test results on Unisys ES7000 32x 128gb
> usingxen-unstable (c/s 15470) - 2 old issues; 2 new issue
> 
> Yes, the problem was happening on large memory guests, PV and VT, and I
> only tried 64-bit guests with such a large size.
> 
> However, the server I was testing on recently had software installed in
> dom0 which monitors the server's uptime (heartbeat), and when the
> heartbeat is lost, the software halts the server.  Large memory guests
> take a long time to start and shut down, but during shut down, the host
> becomes unresponsive.  With the heartbeat monitoring enabled, this
> unresponsiveness was resulting in the software halting the server.  I
> disabled this feature and the host no longer halts.
> 
> Do you have any ideas as to where we can look to determine why the
> server seems to be starved for resources, and thus unable to respond to
> the heartbeat, when a large virtual machine is shutting down?  Keep in
> mind that we run with dom0_mem=512M because otherwise starting the large
> VMs takes much longer.
> 
> Thanks,
> Sue Krysan
> Linux Systems Group
> Unisys Corporation
>  
> -----Original Message-----
> From: Ian Pratt [mailto:Ian.Pratt@xxxxxxxxxxxx]
> Sent: Sunday, July 08, 2007 4:36 PM
> To: Krysan, Susan; xen-devel@xxxxxxxxxxxxxxxxxxx
> Cc: ian.pratt@xxxxxxxxxxxx
> Subject: RE: [Xen-devel] Test results on Unisys ES7000 32x 128gb
> usingxen-unstable (c/s 15470) - 2 old issues; 2 new issue
> 
>> Host:  Unisys ES7000/one, x86_64, 32 physical processors, 128 GB RAM
>> 
>> 2 OLD ISSUES:
>> 
>> Host halts upon shutdown of 126000mb domU and domVT
>> 
>> Testing includes running xm-test and also attempting to boot and run
>> programs in the following domUs and domVTs (running domains #s 3
>> through 9 simultaneously):
>>  
>> 1.       32-processor 64-bit SLES10 domU with 126gb (126000mb) memory
>> - run kernbench optimal load
>> 
>> 2.       32-processor 64-bit SLES10 domVT with 126gb (126000mb) memory
>> - run kernbench optimal load
>> 
>> 
>> Domains #1 and #2 - able to run kernbench in these domains, but the host
>> crashes when they are shut down; the serial port does not show a reason why.
>> 
>> Reducing the memory of these domains to 124000mb used to work, but have
>> tested 122000mb and 120000mb and host still halts. Still running tests
>> to determine the largest size domain that works.
> 
> Are you saying that booting a single very large 64b guest either VT or
> PV and then shutting it down will kill the host?
> 
> If you boot a debug build of xen, do you get any messages out the serial
> port?
> 
> Thanks,
> Ian
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel




 

