Re: [Xen-devel] xen 4.3 test report



migration with qemu-xen-traditional:
xen16:~ # xl migrate --debug 21-10887 ib-xen06.kh11.clodo.ru
the global config option vifscript is deprecated, please switch to
vif.default.script
the global config option vifscript is deprecated, please switch to
vif.default.script
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x0/0x0/631)
Loading new save file <incoming migration stream> (new xl fmt info 0x0/0x0/631)
 Savefile contains xl domain config
xc: progress: Reloading memory pages: 53248/1048576    5%
xc: progress: Reloading memory pages: 105472/1048576   10%
xc: progress: Reloading memory pages: 157658/1048576   15%
xc: progress: Reloading memory pages: 209882/1048576   20%
xc: progress: Reloading memory pages: 263130/1048576   25%
migration receiver stream contained unexpected data instead of ready message
(command run was: exec ssh ib-xen06.kh11.clodo.ru xl migrate-receive -d )
migration target: Transfer complete, requesting permission to start domain.
libxl: error: libxl_utils.c:393:libxl_read_exactly: file/stream
truncated reading GO message from migration stream
migration target: Failure, destroying our copy.
migration child [15697] not exiting, no longer waiting (exit status
will be unreported)
Migration failed, resuming at sender.
migration target: Cleanup OK, granting sender permission to resume.

xl dmesg:
(XEN) event_channel.c:297:d1 d1v0 [evtchn_bind_virq:297], port:3, rc:-17
(XEN) event_channel.c:298:d1 EVTCHNOP failure: error -17
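
For reference when reading the rc values in these messages: they are negated
errno codes. The sketch below maps the three codes returned explicitly in the
instrumented evtchn_bind_virq patch quoted further down to their symbolic
names. It assumes Xen uses the same numeric values as Linux for these codes
(EEXIST = 17, and so on); the helper name evtchn_rc_name and the table are
hypothetical, written only for illustration, and are not part of Xen.

```c
#include <stddef.h>

/* One entry per explicit return code in the quoted evtchn_bind_virq patch.
 * Assumption: the numeric values follow the Linux errno numbering. */
struct rc_name {
    long rc;
    const char *name;
};

static const struct rc_name evtchn_bind_virq_rcs[] = {
    { -2,  "ENOENT" },  /* vcpu out of range or not allocated */
    { -17, "EEXIST" },  /* v->virq_to_evtchn[virq] already non-zero */
    { -22, "EINVAL" },  /* bad virq, or global virq on vcpu != 0 */
};

/* Hypothetical helper: translate an rc value as printed in the hypervisor
 * log into its symbolic name, or "unknown" if it is not in the table. */
static const char *evtchn_rc_name(long rc)
{
    size_t i;
    size_t n = sizeof(evtchn_bind_virq_rcs) / sizeof(evtchn_bind_virq_rcs[0]);

    for (i = 0; i < n; i++) {
        if (evtchn_bind_virq_rcs[i].rc == rc)
            return evtchn_bind_virq_rcs[i].name;
    }
    return "unknown";
}
```

Under that assumption, evtchn_rc_name(-17) yields "EEXIST", which in the
quoted patch is the code returned when a port is already recorded for the
virq on that vcpu.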


xl console:
[  981.869689] PM: late freeze of devices complete after 0.073 msecs
[  981.873833] ------------[ cut here ]------------
[  981.873833] kernel BUG at
/build/buildd-linux_3.2.41-2+deb7u2-amd64-NHQI9B/linux-3.2.41/drivers/xen/events.c:1489!
[  981.873833] invalid opcode: 0000 [#1] SMP
[  981.873833] CPU 0
[  981.873833] Modules linked in: xenfs snd_pcm snd_page_alloc
snd_timer snd coretemp soundcore crc32c_intel evdev joydev pcspkr ext3
mbcache jbd xen_blkfront xen_netfront
[  981.873833]
[  981.873833] Pid: 6, comm: migration/0 Not tainted 3.2.0-4-amd64 #1
Debian 3.2.41-2+deb7u2
[  981.873833] RIP: e030:[<ffffffff8121c4e2>]  [<ffffffff8121c4e2>]
xen_irq_resume+0xbd/0x28b
[  981.873833] RSP: e02b:ffff88001ae99d20  EFLAGS: 00010082
[  981.873833] RAX: ffffffffffffffef RBX: 0000000000000000 RCX: 0000000000000001
[  981.873833] RDX: 0000000000000000 RSI: 00000000deadbeef RDI: 00000000deadbeef
[  981.873833] RBP: 0000000000000000 R08: ffff88001f026e00 R09: ffff88001ae99d48
[  981.873833] R10: 0000000000013780 R11: 0000000000013780 R12: 0000000000000010
[  981.873833] R13: 0000000000010dd0 R14: 0000000000010d70 R15: 0000000000000000
[  981.873833] FS:  00007f1fff8d37a0(0000) GS:ffff88001fc00000(0000)
knlGS:0000000000000000
[  981.873833] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  981.873833] CR2: 000000f8400b5410 CR3: 00000000033ad000 CR4: 0000000000002660
[  981.873833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  981.873833] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  981.873833] Process migration/0 (pid: 6, threadinfo
ffff88001ae98000, task ffff88001ae8e0c0)
[  981.873833] Stack:
[  981.873833]  0000000000013780 0000000000000000 ffff880000000000
0000000000010d70
[  981.873833]  0000160000000000 0000000000000000 ffff88001affbddc
ffffffff810050a2
[  981.873833]  0000000000013780 ffffea00005ab258 ffffffff810043e3
ffff88001affbe40
[  981.873833] Call Trace:
[  981.873833]  [<ffffffff810050a2>] ? xen_mc_issue+0x3e/0x50
[  981.873833]  [<ffffffff810043e3>] ? arch_local_irq_restore+0x7/0x8
[  981.873833]  [<ffffffff8121ca3b>] ? xen_suspend+0x73/0x8b
[  981.873833]  [<ffffffff81087d91>] ? stop_machine_cpu_stop+0x89/0xc3
[  981.873833]  [<ffffffff81087d08>] ? queue_stop_cpus_work+0xa5/0xa5
[  981.873833]  [<ffffffff81087b62>] ? cpu_stopper_thread+0xea/0x177
[  981.873833]  [<ffffffff810359d7>] ? arch_local_irq_enable+0x7/0x8
[  981.873833]  [<ffffffff81039854>] ? finish_task_switch+0x88/0xb9
[  981.873833]  [<ffffffff8134c694>] ? __schedule+0x5ac/0x5c3
[  981.873833]  [<ffffffff81087a78>] ? cpu_stop_signal_done+0x2a/0x2a
[  981.873833]  [<ffffffff8105f329>] ? kthread+0x76/0x7e
[  981.873833]  [<ffffffff81354b34>] ? kernel_thread_helper+0x4/0x10
[  981.873833]  [<ffffffff81352bf3>] ? int_ret_from_sys_call+0x7/0x1b
[  981.873833]  [<ffffffff8134dd3c>] ? retint_restore_args+0x5/0x6
[  981.873833]  [<ffffffff81354b30>] ? gs_change+0x13/0x13
[  981.873833] Code: 74 79 44 89 e7 e8 77 ee ff ff 39 e8 74 02 0f 0b
48 8d 74 24 28 bf 01 00 00 00 89 6c 24 28 89 5c 24 2c e8 19 ec ff ff
85 c0 74 02 <0f> 0b 8b 44 24 30 44 89 e7 89 44 24 14 e8 58 e9 ff ff 0f
b7 4c
[  981.873833] RIP  [<ffffffff8121c4e2>] xen_irq_resume+0xbd/0x28b
[  981.873833]  RSP <ffff88001ae99d20>
[  981.873833] ---[ end trace 8243bb8e343ac633 ]---
[  981.873833] ------------[ cut here ]------------
[  981.873833] WARNING: at
/build/buildd-linux_3.2.41-2+deb7u2-amd64-NHQI9B/linux-3.2.41/kernel/time/timekeeping.c:265
ktime_get+0x1e/0x86()
[  981.873833] Modules linked in: xenfs snd_pcm snd_page_alloc
snd_timer snd coretemp soundcore crc32c_intel evdev joydev pcspkr ext3
mbcache jbd xen_blkfront xen_netfront
[  981.873833] Pid: 0, comm: swapper/0 Tainted: G      D
3.2.0-4-amd64 #1 Debian 3.2.41-2+deb7u2
[  981.873833] Call Trace:
[  981.873833]  [<ffffffff81046a55>] ? warn_slowpath_common+0x78/0x8c
[  981.873833]  [<ffffffff8106644f>] ? ktime_get+0x1e/0x86
[  981.873833]  [<ffffffff8106c223>] ? tick_nohz_stop_sched_tick+0x61/0x327
[  981.873833]  [<ffffffff8100d210>] ? cpu_idle+0x72/0xf2
[  981.873833]  [<ffffffff816abb36>] ? start_kernel+0x3b8/0x3c3
[  981.873833]  [<ffffffff816ad4d9>] ? xen_start_kernel+0x412/0x418
[  981.873833] ---[ end trace 8243bb8e343ac634 ]---

2013/5/25 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>:
> On Sat, May 25, 2013 at 12:15:44AM +0400, Vasiliy Tolstov wrote:
>> 2013/5/24 George Dunlap <George.Dunlap@xxxxxxxxxxxxx>:
>> >
>> > Did you mean xm save or xl save?
>>
>>
>> In my case xl save crashes the domU with messages like the following.
>> The domU crashes with CentOS 2.6.18 and 2.6.32 (xenlinux) kernels, and
>> never with the 3.8.6 and 3.4 kernels...
>
> Is the 3.8.6 crashing at the same point?
>>
>> [ 1826.587110] PM: late freeze of devices complete after 0.048 msecs
>> [ 1826.591220] ------------[ cut here ]------------
>> [ 1826.591220] kernel BUG at
>> /build/buildd-linux_3.2.41-2-amd64-Wvc92F/linux-3.2.41/drivers/xen/events.c:1489!
>
> That looks to be this 
> (https://git.kernel.org/cgit/linux/kernel/git/bwh/linux-3.2.y.git/tree/drivers/xen/events.c)
>
>         if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
>                                                 &bind_virq) != 0)
>                         BUG();
>
> which is odd. Would you be able to instrument evtchn_bind_virq (this is
> in Xen) with some printks, like this (I haven't compile-tested it):
>
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index 2d7afc9..c109cee 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -270,24 +270,34 @@ static long evtchn_bind_virq(evtchn_bind_virq_t *bind)
>      int            port, virq = bind->virq, vcpu = bind->vcpu;
>      long           rc = 0;
>
> -    if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
> +    if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], virq:%d, rc:%ld\n", d->domain_id,
> +       vcpu, __func__,__LINE__, virq, -EINVAL);
>          return -EINVAL;
> -
> -    if ( virq_is_global(virq) && (vcpu != 0) )
> +    }
> +    if ( virq_is_global(virq) && (vcpu != 0) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], virq_is_global:%d, rc:%ld\n", d->domain_id,
> +       vcpu, __func__,__LINE__, virq_is_global(virq), -EINVAL);
>          return -EINVAL;
> -
> +    }
>      if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
> -         ((v = d->vcpu[vcpu]) == NULL) )
> +         ((v = d->vcpu[vcpu]) == NULL) ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], v:%p, max_vcpus:%d, rc:%ld\n", d->domain_id,
> +       vcpu, __func__,__LINE__, v, d->max_vcpus, -ENOENT);
>          return -ENOENT;
> -
> +    }
>      spin_lock(&d->event_lock);
>
> -    if ( v->virq_to_evtchn[virq] != 0 )
> +    if ( v->virq_to_evtchn[virq] != 0 ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], v:%p, evtchn:%d, rc:%ld\n", d->domain_id,
> +       vcpu, __func__,__LINE__, v->virq_to_evtchn[virq] , -EEXIST);
>          ERROR_EXIT(-EEXIST);
> -
> -    if ( (port = get_free_port(d)) < 0 )
> +    }
> +    if ( (port = get_free_port(d)) < 0 ) {
> +gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], port:%d, rc:%ld\n", d->domain_id,
> +       vcpu, __func__,__LINE__, port, port);
>          ERROR_EXIT(port);
> -
> +    }
>      chn = evtchn_from_port(d, port);
>      chn->state          = ECS_VIRQ;
>      chn->notify_vcpu_id = vcpu;



-- 
Vasiliy Tolstov,
e-mail: v.tolstov@xxxxxxxxx
jabber: vase@xxxxxxxxx

