[Xen-devel] dom0 alignment check panic due to EFLAGS.AC being set
Hi, Mr Ian Campbell and other gurus,

We hit a Xen dom0 alignment-check panic in our tests while restarting some processes. Here is the call stack:

alignment check: 0000 [#1] SMP
last sysfs file: /sys/hypervisor/properties/capabilities
CPU 2
Modules linked in: xt_iprange xt_mac arptable_filter arp_tables xt_physdev 8021q garp xt_state iptable_filter ip_tables autofs4 ipmi_devintf ipmi_si ipmi_msghandler ebtable_filter ebtable_nat ebtable_broute bridge stp llc ebtables lockd sunrpc bonding ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack xenfs dm_multipath fuse xen_netback xen_blkback blktap blkback_pagemap loop nbd video output sbs sbshc parport_pc lp parport joydev ses enclosure snd_seq_dummy serio_raw bnx2 snd_seq_oss snd_seq_midi_event snd_seq dcdbas snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr iTCO_wdt iTCO_vendor_support shpchp raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 cciss
Pid: 8601, comm: connector Not tainted 2.6.32.36xen #1 PowerEdge R710
RIP: e030:[<ffffffffa02ce51a>]  [<ffffffffa02ce51a>] bond_3ad_get_active_agg_info+0x61/0x74 [bonding]
RSP: e02b:ffff88009222b800  EFLAGS: 00050202
RAX: 0000000000000001 RBX: ffff88009222b838 RCX: ffff880250875580
RDX: ffff88024dc76c50 RSI: ffff88009222b838 RDI: ffff88024dc77200
RBP: ffff88009222b808 R08: ffff880246a72f50 R09: ffffffff816fb2a0
R10: ffff8800af2c10e8 R11: ffffffff813cca10 R12: ffff880250875000
R13: ffff8800af2c10e8 R14: ffff880250875580 R15: ffff88024dc1ae80
FS:  00007fd130d61740(0000) GS:ffff880028072000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff1cb42c40 CR3: 00000001f8a5f000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process td_connector (pid: 8601, threadinfo ffff88009222a000, task ffff88008adcc470)
Stack:
 0000000000000002 ffff88009222b878 ffffffffa02cf3db ffff8800af2c10e8
<0> ffff8802508755ac 4f52505f00704550 0000000200000003 0001001100010002
<0> 0000001472655356 0000000000000000 0000000000000002 ffff880250875580
Call Trace:
 [<ffffffffa02cf3db>] bond_3ad_xmit_xor+0x70/0x17f [bonding]
 [<ffffffffa02ccd1d>] bond_start_xmit+0x391/0x3ea [bonding]
 [<ffffffffa0241422>] ? ipv4_confirm+0x179/0x195 [nf_conntrack_ipv4]
 [<ffffffff813a3657>] dev_hard_start_xmit+0x1b9/0x27e
 [<ffffffff813a644a>] dev_queue_xmit+0x267/0x30e
 [<ffffffff813ce523>] ip_finish_output2+0x1a9/0x1ed
 [<ffffffff813ce5c9>] ip_finish_output+0x62/0x67
 [<ffffffff813ce67c>] ip_output+0xae/0xb5
 [<ffffffff813cca20>] dst_output+0x10/0x12
 [<ffffffff813ce0d9>] ip_local_out+0x23/0x28
 [<ffffffff813cf0fa>] ip_queue_xmit+0x2ce/0x32a
 [<ffffffff810acb19>] ? call_rcu_sched+0x15/0x17
 [<ffffffff810acb29>] ? call_rcu+0xe/0x10
 [<ffffffff8121e3c6>] ? radix_tree_node_free+0x14/0x16
 [<ffffffff813dfd6f>] tcp_transmit_skb+0x62d/0x66d
 [<ffffffff8100f175>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8d2>] ? check_events+0x12/0x20
 [<ffffffff81120369>] ? __d_free+0x50/0x55
 [<ffffffff813e118c>] tcp_write_xmit+0x6d8/0x7be
 [<ffffffff813e12d7>] __tcp_push_pending_frames+0x2f/0x62
 [<ffffffff813e12d7>] __tcp_push_pending_frames+0x2f/0x62
 [<ffffffff813e19e3>] tcp_send_fin+0x102/0x10a
 [<ffffffff813d59e2>] tcp_close+0x138/0x388
 [<ffffffff813f1e0e>] inet_release+0x5d/0x64
 [<ffffffff8139361f>] sock_release+0x1f/0x71
 [<ffffffff81393af2>] sock_close+0x27/0x2b
 [<ffffffff8110f063>] __fput+0x112/0x1b6
 [<ffffffff8110f520>] fput+0x1a/0x1c
 [<ffffffff8110a5a9>] filp_close+0x6c/0x77
 [<ffffffff81058c8b>] put_files_struct+0x7c/0xd0
 [<ffffffff81058d18>] exit_files+0x39/0x3e
 [<ffffffff8105a059>] do_exit+0x247/0x677
 [<ffffffff810673d8>] ? freezing+0x13/0x15
 [<ffffffff8105a528>] sys_exit_group+0x0/0x1b
 [<ffffffff8106a843>] get_signal_to_deliver+0x300/0x324
 [<ffffffff810121da>] do_notify_resume+0x90/0x6d6
 [<ffffffff8100c412>] ? xen_mc_flush+0x173/0x195
 [<ffffffff8102f82d>] ? paravirt_end_context_switch+0x17/0x31
 [<ffffffff8100b459>] ? xen_end_context_switch+0x1e/0x22
 [<ffffffff81049a5b>] ? finish_task_switch+0x51/0xa9
 [<ffffffff8101303e>] int_signal+0x12/0x17
Code: fc ff ff 48 85 c0 75 e3 83 c8 ff eb 2e 66 8b 42 06 66 89 03 66 8b 42 32 66 89 43 02 8b 42 0c 66 89 43 04 66 8b 42 16 66 89 43 06 <8b> 42 0e 89 43 08 66 8b 42 12 66 89 43 0c 31 c0 5b c9 c3 55 48
RIP  [<ffffffffa02ce51a>] bond_3ad_get_active_agg_info+0x61/0x74 [bonding]
 RSP <ffff88009222b800>
---[ end trace d269ed1e3064b31a ]---
Kernel panic - not syncing: Fatal exception in interrupt

We suspect this is caused by the EFLAGS.AC bit being set to 1, which makes the CPU perform alignment checking: the EFLAGS value in the dump above (00050202) does have bit 18 (AC, 0x40000) set, and the dumped CR0 (000000008005003b) has AM (bit 18) set as well. Since plenty of unaligned memory accesses exist in the kernel, dom0 can panic almost anywhere once AC is set. But we have no idea who set the AC flag in the first place (a small test program illustrating the effect is at the end of this mail).

We found some mails that may be related to this problem:
http://lists.xen.org/archives/html/xen-devel/2013-01/msg02285.html
http://old-list-archives.xen.org/archives/html/xen-devel/2011-11/msg00827.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=660425
but all of these reports are about a domU panic (probably PV domU), while ours is a dom0 panic.

The Xen version is 4.0.1 and the dom0 kernel comes from Jeremy's git tree:
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ae333e97552c81ab10395ad1ffc6d6daaadb144a
It is the xen-2.6.32.36 version of Jeremy's dom0 tree, so I guess it may be too old for this to be related to the CPU SMAP feature.
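For reference, here is a minimal user-space sketch (our own illustration, not code from the crashing kernel) of the behaviour we suspect: Linux keeps CR0.AM set, so once EFLAGS.AC is turned on, every unaligned access executed at CPL 3 raises #AC. If I understand correctly, a 64-bit PV dom0 kernel also runs at CPL 3, which would explain why kernel code such as bond_3ad_get_active_agg_info() is hit by the same check.

/*
 * Sketch: set EFLAGS.AC, then perform a deliberately unaligned 4-byte load.
 * On a stock x86-64 Linux box (CR0.AM is set by default) the load raises
 * #AC, which the kernel delivers to user space as SIGBUS ("Bus error").
 */
#include <stdio.h>
#include <stdint.h>

#define X86_EFLAGS_AC (1UL << 18)   /* alignment-check flag, bit 18 */

int main(void)
{
    unsigned long flags;
    static char buf[16] __attribute__((aligned(8)));
    volatile uint32_t *unaligned = (uint32_t *)(buf + 1);   /* odd address */

    /* Read RFLAGS, set the AC bit, write it back. */
    __asm__ volatile("pushfq\n\tpopq %0" : "=r"(flags));
    flags |= X86_EFLAGS_AC;
    __asm__ volatile("pushq %0\n\tpopfq" : : "r"(flags) : "cc");

    /* With AC (and CR0.AM) set, this unaligned load traps instead of completing. */
    printf("read: %u\n", (unsigned)*unaligned);
    return 0;
}

Run on a normal x86-64 box this should die with "Bus error" right after the AC bit is set; in our dom0 case nobody should be setting AC on purpose, which is exactly what we cannot explain.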
Any help is appreciated, thanks.

Best regards,
jerry

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel