
[Xen-devel] dom0 alignment check panic due to EFLAGS.AC being set


  • To: xen-devel@xxxxxxxxxxxxx
  • From: Ma JieYue <majieyue@xxxxxxxxx>
  • Date: Sat, 1 Jun 2013 17:27:27 +0800
  • Delivery-date: Sat, 01 Jun 2013 09:30:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi, Mr Ian Campbell and other gurus,


We hit a Xen dom0 alignment-check panic in our tests while restarting
some processes; here is the call stack:


alignment check: 0000 [#1] SMP
last sysfs file: /sys/hypervisor/properties/capabilities
CPU 2
Modules linked in: xt_iprange xt_mac arptable_filter arp_tables
xt_physdev 8021q garp xt_state iptable_filter ip_tables autofs4
ipmi_devintf ipmi_si ipmi_msghandler ebtable_filter ebtable_nat
ebtable_broute bridge stp llc ebtables lockd sunrpc bonding ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack xenfs dm_multipath fuse
xen_netback xen_blkback blktap blkback_pagemap loop nbd video output
sbs sbshc parport_pc lp parport joydev ses enclosure snd_seq_dummy
serio_raw bnx2 snd_seq_oss snd_seq_midi_event snd_seq dcdbas
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd
soundcore snd_page_alloc pcspkr iTCO_wdt iTCO_vendor_support shpchp
raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy
async_tx raid10 raid1 raid0 cciss
Pid: 8601, comm: connector Not tainted 2.6.32.36xen #1 PowerEdge R710
RIP: e030:[<ffffffffa02ce51a>] [<ffffffffa02ce51a>]
bond_3ad_get_active_agg_info+0x61/0x74 [bonding]
RSP: e02b:ffff88009222b800 EFLAGS: 00050202
RAX: 0000000000000001 RBX: ffff88009222b838 RCX: ffff880250875580
RDX: ffff88024dc76c50 RSI: ffff88009222b838 RDI: ffff88024dc77200
RBP: ffff88009222b808 R08: ffff880246a72f50 R09: ffffffff816fb2a0
R10: ffff8800af2c10e8 R11: ffffffff813cca10 R12: ffff880250875000
R13: ffff8800af2c10e8 R14: ffff880250875580 R15: ffff88024dc1ae80
FS: 00007fd130d61740(0000) GS:ffff880028072000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff1cb42c40 CR3: 00000001f8a5f000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process td_connector (pid: 8601, threadinfo ffff88009222a000, task
ffff88008adcc470)
Stack:
0000000000000002 ffff88009222b878 ffffffffa02cf3db ffff8800af2c10e8
<0> ffff8802508755ac 4f52505f00704550 0000000200000003 0001001100010002
<0> 0000001472655356 0000000000000000 0000000000000002 ffff880250875580
Call Trace:
[<ffffffffa02cf3db>] bond_3ad_xmit_xor+0x70/0x17f [bonding]
[<ffffffffa02ccd1d>] bond_start_xmit+0x391/0x3ea [bonding]
[<ffffffffa0241422>] ? ipv4_confirm+0x179/0x195 [nf_conntrack_ipv4]
[<ffffffff813a3657>] dev_hard_start_xmit+0x1b9/0x27e
[<ffffffff813a644a>] dev_queue_xmit+0x267/0x30e
[<ffffffff813ce523>] ip_finish_output2+0x1a9/0x1ed
[<ffffffff813ce5c9>] ip_finish_output+0x62/0x67
[<ffffffff813ce67c>] ip_output+0xae/0xb5
[<ffffffff813cca20>] dst_output+0x10/0x12
[<ffffffff813ce0d9>] ip_local_out+0x23/0x28
[<ffffffff813cf0fa>] ip_queue_xmit+0x2ce/0x32a
[<ffffffff810acb19>] ? call_rcu_sched+0x15/0x17
[<ffffffff810acb29>] ? call_rcu+0xe/0x10
[<ffffffff8121e3c6>] ? radix_tree_node_free+0x14/0x16
[<ffffffff813dfd6f>] tcp_transmit_skb+0x62d/0x66d
[<ffffffff8100f175>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8100f8d2>] ? check_events+0x12/0x20
[<ffffffff81120369>] ? __d_free+0x50/0x55
[<ffffffff813e118c>] tcp_write_xmit+0x6d8/0x7be
[<ffffffff813e12d7>] __tcp_push_pending_frames+0x2f/0x62
[<ffffffff813e12d7>] __tcp_push_pending_frames+0x2f/0x62
[<ffffffff813e19e3>] tcp_send_fin+0x102/0x10a
[<ffffffff813d59e2>] tcp_close+0x138/0x388
[<ffffffff813f1e0e>] inet_release+0x5d/0x64
[<ffffffff8139361f>] sock_release+0x1f/0x71
[<ffffffff81393af2>] sock_close+0x27/0x2b
[<ffffffff8110f063>] __fput+0x112/0x1b6
[<ffffffff8110f520>] fput+0x1a/0x1c
[<ffffffff8110a5a9>] filp_close+0x6c/0x77
[<ffffffff81058c8b>] put_files_struct+0x7c/0xd0
[<ffffffff81058d18>] exit_files+0x39/0x3e
[<ffffffff8105a059>] do_exit+0x247/0x677
[<ffffffff810673d8>] ? freezing+0x13/0x15
[<ffffffff8105a528>] sys_exit_group+0x0/0x1b
[<ffffffff8106a843>] get_signal_to_deliver+0x300/0x324
[<ffffffff810121da>] do_notify_resume+0x90/0x6d6
[<ffffffff8100c412>] ? xen_mc_flush+0x173/0x195
[<ffffffff8102f82d>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff8100b459>] ? xen_end_context_switch+0x1e/0x22
[<ffffffff81049a5b>] ? finish_task_switch+0x51/0xa9
[<ffffffff8101303e>] int_signal+0x12/0x17
Code: fc ff ff 48 85 c0 75 e3 83 c8 ff eb 2e 66 8b 42 06 66 89 03 66
8b 42 32 66 89 43 02 8b 42 0c 66 89 43 04 66 8b 42 16 66 89 43 06 <8b>
42 0e 89 43 08 66 8b 42 12 66 89 43 0c 31 c0 5b c9 c3 55 48
RIP [<ffffffffa02ce51a>] bond_3ad_get_active_agg_info+0x61/0x74 [bonding]
RSP <ffff88009222b800>
---[ end trace d269ed1e3064b31a ]---
Kernel panic - not syncing: Fatal exception in interrupt


We guess it is caused by the EFLAGS.AC bit being set to 1, which makes
the CPU perform alignment checking; the EFLAGS value in the dump
(00050202) does indeed have bit 18 (0x40000) set. Since the kernel
performs plenty of unaligned memory accesses, dom0 could panic almost
anywhere once this bit is set. But we have no idea what set the AC
flag in the first place.
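
For reference, below is a minimal userspace sketch (our own
illustration, not taken from any of the threads below) of how a stray
EFLAGS.AC bit turns an otherwise harmless unaligned load into a fault.
Since a 64-bit PV kernel runs at CPL 3 (note CS: e033 in the dump),
the same mechanism applies to dom0 kernel code such as the bonding
driver above. The buffer and offset here are purely illustrative.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    char buf[16] __attribute__((aligned(8))) = { 0 };
    volatile uint32_t *p = (volatile uint32_t *)(buf + 1); /* misaligned */
    uint32_t v;

    /* Set EFLAGS.AC (bit 18). With CR0.AM enabled (the Linux default),
     * misaligned data accesses at CPL 3 then raise #AC, which the
     * kernel delivers to userspace as SIGBUS. */
    __asm__ volatile("pushf\n\t"
                     "orl $0x40000, (%%rsp)\n\t"
                     "popf" ::: "memory", "cc");

    v = *p;   /* faults here only because AC is now set */
    printf("unaligned read survived: %u\n", v);
    return 0;
}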


We found some earlier mails that may be related to this problem:

http://lists.xen.org/archives/html/xen-devel/2013-01/msg02285.html
http://old-list-archives.xen.org/archives/html/xen-devel/2011-11/msg00827.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=660425

but all of those posts reported a domU panic (presumably a PV domU),
while ours is in dom0.


The Xen version is 4.0.1 and the dom0 kernel comes from Jeremy's git tree:

http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ae333e97552c81ab10395ad1ffc6d6daaadb144a

It is the xen-2.6.32.36 version of Jeremy's dom0 tree, so I suspect it
is too old for the CPU SMAP feature to be involved.
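
As a sanity check (again our own suggestion, not from the posts
above), one can verify whether the host CPU even advertises SMAP,
either with "grep smap /proc/cpuinfo" or by reading
CPUID.(EAX=7,ECX=0):EBX bit 20 directly. A minimal sketch, assuming
GCC's <cpuid.h> is available:

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* Make sure CPUID leaf 7 exists at all before querying it. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx) || eax < 7) {
        puts("CPUID leaf 7 not supported");
        return 1;
    }

    /* CPUID.(EAX=7,ECX=0):EBX bit 20 advertises SMAP. */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    printf("SMAP %s advertised by this CPU\n",
           (ebx & (1u << 20)) ? "is" : "is not");
    return 0;
}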



Any help is appreciated, thanks.


Best regards,

jerry

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

