[XenPPC] [RFC] 'xm restore' following boot
'xm restore' immediately following boot usually wedges the CPU. However,
'xm save' followed by 'xm restore' works fine (even when the guest domain
and htab are relocated to new memory areas).

^AAA shows:
with .plpar_hcall_norets @ c00000000003af78
and .HYPERVISOR_sched_op @ c00000000004415c

(XEN) *** Dumping CPU3 state: ***
(XEN) ----[ Xen-3.0-unstable ]----
(XEN) CPU: 00000003 DOMID: 00000001
(XEN) pc c00000000003af88 msr 8000000000009032
(XEN) lr c000000000044210 ctr c000000000044238
(XEN) srr0 ffffffffffffffff srr1 ffffffffffffffff
(XEN) r00: 0000000024555548 c00000000065bcb0 c000000000656630 0000000000000000
(XEN) r04: 0000000000000001 0000000000000000 0000000024555542 c00000000000fc24
(XEN) r08: 00000000ecf515a8 c000000000044238 0000000000989680 c0000000000441a4
(XEN) r12: 0000000001a9f9f8 c00000000052e300 5555555555555555 5555555555555555
(XEN) r16: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r20: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r24: 5555555555555555 5555555555555555 4000000000000000 c000000000000000
(XEN) r28: 0000000000000000 0000000000000010 c00000000053d3c8 0000000000000001
(XEN) reprogram_timer[00] Timeout in the past 0x0000004332DBA479 > 0x00000042C2424DF3

Here is typical console output with debug prints and exceptions.
If 'xm restore' is run several times, it will often start working, though
the exceptions still occur. (The user domain has a ramdisk and networking.)
At the bottom is the code at the addresses reported by a couple of the
exceptions.

1. 'xm restore' following xm save:

cso84:~ # xm console 6
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6200120000000042
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315899
__sti()
xencons_resume()
xenbus_resume()
smp_resume()
mfdec: 63024
returning
netfront: device eth0 has copying receive path.
[user@bringup /]#

2. reboot with 'xm restore' that worked 1st time:

cso84:~ # xm console 1
mfdec: -14
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315924
__sti()
xencons_resume()
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
smp_resume()
mfdec: 90178
returning
netfront: device eth0 has copying receive path.
[user@bringup /]#

3. reboot with typical wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315903
__sti()
xencons_resume()
xenbus_resume()
smp_resume()
mfdec: 14218880
returning
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
cso84:~ #

4. reboot with another wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315908
__sti()
xencons_resume()
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C000000001AA3650] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C000000001AA3700] [C00000000008956C] .softlockup_tick+0x100/0x128
[C000000001AA37C0] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C000000001AA3840] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C000000001AA3970] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_event_channel_op+0x34/0x50
[C000000001AA3C60] [C0000000000442E4] .HYPERVISOR_event_channel_op+0x1c/0x50 (unreliable)
[C000000001AA3CF0] [C0000000002BD1F0] .xb_read+0x190/0x2ac
[C000000001AA3E30] [C0000000002BEFD4] .xenbus_thread+0x84/0x278
[C000000001AA3EE0] [C000000000074D08] .kthread+0x158/0x1a8
[C000000001AA3F90] [C000000000028310] .kernel_thread+0x4c/0x68
cso84:~ #

Some code for example 3:

--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c : c000000000089d2c

0:mon> di c000000000089d20
c000000000089d20  7c0000a6  mfmsr   r0
c000000000089d24  60008000  ori     r0,r0,32768
c000000000089d28  7c010164  mtmsrd  r0,1
c000000000089d2c  7c7d07b4  extsw   r29,r3
c000000000089d30  48000010  b       c000000000089d40      # .handle_IRQ_event+0x60/0x13c
c000000000089d34  ebff0028  ld      r31,40(r31)
c000000000089d38  2fbf0000  cmpdi   cr7,r31,0
c000000000089d3c  419e005c  beq     cr7,c000000000089d98  # .handle_IRQ_event+0xb8/0x13c

--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c : c00000000003af88

0:mon> di c00000000003af78
c00000000003af78  7c421378  mr      r2,r2
c00000000003af7c  7c000026  mfcr    r0
c00000000003af80  90010008  stw     r0,8(r1)
c00000000003af84  44000022  svca    8
c00000000003af88  80010008  lwz     r0,8(r1)
c00000000003af8c  7c0ff120  mtcr    r0
c00000000003af90  4e800020  blr

_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel
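
For anyone cross-checking the numbers in the traces above, here is a minimal Python sketch. It is not part of the original report. The helper names (`symbol_address`, `ticks_to_seconds`) are made up for illustration. It assumes the decrementer counts at TIMEBASE_FREQ ticks per second, and that the two values in the reprogram_timer message are in the same time unit (probably nanoseconds, but the logs do not say).

```python
# Constants copied from the console output and CPU dump in this message.
TIMEBASE_FREQ = 71592390                    # Hz, from the "TIMEBASE_FREQ:" line
PLPAR_HCALL_NORETS = 0xC00000000003AF78     # .plpar_hcall_norets
HYPERVISOR_SCHED_OP = 0xC00000000004415C    # .HYPERVISOR_sched_op


def symbol_address(base: int, offset: int) -> int:
    """Resolve a 'symbol+offset' pair from a call trace to an address."""
    return base + offset


def ticks_to_seconds(ticks: int, freq: int = TIMEBASE_FREQ) -> float:
    """Convert decrementer ticks to seconds (assumes DEC runs at freq)."""
    return ticks / freq


# "Exception: 501 at .plpar_hcall_norets+0x10" should match pc in the dump.
pc = symbol_address(PLPAR_HCALL_NORETS, 0x10)
# "LR = .HYPERVISOR_sched_op+0xb4" should match lr in the dump.
lr = symbol_address(HYPERVISOR_SCHED_OP, 0xB4)

# Difference between the two reprogram_timer values; the unit is an
# assumption (likely nanoseconds), so it is printed as a raw count.
timer_delta = 0x0000004332DBA479 - 0x00000042C2424DF3

print(f"pc = {pc:016x}")
print(f"lr = {lr:016x}")
print(f"mfdec 14315899 ticks = {ticks_to_seconds(14315899):.4f} s")
print(f"reprogram_timer delta = {timer_delta} (unit assumed, see above)")
```

The computed pc and lr can be compared directly against the "pc" and "lr" fields of the CPU3 dump, and the tick conversion gives a rough idea of how long the decrementer was programmed for after switch_idle_mm().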