Re: [Xen-devel] resume from S3 sleep not working in Dom0 - Xen4.2.1
On Mon, Feb 04, 2013 at 11:07:23AM +0100, Tomasz Wroblewski wrote:
>
> >>fix-suspend-scheduler-v2
> >>fix-suspend-scheduler-revert-affinity-part
> >>s3-timerirq
> >>
> >>All of these fixes have been proposed to the xen-devel list, but have
> >>not yet been accepted, for one reason, or another.
> >And I don't think comments on them have seen follow-ups.
> >
> >Jan
> >
> I guess it's worth bringing this up again;
>
> s3-timerirq: this was an empirical hack which for some reason is needed
> on stable 4.2 we use, but not on latest unstable, didn't really
> investigate further since it appeared fixed later on anyway..
>
> fix-suspend-scheduler/revert-affinity: the big objection here was
> the part which reverts one of the hunks in Keir's commit. I tried
> for quite a few days to find a working fix which does not do this
> revert using posted suggestions, but was not successful:
>
> - there was a crash in xen scheduler, which was fixable using your
> suggestion of masking softirqs during s3 (ugly)
> - there was also a crash in xen acpi cpufreq driver, which was
> similarly fixable using a bandaid s3 condition (ugly)
> - unfortunately this turned out to not be all, xen did not crash
> anymore at this point but dom0 kernel did around the time it enables
> cpus, in multiple places: at this point I didn't have a good
> explanation for it, my opinion of aggravating hunk was rather low,
> so I uttered a hearty curse and stuck a revert into private
> patchqueue.
>
> The dom0 kernel crashes were as follows:
>
> 1)
>
> [ 60.657751] Enabling non-boot CPUs ...
> [ 60.657958] installing Xen timer for CPU 1
> [ 60.657987] cpu 1 spinlock event irq 279
> [ 60.658101] Disabled fast string operations
> [ 60.658466] CPU1 is up
> [ 60.658736] installing Xen timer for CPU 2
> [ 60.658784] cpu 2 spinlock event irq 285
> [ 60.659764] Disabled fast string operations
> [ 60.661811] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000018
> [ 60.661817] IP: [<ffffffff8105f700>]
> build_sched_domains+0x770/0(XEN) *** Serial input -> Xen (type
> 'CTRL-a' three times to switch input to DOM0)
>
>
>
>
> 2)
> .332997] installing Xen timer for CPU 2emory
> [ 36.333061] cpu 2 spinlock event irq 285
> [ 36.333343] Disabled fast string operations
> [ 36.334939] CPU2 is up
> [ 36.335213] installing Xen timer for CPU 3
> [ 36.335244] cpu 3 spinlock event irq 291
> [ 36.335561] Disabled fast string operations
> [ 36.337461] CPU3 is up
> [ 36.339513] ACPI: Waking up from system sleep state S3
> [ 36.350193] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000004
> [ 36.350211] IP: [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
> [ 36.350236] PGD 2f19067 PUD 2ec7067 PMD 0
> [ 36.350252] Oops: 0000 [#1] SMP
> [ 36.350263] CPU 1
> [ 36.350267] Modules linked in: xt_mac ipt_MASQUERADE
> ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O)
> xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat
> nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4
> ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C)
> snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse
> serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O)
> thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video
> intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
> [ 36.350437]
> [ 36.350445] Pid: 2730, comm: bash Tainted: G C O
> 3.2.23-orc #19 LENOVO 42404EU/42404EU
> [ 36.350463] RIP: e030:[<ffffffff81055f9a>] [<ffffffff81055f9a>]
> find_busiest_group+0x38a/0xbb0
> [ 36.350481] RSP: e02b:ffff880002b71228 EFLAGS: 00010046
> [ 36.350490] RAX: 0000000000000040 RBX: 0000000000000000 RCX:
> 0000000000000000
> [ 36.350500] RDX: 0000000000000000 RSI: 0000000000000040 RDI:
> 0000000000000000
> [ 36.350510] RBP: ffff880002b713b8 R08: ffff880026109f00 R09:
> 0000000000000000
> [ 36.350519] R10: 0000000000000000 R11: 0000000000000001 R12:
> 0000000000000000
> [ 36.350529] R13: ffff880026109f80 R14: ffffffffffffffff R15:
> ffff880026109f98
> [ 36.350547] FS: 00007fc41e295700(0000) GS:ffff88002dc40000(0000)
> knlGS:0000000000000000
> [ 36.350558] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 36.350566] CR2: 0000000000000004 CR3: 0000000026329000 CR4:
> 0000000000002660
> [ 36.350577] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 36.350587] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 36.350598] Process bash (pid: 2730, threadinfo ffff880002b70000,
> task ffff880027a7db40)
> [ 36.350608] Stack:
> [ 36.350613] 00ffffff00000002 0000000300000001 ffff880002b71498
> ffff880002b71534
> [ 36.350630] 00ffffff00000002 0000000100000001 ffff8800262cf000
> 0000000000000008
> [ 36.350646] ffffffff00000000 0000000000000000 0000000000000000
> ffff88002dc4e2c8
> [ 36.350662] Call Trace:
> [ 36.350677] [<ffffffff8105b158>] load_balance+0xb8/0x840
> [ 36.350690] [<ffffffff8101b909>] ? sched_clock+0x9/0x10
> [ 36.350706] [<ffffffff8108ccad>] ? sched_clock_cpu+0xbd/0x110
> [ 36.350718] [<ffffffff81052b1c>] ? update_shares+0xcc/0x100
> [ 36.350735] [<ffffffff8157b9b5>] __schedule+0x875/0x8d0
> [ 36.350749] [<ffffffff81073ae2>] ? try_to_del_timer_sync+0x92/0x130
> [ 36.350762] [<ffffffff8157bd3f>] schedule+0x3f/0x60
> [ 36.350773] [<ffffffff8157c24d>] schedule_timeout+0x16d/0x320
> [ 36.350786] [<ffffffff810728e0>] ? usleep_range+0x50/0x50
> [ 36.350800] [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
> [ 36.350817] [<ffffffff8130c340>]
> acpi_ec_transaction_unlocked+0x134/0x1d8
> [ 36.350830] [<ffffffff81086b90>] ? add_wait_queue+0x60/0x60
> [ 36.350842] [<ffffffff8130c6c6>] acpi_ec_transaction+0x196/0x239
> [ 36.350856] [<ffffffff8157de2e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
> [ 36.350869] [<ffffffff8130c8a0>] acpi_ec_write+0x40/0x42
> [ 36.350881] [<ffffffff8130c9a8>] acpi_ec_space_handler+0x9e/0xfc
> [ 36.350894] [<ffffffff8130c90a>] ? acpi_ec_burst_disable+0x3d/0x3d
> [ 36.350909] [<ffffffff813159c6>]
> acpi_ev_address_space_dispatch+0x179/0x1c8
> [ 36.350924] [<ffffffff8131aafe>] acpi_ex_access_region+0x23e/0x24b
> [ 36.350936] [<ffffffff8106e82c>] ? __sysctl_head_next+0x11c/0x130
> [ 36.350951] [<ffffffff8131ae15>] acpi_ex_field_datum_io+0xf9/0x17a
> [ 36.350965] [<ffffffff8131b148>]
> acpi_ex_write_with_update_rule+0xb5/0xc1
> [ 36.350989] [<ffffffff8131acfa>] acpi_ex_insert_into_field+0x1ef/0x211
> [ 36.351003] [<ffffffff8132b5a7>] ?
> acpi_ut_allocate_object_desc_dbg+0x45/0x7f
> [ 36.351018] [<ffffffff8131980e>] acpi_ex_write_data_to_field+0x194/0x1c2
> [ 36.351031] [<ffffffff813131e4>] ?
> acpi_ds_init_object_from_op+0x137/0x231
> [ 36.351044] [<ffffffff8131d94f>] acpi_ex_store_object_to_node+0xa3/0xe2
> [ 36.351056] [<ffffffff8131da51>] acpi_ex_store+0xc3/0x256
> [ 36.351066] [<ffffffff8131b62b>] acpi_ex_opcode_1A_1T_1R+0x353/0x4a5
> [ 36.351078] [<ffffffff8131260c>] acpi_ds_exec_end_op+0xf7/0x3e7
> [ 36.351092] [<ffffffff81325ae7>] acpi_ps_parse_loop+0x7bd/0x94e
> [ 36.351105] [<ffffffff81324ed9>] acpi_ps_parse_aml+0x96/0x275
> [ 36.351119] [<ffffffff81326394>] acpi_ps_execute_method+0x1ce/0x276
> [ 36.351131] [<ffffffff8132165b>] acpi_ns_evaluate+0xdf/0x1aa
> [ 36.351144] [<ffffffff81320c9d>] acpi_evaluate_object+0xfb/0x1f4
> [ 36.351156] [<ffffffff8130f8ee>] acpi_device_sleep_wake+0x95/0xc7
> [ 36.351168] [<ffffffff8130fa60>]
> acpi_disable_wakeup_device_power+0x6e/0xc9
> [ 36.351182] [<ffffffff813085e2>] acpi_disable_wakeup_devices+0x7b/0x95
> [ 36.351194] [<ffffffff81308710>] acpi_pm_finish+0x39/0x55
> [ 36.351208] [<ffffffff810a6034>] suspend_devices_and_enter+0x104/0x310
> [ 36.351222] [<ffffffff810a63a7>] enter_state+0x167/0x190
> [ 36.351234] [<ffffffff810a4d27>] state_store+0xb7/0x130
> [ 36.351246] [<ffffffff812b54df>] kobj_attr_store+0xf/0x30
> [ 36.351260] [<ffffffff811d382f>] sysfs_write_file+0xef/0x170
> [ 36.351274] [<ffffffff811668d3>] vfs_write+0xb3/0x180
> [ 36.351286] [<ffffffff81166bfa>] sys_write+0x4a/0x90
> [ 36.351300] [<ffffffff81585d02>] system_call_fastpath+0x16/0x1b
> [ 36.351308] Code: ff 48 8b bd a0 fe ff ff 44 88 85 78 fe ff ff e8
> 5d fb ff ff 44 0f b6 85 78 fe ff ff 0f 1f 44 00 00 49 8b 7d 10 4c 8b
> 4d 98 31 d2 <8b> 4f 04 4c 89 c8 48 c1 e0 0a 48 f7 f1 48 8b 4d a0 48
> 85 c9 48
> [ 36.351435] RIP [<ffffffff81055f9a>] find_busiest_group+0x38a/0xbb0
> [ 36.351450] RSP <ffff880002b71228>
> [ 36.351456] CR2: 0000000000000004
> [ 36.351465] ---[ end trace 5ad2b14b3a9050ae ]---
> [ 36.352362] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000010
> [ 36.352379] IP: [<ffffffff812ba531>] rb_next+0x1/0x50
> [ 36.352394] PGD 0
> [ 36.352402] Oops: 0000 [#2] SMP
> [ 36.352411] CPU 1
> [ 36.352416] Modules linked in: xt_mac ipt_MASQUERADE
> ebtable_filter ebtables iscsi_scst(O) xt_tcpudp scst_vdisk(O)
> xt_state crc32c xt_multiport libcrc32c iptable_filter iptable_nat
> nf_nat nf_conntrack_ipv4 nf_conntrack scst_cdrom(O) nf_defrag_ipv4
> ip_tables scst(O) x_tables bridge stp llc nls_cp437 isofs zram(C)
> snd_hda_codec_hdmi snd_hda_codec_conexant microcode arc4 psmouse
> serio_raw i915 drm_kms_helper drm iwlwifi(O) mac80211(O) cfg80211(O)
> thinkpad_acpi nvram snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd soundcore snd_page_alloc i2c_algo_bit intel_agp video
> intel_gtt tpm_tis tpm tpm_bios sdhci_pci sdhci ehci_hcd e1000e
> [ 36.352573]
> [ 36.352580] Pid: 2730, comm: bash Tainted: G D C O
> 3.2.23-orc #19 LENOVO 42404EU/42404EU
> [ 36.352596] RIP: e030:[<ffffffff812ba531>] [<fffffff
>
>
>
>
> 3)
>
> [ 47.833362] Resuming Xen processor info
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> (XEN) microcode: collect_cpu_info : sig=0x206a6, pf=0x10, rev=0x28
> [ 47.886297] Enabling non-boot CPUs ...
> [ 47.890082] installing Xen timer for CPU 1
> [ 47.894257] cpu 1 spinlock event irq 48
> [ 47.899013] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
> [ 47.906740] IP: [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100
> [ 47.913578] PGD 34a4067 PUD 3ac3067 PMD 0
> [ 47.917825] Oops: 0000 [#1] SMP
> [ 47.921108] Modules linked in: ipt_MASQUERADE ebtable_filter ebtables
> iscsi_scst(O) xt_tcpudp xt_state xt_multiport iptable_filter scst_vdisk(O)
> iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
> scst_cdrom(O) ip_tables scst(O) x_tables nls_cp437 isofs bridge stp llc
> zram(C) zsmalloc(C) hid_generic usbhid hid coretemp crc32c_intel
> ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw aes_x86_64 xts
> gf128mul microcode psmouse serio_raw arc4 iwldvm mac80211 i915 drm_kms_helper
> drm iwlwifi intel_agp i2c_algo_bit cfg80211 intel_gtt video ahci libahci
> e1000e [last unloaded: tpm_bios]
> [ 47.974636] CPU 0
> [ 47.976456] Pid: 2468, comm: pm-suspend Tainted: G C O 3.8.0-orc
> #19 Intel Corporation SandyBridge Platform/Emerald Lake
> [ 47.988310] RIP: e030:[<ffffffff8149196b>] [<ffffffff8149196b>]
> __cpuidle_register_device+0x2b/0x100
> [ 47.997605] RSP: e02b:ffff880025685c98 EFLAGS: 00010286
> [ 48.002970] RAX: 0000000000000000 RBX: ffff88002de40000 RCX:
> 0000000000000000
> [ 48.010154] RDX: ffff880025685fd8 RSI: 0000000000000007 RDI:
> ffff88002de40000
> [ 48.017336] RBP: ffff880025685cb8 R08: 0000000000021120 R09:
> 0000000000000000
> [ 48.024520] R10: 0000000000000030 R11: 0000000000000000 R12:
> ffff88002de40000
> [ 48.031742] R13: 00000000ffffffde R14: 00000000ffffffea R15:
> 0000000000000000
> [ 48.038927] FS: 00007fb599d0e700(0000) GS:ffff88002de00000(0000)
> knlGS:0000000000000000
> [ 48.047060] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 48.052859] CR2: 0000000000000008 CR3: 000000000345b000 CR4:
> 0000000000002660
> [ 48.060043] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 48.067223] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 48.074450] Process pm-suspend (pid: 2468, threadinfo ffff880025684000,
> task ffff880003558000)
> [ 48.083102] Stack:
> [ 48.085179] ffff88002de40000 ffff88002de40000 00000000ffffffde
> ffffffff81a6b480
> [ 48.092622] ffff880025685cd8 ffffffff81491cc1 0000000000000001
> ffff88002de40000
> [ 48.100064] ffff880025685cf8 ffffffff813046df 0000000000000001
> 0000000000000001
> [ 48.107517] Call Trace:
> [ 48.110029] [<ffffffff81491cc1>] cpuidle_register_device+0x31/0x80
> [ 48.116348] [<ffffffff813046df>] intel_idle_cpu_init+0xbf/0x120
> [ 48.122423] [<ffffffff813047b0>] cpu_hotplug_notify+0x70/0x80
> [ 48.128310] [<ffffffff815a619d>] notifier_call_chain+0x4d/0x70
> [ 48.134281] [<ffffffff8107969e>] __raw_notifier_call_chain+0xe/0x10
> [ 48.140686] [<ffffffff81053bb0>] __cpu_notify+0x20/0x40
> [ 48.146050] [<ffffffff81594c7c>] _cpu_up+0xf1/0x138
> [ 48.151070] [<ffffffff8158ab39>] enable_nonboot_cpus+0x99/0xd0
> [ 48.157090] [<ffffffff81097b8d>] suspend_devices_and_enter+0x25d/0x330
> [ 48.163752] [<ffffffff81097def>] pm_suspend+0x18f/0x1f0
> [ 48.169117] [<ffffffff81096dea>] state_store+0x8a/0x100
> [ 48.174483] [<ffffffff812ac29f>] kobj_attr_store+0xf/0x30
> [ 48.180022] [<ffffffff811c005f>] sysfs_write_file+0xef/0x170
> [ 48.185943] [<ffffffff8115c253>] vfs_write+0xb3/0x180
> [ 48.191056] [<ffffffff8115c592>] sys_write+0x52/0xa0
> [ 48.196160] [<ffffffff815a614e>] ? do_page_fault+0xe/0x10
> [ 48.201700] [<ffffffff815aa7d9>] system_call_fastpath+0x16/0x1b
> [ 48.207758] Code: 66 66 66 66 90 55 48 89 e5 48 83 ec 20 48 89 5d e0 4c 89
> 6d f0 48 89 fb 4c 89 75 f8 4c 89 65 e8 41 be ea ff ff ff e8 75 0a 00 00<48>
> 8b 78 08 49 89 c5 e8 19 80 c1 ff 84 c0 74 53 8b 43 04 49 c7
> [ 48.226658] RIP [<ffffffff8149196b>] __cpuidle_register_device+0x2b/0x100

Hm, that is suspect. There should not be any cpuidle_register? Perhaps you
are .. ah yes, you are hitting a bug whose fix should be in the stable
tree. Here are the git commits: b88a634a903d9670aa5f2f785aa890628ce0dece
and 6f8c2e7933679f54b6478945dc72e59ef9a3d5e0

> [ 48.233582] RSP<ffff880025685c98>
> [ 48.237131] CR2: 0000000000000008
>
> [ 48.240521] ---[ end trace 535ebe28cd06b143 ]---
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
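
For anyone triaging similar resume logs, the oops headers quoted in this message can be summarised mechanically. What follows is only a minimal sketch: the function name `summarize_oopses` and both regular expressions are hypothetical and derived solely from the "BUG: unable to handle kernel NULL pointer dereference at ..." and "IP: [<...>] symbol+offset" lines shown above. Other kernel versions may print a different format.

```python
import re

# Both patterns are assumptions based only on the log lines quoted in this
# thread. The optional "> " allows for mail-quoting wrapped mid-line.
FAULT_RE = re.compile(
    r"NULL pointer dereference\s+(?:>\s*)?at\s+(?:>\s*)?([0-9a-f]{16})"
)
IP_RE = re.compile(
    r"IP: \[<([0-9a-f]{16})>\]\s+(?:>\s*)?(\w+)\+(0x[0-9a-f]+)"
)


def summarize_oopses(log):
    """Return (fault_address, symbol, offset) for each oops header in log.

    Fault lines and IP lines are paired in order of appearance, so a log
    that is missing one of the two lines for an oops will be mis-paired.
    """
    faults = [int(m.group(1), 16) for m in FAULT_RE.finditer(log)]
    ips = [(m.group(2), int(m.group(3), 16)) for m in IP_RE.finditer(log)]
    return [(addr, sym, off) for addr, (sym, off) in zip(faults, ips)]


# Two of the headers from the crashes above, with the mail quoting removed.
sample = (
    "[ 60.661811] BUG: unable to handle kernel NULL pointer dereference "
    "at 0000000000000018\n"
    "[ 60.661817] IP: [<ffffffff8105f700>] build_sched_domains+0x770/0\n"
    "[ 47.899013] BUG: unable to handle kernel NULL pointer dereference at "
    "0000000000000008\n"
    "[ 47.906740] IP: [<ffffffff8149196b>] "
    "__cpuidle_register_device+0x2b/0x100\n"
)

for addr, sym, off in summarize_oopses(sample):
    print(f"{sym}+{off:#x}: fault address {addr:#x}")
```

A small fault address such as 0x18 or 0x8 is what one would expect when a structure member is accessed through a NULL pointer, which is consistent with the "NULL pointer dereference" wording in each header.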