[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: Yet another S3 issue in Xen 4.14
On Thu, Oct 01, 2020 at 01:43:52PM +0100, Andrew Cooper wrote: > On 01/10/2020 13:31, Marek Marczykowski-Górecki wrote: > > On Thu, Oct 01, 2020 at 01:59:32PM +0200, Jan Beulich wrote: > >> On 01.10.2020 03:12, Marek Marczykowski-Górecki wrote: > >>> After patching the previous issue ("x86/S3: Fix Shadow Stack resume > >>> path") I still encounter issues resume from S3. > >>> Since I had it working on Xen 4.13 on this particular hardware (Thinkpad > >>> P52), I bisected it and got this: > >>> > >>> commit 4304ff420e51b973ec9eb9dafd64a917dd9c0fb1 > >>> Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >>> Date: Wed Dec 11 20:59:19 2019 +0000 > >>> > >>> x86/S3: Drop {save,restore}_rest_processor_state() completely > >>> > >>> There is no need to save/restore FS/GS/XCR0 state. It will be handled > >>> suitably on the context switch away from the idle. > >>> > >>> The CR4 restoration in restore_rest_processor_state() was actually > >>> fighting > >>> later code in enter_state() which tried to keep CR4.MCE clear until > >>> everything > >>> was set up. Delete the intermediate restoration, and defer final > >>> restoration > >>> until after MCE is reconfigured. > >>> > >>> Restoring PAT can be done earlier, and ideally before paging is > >>> enabled. By > >>> moving it into the trampoline during the setup for 64bit, the call > >>> can be > >>> dropped from cpu_init(). The EFI path boot path doesn't disable > >>> paging, so > >>> make the adjustment when switching onto Xen's pagetables. > >>> > >>> The only remaing piece of restoration is load_system_tables(), so > >>> suspend.c > >>> can be deleted in its entirety. > >>> > >>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > >>> > >>> Parent of this commit suspends and resumes just fine. With this commit > >>> applied, it (I think) it panics, at least I get reboot after 5s. Sadly, I > >>> don't have serial console there. > >>> > >>> I tried also master and stable-4.14 with this commit reverted (and also > >>> the other fix applied), but it doesn't work. In this case I get a hang on > >>> resume (power led still flashing, but fan woke up). There are probably > >>> some other dependencies. > >> Since bisection may also point you at some intermediate breakage, which > >> these last results of yours seem to support, could you check whether > >> 55f8c389d434 put immediately on top of the above commit makes a difference, > >> and if so resume bisecting from there? > > Nope, 4304ff420e51b973ec9eb9dafd64a917dd9c0fb1 with 55f8c389d434 on top > > it still hangs on resume. > > Ok. I'll see about breaking the change apart so we can bisect which > specific bit of code movement broke things. I've done another bisect on the commit broken up in separate changes (https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/dbg-s3) and the bad part seems to be this: From dbdb32f8c265295d6af7cd4cd0aa12b6d04a0430 Mon Sep 17 00:00:00 2001 From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Date: Fri, 2 Oct 2020 15:40:22 +0100 Subject: [PATCH 1/1] CR4 --- xen/arch/x86/acpi/power.c | 9 ++++----- xen/arch/x86/acpi/suspend.c | 3 --- 2 files changed, 4 insertions(+), 8 deletions(-) diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index 6dfd4c7891..0cda362045 100644 --- a/xen/arch/x86/acpi/power.c +++ b/xen/arch/x86/acpi/power.c @@ -195,7 +195,6 @@ static int enter_state(u32 state) unsigned long flags; int error; struct cpu_info *ci; - unsigned long cr4; if ( (state <= ACPI_STATE_S0) || (state > ACPI_S_STATES_MAX) ) return -EINVAL; @@ -270,15 +269,15 @@ static int enter_state(u32 state) system_state = SYS_STATE_resume; - /* Restore CR4 and EFER from cached values. */ - cr4 = read_cr4(); - write_cr4(cr4 & ~X86_CR4_MCE); + /* Restore EFER from cached value. */ write_efer(read_efer()); device_power_up(SAVED_ALL); mcheck_init(&boot_cpu_data, false); - write_cr4(cr4); + + /* Restore CR4 from cached value, now MCE is set up. */ + write_cr4(read_cr4()); printk(XENLOG_INFO "Finishing wakeup from ACPI S%d state.\n", state); diff --git a/xen/arch/x86/acpi/suspend.c b/xen/arch/x86/acpi/suspend.c index 060a9313b6..ca987d9019 100644 --- a/xen/arch/x86/acpi/suspend.c +++ b/xen/arch/x86/acpi/suspend.c @@ -23,7 +23,4 @@ void save_rest_processor_state(void) void restore_rest_processor_state(void) { load_system_tables(); - - /* Restore full CR4 (inc MCE) now that the IDT is in place. */ - write_cr4(mmu_cr4_features); } -- 2.20.1 -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? Attachment:
signature.asc
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |