[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86: store cr4 during suspend/resume



On 11.12.2019 20:36, Andrew Cooper wrote:
> On 10/12/2019 10:07, Jan Beulich wrote:
>> On 10.12.2019 10:59, Andrew Cooper wrote:
>>> On 09/12/2019 18:06, Roger Pau Monne wrote:
>>>> Currently cr4 is not cached before suspension, and mmu_cr4_features is
>>>> used in order to restore the expected cr4 value. This is correct so
>>>> far because the tasklet that executes the suspend/resume code is
>>>> running in the idle vCPU context.
>>>>
>>>> In order to make the code less fragile, explicitly save the current
>>>> cr4 value before suspension, so that it can be restored afterwards.
>>>> This ensures that the cr4 value cached in the cpu_info doesn't get out
>>>> of sync after resume from suspension.
>>>>
>>>> Suggested-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> Why?  There is nothing fragile here.  Suspend/resume is always in idle
>>> context and loads of other logic already depends on this.
>>>
>>> I've been slowly stripping out redundant saved state like this.
>> Where it's clearly redundant, this is fine. But I don't think it's
>> sufficiently clear here
> 
> There is a reason I made it explicitly crystal clear with c/s 87e7b4d5b

Well, this makes clear we're in idle context, yes. But there's
still a disconnect between this and the use of mmu_cr4_features
(up to and including the somewhat misleading comment saying
"mmu_cr4_features contains latest cr4 setting" without it really
being clear what "latest" means, now that we run with varying
CR4 values. Yes, write_ptbase() does use the variable when
switching to idle, but the variable name lack any connection to
this fact.

>> , and going back to what was there before
>> is imo generally less error prone than going to some fixed state.
> 
> It is demonstrably more error prone, which is why I'm slowly killing it.
> 
> Stashing state wastes unnecessary space, and adds an error case where
> state is either stashed incorrectly, or gets modified before restore,
> and we'll blindly use.

The waste of space is entirely secondary here, I think. A value
getting modified before restore is no different from a value
going out of sync with the variable we reload from. It's a blind
use in either case.

> Two examples of real bugs caused by this are c/s 0c30171cb and 4ee0ad72d

I see your point for the former, but the latter seems to be unrelated.

> Absolutely nothing remaining in suspend.c should be spilled.  It can all
> be (re)caluclated from the same information source as the AP boot path,
> and the result will be strictly smaller in RAM, and more robust.

Robustness to me would imply using the same code for doing the
calculations, not re-calculating from the same information source.
This could be as simple as an idle_cr4() wrapper around the read
of mmu_cr4_features for the case at hand (suitably used wherever
applicable).

Anyway - together with your subsequent mail I accept your objections.
Once the code changes proposed there have gone in, I think it'll
become more clear to readers that indeed state saving/restoring is to
be the exception, not the rule (current code doesn't give this
impression, I think).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.