[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Remaining MSR wrinkles



On 10.03.2021 20:09, Ian Jackson wrote:
> (I bounced Roger's original mail to xen-devel.  I hope it made it...)
> 
> Roger Pau Monné writes ("Remaining MSR wrinkles"):
>> 1. MSR behavior for PV guests without a #GP handler set: PV Linux versions <
>>    4.14 will use rdmsr_safe (and likely wrmsr_safe?) without having a #GP
>>    handler setup, which results in a crash. This bug was hidden in previous
>>    Xen releases by allowing unlimited read access to the MSR space.

I've not observed problematic wrmsr_safe() so far.

>>    Jan has posted several proposals to address this:
>>
>>    
>> https://lore.kernel.org/xen-devel/7e69db81-cee7-3c7b-be64-4f5ff50fbe9c@xxxxxxxx/
>>    
>> https://lore.kernel.org/xen-devel/d794bbee-a5e5-6632-3d1f-acd8164b7171@xxxxxxxx/
>>
>>    Which all rely on the fact that for PV guests Xen knows whether there's a
>>    #GP handler setup and can hence prevent injection of a #GP fault if no
>>    handler is present.
>>
>>    Andrew opinion is that we should instead try to figure out which MSRs the
>>    buggy Linux versions try to access and special case them. Andrew also 
>> raised
>>    the point that continue running with a 'fake' (ie: 0) MSR value might be
>>    worse than crashing.
>>
>>    Part of the discussion has also happened here:
>>
>>    
>> https://lore.kernel.org/xen-devel/4da62f0b-8a08-dd84-2040-fd55d74fd85a@xxxxxxxxxx/
>>
>>    Look for the last quote.
>>
>>    Another option is to document that PV Linux < 4.14 will require 
>> msr_relaxed=1
>>    in order to run. That as Jan pointed out will also imply PV Linux to run 
>> with
>>    a faked (0) MSR value instead of crashing.
> 
>> For 1. I do agree with Jan than this workaround is likely the best option we
>> have, sort of resorting to request enabling msr_relaxed for all Linux PV 
>> guests
>> < 4.14. Whether we want to limit this workaround to the read side only I'm 
>> not
>> fully convinced. There's something nice about having symmetry in the read and
>> write paths, but if all the calls we have identified are rdmsr only I prefer 
>> to
>> leave the write path unaltered and request users to use msr_relaxed if write
>> issues are found later.

Especially if Andrew's ambiguous objection was against the write side
only, I think I'd prefer to go with this latter variant. Considering
that dealing with the read side alone is sufficient to address the
observed issue, I'm even inclined to prefer this irrespective of that
constraint.

> Thanks.  I find your explanation and arguments convincing.  I have
> read what Andy says in that link and I find that less convincing.  In
> particular "I don't think we should legitimise buggy PV behaviour" is
> not entirely consistent with our previous approach to
> bug-compatibility and support for old guests.
> 
> Accordingly, (with committer tie-breaker hat on) I would prefer to
> apply the patches from Jan.  I don't have an opinion about the read vs
> write question, and will probably be happy with whatever you and Jan
> can agree on.
> 
> I don't think I Release-Acked those patches yet so, for those two,
> 
> Release-Acked-by: Ian Jackson <iwj@xxxxxxxxxxxxxx>

You didn't, indeed, but "those two" is slightly confusing, the two
links Roger did provide are just different versions of the same
patch. Hence I'd like to double check that it is exactly this one
patch of mine (of which I need to send another version, at least
to include Roger's requested documentation of the behavior, and
possibly also the write side equivalent - still waiting for Andrew
to come back and clarify the scope of his objection).

>> 2. RAPL_POWER_LIMIT: handle the MSR explicitly to make Solaris happy.
>>    Alternatively we can list in the release notes that Solaris guests require
>>    msr_relaxed=1. Andrew is working on a patch for this.
> 
> I would prefer to handle the MSR explicitly, for the same
> compatibility reason as above.

The question is here - we aren't sure yet that this is the only
one, are we? Andrew suspects if this one MSR gets accessed this
way, then likely others will be, too. Boris, can you tell for sure
either way?

Also, Boris - any chance you could give your Tested-by for Roger's
patch? It's otherwise ready to go in, but I'd prefer to commit it
knowing that you've tested this hopefully final version. I'm sorry
for the recurring requests to test this workaround.

>> 3. MSR_K8_HWCR: Linux will complain about a missing bit (TscFreqSel). Jan
>>    posted a patch to handle the MSR and unconditionally set the bit:
>>
>>    
>> https://lore.kernel.org/xen-devel/c91b190a-aaa1-d3b8-10eb-d8da7ad1f834@xxxxxxxx/
>>
>>    There are concerns from Andrew about missing bits in ACPI tables and 
>> Pstate
>>    MSRs if this bit is reported as set.
> 
>> For 3. I think I at least need more details about the relation with 
>> TscFreqSel
>> and ACPI or other MSRs, and I haven't been able to find it on the PPRs I
>> looked.
> 
> I don't understsnd the implications here.  Jan says that Linux has
> been warning about this, but is that the only symptom of the current
> state of affairs ?

Yes. It claiming a firmware bug when running under Xen, but not when
running natively is likely to cause support calls.

> Jan writes
> 
>  The main risk with making the read not fault here is that guests
>  might imply [sic, "infer" intended] they can also write this MSR then.
> 
> How would that matter, in practice ?  The guest would try to write
> the KSR and then ... ?

... would crash. I should say though that with increasing awareness of
potentially running virtualized, kernels have become more careful in
this regard. (I was about to say "with implications like this one, but
since you say "imply" was wrong, "implications" would likely be as well.
What would the correct noun here be then? My dictionary has "inference",
but this feels a little odd, too much like misspelled "interference".)

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.