Re: [PATCH v3 4/4] x86/cpu-policy: Derive RSBA/RRSBA for guest policies
On 16/06/2023 1:12 pm, Jan Beulich wrote:
> On 15.06.2023 20:17, Andrew Cooper wrote:
>> On 15/06/2023 1:13 pm, Jan Beulich wrote:
>>> On 15.06.2023 12:41, Andrew Cooper wrote:
>>>> On 15/06/2023 9:30 am, Jan Beulich wrote:
>>>>> On 14.06.2023 20:12, Andrew Cooper wrote:
>>>>>> On 13/06/2023 10:59 am, Jan Beulich wrote:
>>>>>>> On 12.06.2023 18:13, Andrew Cooper wrote:
>>>>>>>> The RSBA bit, "RSB Alternative", means that the RSB may use alternative
>>>>>>>> predictors when empty.  From a practical point of view, this means
>>>>>>>> "Retpoline not safe".
>>>>>>>>
>>>>>>>> Enhanced IBRS (officially IBRS_ALL in Intel's docs, previously
>>>>>>>> IBRS_ATT) is a statement that IBRS is implemented in hardware (as
>>>>>>>> opposed to the form retrofitted to existing CPUs in microcode).
>>>>>>>>
>>>>>>>> The RRSBA bit, "Restricted-RSBA", is a combination of RSBA, and the
>>>>>>>> eIBRS property that predictions are tagged with the mode in which
>>>>>>>> they were learnt.  Therefore, it means "when eIBRS is active, the RSB
>>>>>>>> may fall back to alternative predictors but restricted to the current
>>>>>>>> prediction mode".  As such, it's a stronger statement than RSBA, but
>>>>>>>> still means "Retpoline not safe".
>>>>>>>>
>>>>>>>> CPUs are not expected to enumerate both RSBA and RRSBA.
>>>>>>>>
>>>>>>>> Add feature dependencies for EIBRS and RRSBA.  While technically
>>>>>>>> they're not linked, absolutely nothing good can come of letting the
>>>>>>>> guest see RRSBA without EIBRS.  Nor a guest seeing EIBRS without
>>>>>>>> IBRSB.  Furthermore, we use this dependency to simplify the max
>>>>>>>> derivation logic.
>>>>>>>>
>>>>>>>> The max policies get RSBA and RRSBA unconditionally set (with the
>>>>>>>> EIBRS dependency maybe hiding RRSBA).  We can run any VM, even if it
>>>>>>>> has been told "somewhere you might run, Retpoline isn't safe".
>>>>>>>>
>>>>>>>> The default policies are more complicated.  A guest shouldn't see
>>>>>>>> both bits, but it needs to see one if the current host suffers from
>>>>>>>> any form of RSBA, and which bit it needs to see depends on whether
>>>>>>>> eIBRS is visible or not.  Therefore, the calculation must be
>>>>>>>> performed after sanitise_featureset().
>>>>>>>>
>>>>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>>>>>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>>>>> CC: Wei Liu <wl@xxxxxxx>
>>>>>>>>
>>>>>>>> v3:
>>>>>>>>  * Minor commit message adjustment.
>>>>>>>>  * Drop changes to recalculate_cpuid_policy().  Deferred to a later
>>>>>>>>    series.
>>>>>>> With this dropped, with the title not saying "max/default", and with
>>>>>>> the description also not mentioning "live" policies at all, I don't
>>>>>>> think this patch is self-consistent (meaning in particular: leaving
>>>>>>> aside the fact that there's no way right now to request e.g. both
>>>>>>> RSBA and RRSBA for a guest; aiui it is possible for Dom0).
>>>>>>>
>>>>>>> As you may imagine I'm also curious why you decided to drop this.
>>>>>> Because when I tried doing levelling in Xapi, I remembered why I did it
>>>>>> the way I did in v1, and why the v2 way was wrong.
>>>>>>
>>>>>> Xen cannot safely edit what the toolstack provides, so must not.
>>>>> And this is the part I don't understand: Why can't we correct the
>>>>> (EIBRS,RSBA,RRSBA) tuple to a combination that is "legal"? At least
>>>>> as long as ...
>>>>>
>>>>>> Instead, failing the set_policy() call is an option, and is what we want
>>>>>> to do longterm,
>>>>> ... we aren't there.
>>>>>
>>>>>> but also happens to be wrong too in this case.  An admin
>>>>>> may know that a VM isn't using retpoline, and may need to migrate it
>>>>>> anyway for a number of reasons, so any safety checks need to be in the
>>>>>> toolstack, and need to be overrideable with something like --force.
>>>>> Possibly leading to an inconsistent policy exposed to a guest? I
>>>>> guess this may be the only option when we can't really resolve an
>>>>> ambiguity, but that isn't the case here, is it?
>>>> Wrong.  Xen does not have any knowledge of other hosts the VM might
>>>> migrate to.
>>>>
>>>> So while Xen can spot problem combinations *on this host*, which way to
>>>> correct the problem combination depends on where the VM might migrate to.
>>> I actually view this as two different levels: With a flawed policy, the
>>> guest is liable to not work correctly at all. No point thinking about
>>> it being able to migrate. With a fixed up policy it may fail to migrate,
>>> but it'll at least work otherwise.
>> If you get RSBA and/or RRSBA wrong, nothing is going to malfunction in
>> the guest, even if you migrate it.
>>
>> The consequence of getting RSBA and/or RRSBA wrong is the guest *might*
>> think retpoline is safe to use, and *might* end up vulnerable to
>> speculative attacks on this or other hardware.
> Isn't that some sort of "malfunction"?

Perhaps, there's a difference between "it will likely crash hard" and
"you won't notice the difference".

>> And the admin might know that they overrode the default settings and
>> forced the use of some other protection mechanism, so the guest is in
>> fact safe despite having wrong RSBA/RRSBA settings.
> But then the guest would also be safe with adjusted settings, wouldn't it?

It doesn't mean the guest is going to tolerate having features change
underfoot.

>> I don't know how to put it any more plainly.  Xen *does not* have the
>> information necessary to make a safety judgement in this matter.  Only
>> the toolstack (as a proxy for the admin) has the necessary information.
> I'm not looking at it as Xen making things safe by adjusting bogus
> settings. I'm merely looking at it as not letting a guest run that way.
> For the safety aspect I agree it needs a wider view than Xen has.
>
> Anyway, I don't think either of us is going to convince the other of
> there being only one way of looking at things vs there being at least
> two possible ways, so in order to allow things to progress
> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

Thank you.

To be clear, we are planning to put checks in place.  We definitely
don't want the admin to end up in this corner case accidentally.

~Andrew
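
For readers following the thread, the max/default derivation described in
the quoted commit message can be sketched roughly as below.  This is a toy
model and not the code from the patch: a policy is modelled as a plain set
of feature names, and apply_deps(), derive_max() and derive_default() are
hypothetical helpers, with apply_deps() standing in for the dependency
handling that sanitise_featureset() performs in Xen.

```python
# Toy model, not Xen code: a policy is a set of feature-name strings.
# Dependencies as stated in the commit message (feature -> prerequisite).
DEPS = {"EIBRS": "IBRSB", "RRSBA": "EIBRS"}


def apply_deps(features):
    """Drop any feature whose prerequisite is missing, until stable.

    Hypothetical stand-in for the dependency pass of sanitise_featureset().
    """
    out = set(features)
    changed = True
    while changed:
        changed = False
        for feat, needs in DEPS.items():
            if feat in out and needs not in out:
                out.discard(feat)
                changed = True
    return out


def derive_max(host):
    """Max policy: set RSBA and RRSBA unconditionally.

    The EIBRS dependency may then hide RRSBA again.
    """
    return apply_deps(set(host) | {"RSBA", "RRSBA"})


def derive_default(host):
    """Default policy: expose at most one of RSBA/RRSBA.

    The choice is made after dependency handling, because it depends on
    whether eIBRS ends up visible to the guest.
    """
    feats = apply_deps(set(host) - {"RSBA", "RRSBA"})
    if {"RSBA", "RRSBA"} & set(host):  # host suffers from some form of RSBA
        feats.add("RRSBA" if "EIBRS" in feats else "RSBA")
    return feats
```

Note that the sketch only covers the max and default policies.  It makes no
attempt to correct a policy supplied by the toolstack, which is the point
under discussion above.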