[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation



>>> On 24.01.19 at 21:33, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 24/01/2019 12:07, Norbert Manthey wrote:
>> On 1/23/19 14:20, Jan Beulich wrote:
>>>>>> On 23.01.19 at 12:51, <nmanthey@xxxxxxxxx> wrote:
>>>> --- a/xen/include/xen/nospec.h
>>>> +++ b/xen/include/xen/nospec.h
>>>> @@ -58,6 +58,21 @@ static inline unsigned long 
> array_index_mask_nospec(unsigned long index,
>>>>      (typeof(_i)) (_i & _mask);                                          \
>>>>  })
>>>>  
>>>> +/*
>>>> + * allow to insert a read memory barrier into conditionals
>>>> + */
>>>> +#ifdef CONFIG_X86
>>>> +static inline bool lfence_true(void) { rmb(); return true; }
>>>> +#else
>>>> +static inline bool lfence_true(void) { return true; }
>>>> +#endif
>>>> +
>>>> +/*
>>>> + * protect evaluation of conditional with respect to speculation
>>>> + */
>>>> +#define evaluate_nospec(condition)                                      \
>>>> +    (((condition) && lfence_true()) || !lfence_true())
>>> It may be just me, but I think
>>>
>>> #define evaluate_nospec(condition)                                      \
>>>     ((condition) ? lfence_true() : !lfence_true())
>>>
>>> would better express the two-way nature of this.
>> I compared the binary output of the two variants, and they are the same
>> (for my build environment). I'll switch to your variant, in case nobody
>> objects.
> 
> Is it safe though?  The original variant is required by C to only
> evaluate one of the lfence_true() blocks, whereas the second variation
> could execute both of them and cmov the 1 and 0 together, which is wasteful.

For one I don't understand the connection between the initial
question and the explanation boiling down to a performance
concern. But I think I see what safety concern you may have.

And then I'm having difficulty following your interpretation of
what evaluation requirements C imposes: &&, ||, and ?: are all
sequence points. I'm implying from this that, as long as the
evaluation of the expressions has no side effects, it can
happen irrespective of source arrangements, and in particular
the compiler could translate both into exactly the same code.

Neither variant excludes the asm() getting moved by the
compiler anyway, despite the volatile - this is what the gcc doc
has to say on the topic: "Note that the compiler can move even
volatile asm instructions relative to other code, including across
jump instructions." It then goes on to explain how this can be
improved in source; I wonder whether we may want to follow
that advice and add a dependency on the calculated branch
condition. But of course this may further impact performance.

What's worse, "Under certain circumstances, GCC may duplicate
(or remove duplicates of) your assembly code when optimizing"
suggests to me that neither of the two variants are really safe
from getting converted to code actually matching the behavior
of L1TF_LFENCE_INTERMEDIATE. Do we perhaps need to
further complicate things and use (using naming derived from
the current version)

static inline bool lfence_bool(bool cond) {
    asm volatile ( "lfence" : "+X" (cond) );
    return cond;
}

#define evaluate_nospec(condition) ({ \
    bool cond_ = (condition); \
    ((cond_) ? lfence_bool(cond_) : !lfence_bool(cond_))

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.