[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
>>> On 24.01.19 at 21:33, <andrew.cooper3@xxxxxxxxxx> wrote: > On 24/01/2019 12:07, Norbert Manthey wrote: >> On 1/23/19 14:20, Jan Beulich wrote: >>>>>> On 23.01.19 at 12:51, <nmanthey@xxxxxxxxx> wrote: >>>> --- a/xen/include/xen/nospec.h >>>> +++ b/xen/include/xen/nospec.h >>>> @@ -58,6 +58,21 @@ static inline unsigned long > array_index_mask_nospec(unsigned long index, >>>> (typeof(_i)) (_i & _mask); \ >>>> }) >>>> >>>> +/* >>>> + * allow to insert a read memory barrier into conditionals >>>> + */ >>>> +#ifdef CONFIG_X86 >>>> +static inline bool lfence_true(void) { rmb(); return true; } >>>> +#else >>>> +static inline bool lfence_true(void) { return true; } >>>> +#endif >>>> + >>>> +/* >>>> + * protect evaluation of conditional with respect to speculation >>>> + */ >>>> +#define evaluate_nospec(condition) \ >>>> + (((condition) && lfence_true()) || !lfence_true()) >>> It may be just me, but I think >>> >>> #define evaluate_nospec(condition) \ >>> ((condition) ? lfence_true() : !lfence_true()) >>> >>> would better express the two-way nature of this. >> I compared the binary output of the two variants, and they are the same >> (for my build environment). I'll switch to your variant, in case nobody >> objects. > > Is it safe though? The original variant is required by C to only > evaluate one of the lfence_true() blocks, whereas the second variation > could execute both of them and cmov the 1 and 0 together, which is wasteful. For one I don't understand the connection between the initial question and the explanation boiling down to a performance concern. But I think I see what safety concern you may have. And then I'm having difficulty following your interpretation of what evaluation requirements C imposes: &&, ||, and ?: are all sequence points. I'm implying from this that, as long as the evaluation of the expressions has no side effects, it can happen irrespective of source arrangements, and in particular the compiler could translate both into exactly the same code. Neither variant excludes the asm() getting moved by the compiler anyway, despite the volatile - this is what the gcc doc has to say on the topic: "Note that the compiler can move even volatile asm instructions relative to other code, including across jump instructions." It then goes on to explain how this can be improved in source; I wonder whether we may want to follow that advice and add a dependency on the calculated branch condition. But of course this may further impact performance. What's worse, "Under certain circumstances, GCC may duplicate (or remove duplicates of) your assembly code when optimizing" suggests to me that neither of the two variants are really safe from getting converted to code actually matching the behavior of L1TF_LFENCE_INTERMEDIATE. Do we perhaps need to further complicate things and use (using naming derived from the current version) static inline bool lfence_bool(bool cond) { asm volatile ( "lfence" : "+X" (cond) ); return cond; } #define evaluate_nospec(condition) ({ \ bool cond_ = (condition); \ ((cond_) ? lfence_bool(cond_) : !lfence_bool(cond_)) Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |