[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] x86/xpti: really hide almost all of Xen image



>>> On 02.03.18 at 18:23, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 02/03/18 17:04, Jan Beulich wrote:
>>>>> On 02.03.18 at 17:53, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 02/03/18 14:34, Jan Beulich wrote:
>>>> Note that the removed BUILD_BUG_ON()s don't get replaced by anything -
>>>> there already is a suitable ASSERT() in xen.lds.S.
>>> This isn't quite true.  You've changed the mechanism by which the stubs
>>> get mapped (from entirely common, to per-pcpu), removing the need for
>>> the BUILD_BUG_ON().
>>>
>>> The ASSERT() in xen.lds.S serves a different purpose, checking that the
>>> sum total of stubs don't overlap with the compiled code.  (On this
>>> note... do we perform the same check for livepatches?  I can't spot
>>> anything.)
>> What you say may be true for the one that was in
>> setup_cpu_root_pgt(), but surely not the one I'm removing from
>> alloc_stub_page(). But I can drop this if you prefer.
> 
> I think it might avoid some confusion.

Okay. Do you have any other comments on _this_ patch then?
The TSS related discussion below would result in another one
anyway.

>>>> What should we do with the TSS? Currently together with it we expose
>>>> almost a full page of other per-CPU data. A simple (but slightly
>>>> hackish) option would be to use one of the two unused stack slots.
>>> In 64bit, the TSS can be mapped read-only, because hardware never has
>>> cause to write to it.
>>>
>>> I believe that Linux now uses a read-only TSS mapping to double as a
>>> guard page for the trampoline stack, which is a less hacky way of
>>> thinking about it.
>>>
>>> However, doing that in Xen would mean shattering the directmap
>>> superpages in all cases, and we'd inherit the SVM triple fault case into
>>> release builds.  A different alternative (and perhaps simpler to
>>> backport) might be to have .bss.percpu.page_aligned and use that to hide
>>> the surrounding data.
>> Well, yes, that's obviously an option, but pretty wasteful. I'd then
>> be tempted to at least do some sharing of the page similar to how
>> the stubs of several CPUs share a single page.
> 
> For backport to older releases?
> 
> I think the extra almost 4k per pcpu isn't going to concern people (its
> the least of their problems right now), and there is a very tangible
> benefit of not leaking the other surrounding data.

Well, yes and no. I can see why people wouldn't be overly
concerned, but otoh I pretty much dislike today's general attitude
of such sort of wasteful use of resources. This isn't limited to
the software world, plus, I'm sorry, but I've grown up with less-
than-1Mb environments.

>>> Thinking about it, we've got the same problem with the TSS as the BSP
>>> IDT, if the link order happens to cause init_tss to cross a page boundary.
>> I don't think so, no - the structure is 128 bytes in size and 128
>> byte aligned. When I created the original XPTI light patch I did
>> specifically check.
> 
> This only happens by chance, because sizeof(struct tss_struct) ==
> SMP_CACHE_BYTES

Sort of true; the comment next to __cacheline_filler[] says so otoh.

> If we intend to rely on this behaviour, we want something like this:
> 
> diff --git a/xen/include/asm-x86/processor.h
> b/xen/include/asm-x86/processor.h
> index 9c70a98..fe647dc 100644
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -385,7 +385,7 @@ static always_inline void __mwait(unsigned long eax,
> unsigned long ecx)
>  #define IOBMP_BYTES             8192
>  #define IOBMP_INVALID_OFFSET    0x8000
>  
> -struct __packed __cacheline_aligned tss_struct {
> +struct __packed tss_struct {
>      uint32_t :32;
>      uint64_t rsp0, rsp1, rsp2;
>      uint64_t :64;
> @@ -398,7 +398,7 @@ struct __packed __cacheline_aligned tss_struct {
>      uint16_t :16, bitmap;
>      /* Pads the TSS to be cacheline-aligned (total size is 0x80). */
>      uint8_t __cacheline_filler[24];
> -};
> +} __aligned(sizeof(struct tss_struct));
>  
>  #define IST_NONE 0UL
>  #define IST_DF   1UL
> 
> except that C can't cope with this expression.  I wonder if there is an
> alternate way with typedefs.

Typedefs can't possibly help, as they're fully equivalent with the
types they refer to. Wrapping another structure around the "raw"
one might help.

But perhaps a BUILD_BUG_ON() would do? Or even just replacing 24
by SMP_CACHE_BYTES - 104 (yielding a negative array size should
SMP_CACHE_BYTES ever shrink for whatever reason), plus perhaps
adding __cacheline_aligned.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.