
Re: [Xen-devel] [PATCH v2 2/2] x86/xpti: Don't leak TSS-adjacent percpu data via Meltdown


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 29 Jul 2019 16:55:02 +0100
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 29 Jul 2019 15:55:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29/07/2019 14:51, Jan Beulich wrote:
> On 26.07.2019 22:32, Andrew Cooper wrote:
>> The XPTI work restricted the visibility of most of memory, but missed a few
>> aspects when it came to the TSS.
> None of these were "missed" afair - we'd been aware, and accepted things
> to be the way they are now for the first step. Remember that at the time
> XPTI was called "XPTI light", in anticipation for this to just be a
> temporary solution.

Did the term "XPTI light" survive past the first RFC posting?

Sure - we did things in an incremental way because it was a technically
complex change and Meltdown was out in the wild at the time.

However, I would have fixed this at the same time as .entry.text if I
had noticed, because the purpose of that series was identical to this
series - avoid leaking things we don't absolutely need to leak.

>> Given that the TSS is just an object in percpu data, the 4k mapping for it
>> created in setup_cpu_root_pgt() maps adjacent percpu data, making it all
>> leakable via Meltdown, even when XPTI is in use.
>>
>> Furthermore, no care is taken to check that the TSS doesn't cross a page
>> boundary.  As it turns out, struct tss_struct is aligned on its size which
>> does prevent it straddling a page boundary, but this will cease to be true
>> once CET and Shadow Stack support is added to Xen.
> Please can you point me at the CET aspect in documentation here? Aiui
> it's only task switches which are affected, and hence only 32-bit TSSes
> which would grow (and even then not enough to exceed 128 bytes). For
> the purposes 64-bit has there are MSRs to load SSP from.

Ah - it was v1 of the CET spec.  I see v3 no longer has the shadow stack
pointer in the TSS.

I'll drop this part of the message.
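The size-alignment argument from the quoted commit message can be checked with a small sketch. It is only an illustration: the 4k PAGE_SIZE and the helper names straddles_page() and any_placement_straddles() are assumptions made here, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumption: 4k pages, as in the thread */

/* True if [addr, addr + size) touches more than one page; size must be > 0. */
static bool straddles_page(uintptr_t addr, size_t size)
{
    return (addr / PAGE_SIZE) != ((addr + size - 1) / PAGE_SIZE);
}

/*
 * Hypothetical helper: try every address in the first two pages that
 * satisfies the given alignment, and report whether an object of the
 * given size would cross a page boundary at any of them.
 */
static bool any_placement_straddles(size_t size, size_t align)
{
    for (uintptr_t addr = 0; addr < 2 * PAGE_SIZE; addr += align)
        if (straddles_page(addr, size))
            return true;
    return false;
}
```

With a size of 0x80 and an alignment of 0x80 no placement crosses a page boundary, whereas the same size with a smaller alignment (64, say) can.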

>> --- a/xen/include/asm-x86/processor.h
>> +++ b/xen/include/asm-x86/processor.h
>> @@ -411,7 +411,7 @@ static always_inline void __mwait(unsigned long eax, unsigned long ecx)
>>   #define IOBMP_BYTES             8192
>>   #define IOBMP_INVALID_OFFSET    0x8000
>>   
>> -struct __packed __cacheline_aligned tss_struct {
>> +struct __packed tss_struct {
>>       uint32_t :32;
>>       uint64_t rsp0, rsp1, rsp2;
>>       uint64_t :64;
>> @@ -425,6 +425,7 @@ struct __packed __cacheline_aligned tss_struct {
>>       /* Pads the TSS to be cacheline-aligned (total size is 0x80). */
>>       uint8_t __cacheline_filler[24];
>>   };
>> +DECLARE_PER_CPU(struct tss_struct, init_tss);
> Taking patch 1 this expands to
>
>      __DEFINE_PER_CPU(__section(".bss.percpu.page_aligned") \
>                       __aligned(PAGE_SIZE), struct tss_struct, _init_tss);
>
> and then
>
>      __section(".bss.percpu.page_aligned") __aligned(PAGE_SIZE)
>      __typeof__(struct tss_struct) per_cpu__init_tss;
>
> which is not what you want: You have an object of size
> sizeof(struct tss_struct) which is PAGE_SIZE aligned. Afaict you
> therefore still leak everything that follows in the same page.

What data might this be?

Every object put into this section is suitably aligned, so nothing will
sit in the slack between the TSS and the end of the page.
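As a rough illustration of that point, here is a sketch using plain C11 alignment rather than Xen's percpu machinery. The names tss_like, demo_tss and demo_other are made up for the example, and PAGE_SIZE is assumed to be 4k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096 /* assumption: 4k pages */

/* Stand-in for struct tss_struct: 0x80 bytes, no alignment of its own. */
struct tss_like {
    uint8_t bytes[0x80];
};

/* Two unrelated objects, both page-aligned at the point of definition. */
static _Alignas(PAGE_SIZE) struct tss_like demo_tss;
static _Alignas(PAGE_SIZE) uint8_t demo_other[16];

/* Page frame number of an address. */
static uintptr_t page_of(const void *p)
{
    return (uintptr_t)p / PAGE_SIZE;
}

/* Two distinct page-aligned objects can never start in the same page. */
static bool share_a_page(void)
{
    return page_of(&demo_tss) == page_of(demo_other);
}
```

This only holds while every object in the section carries the page alignment; a single unaligned object placed there could land in the slack.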

> There was a reason for __cacheline_aligned's original placement, albeit I
> agree that it was/is against the intention of having the struct
> define an interface to the hardware (which doesn't have such an
> alignment requirement).

There is a hard requirement to have the first 104 bytes be physically
contiguous, because on a task switch, some CPUs translate the TSS base
and offset directly from there.

I expect that is where __cacheline_aligned, which gives an alignment of 128, comes in.

However, the manual also makes it clear that this is only on a task
switch, which is inapplicable for us.

Finally, were we to put a structure like this on the stack (e.g. like
hvm_task_switch() does with tss32), we specifically wouldn't want any
unnecessary alignment.

> Perhaps the solution is a two-layer approach:
>
> struct __aligned(PAGE_SIZE) xen_tss {
>      struct __packed tss_struct {
>          ...
>      };
> };
>
> where the inner structure describes the hardware interface and the
> containing one our own requirement(s). But personally I also
> wouldn't mind putting the __aligned(PAGE_SIZE) right on struct
> tss_struct, where __cacheline_aligned has been sitting.

The only way that would make things more robust is if xen_tss was a
union with char[4096] to extend its size.

However, I think this is overkill, given the internals of
DEFINE_PER_CPU_PAGE_ALIGNED().
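For reference, the union variant would look roughly like the sketch below. The names tss_like and xen_tss_padded are hypothetical, the inner struct is only a 0x80-byte stand-in for struct tss_struct, and PAGE_SIZE is assumed to be 4k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096 /* assumption: 4k pages */

/* Stand-in for the hardware-defined layout (0x80 bytes in the thread). */
struct tss_like {
    uint8_t bytes[0x80];
};

/* The char array extends the object so that it owns the whole page. */
union xen_tss_padded {
    struct tss_like tss;
    char pad[PAGE_SIZE];
};

/* Compile-time check that the padded type really fills one page. */
static_assert(sizeof(union xen_tss_padded) == PAGE_SIZE,
              "padded TSS must occupy exactly one page");

/* Bytes of the page that are reserved but unused by the TSS itself. */
static size_t slack_bytes(void)
{
    return sizeof(union xen_tss_padded) - sizeof(struct tss_like);
}
```

The padding makes the "nothing else in this page" property part of the type, at the cost of always reserving a full page per instance.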

> Of course either approach goes against the idea of avoiding usage
> mistakes (as pointed out by others in the v1 discussion, iirc):
> There better wouldn't be a need to get the two "page aligned"
> attributes in sync, i.e. the instantiation of the structure
> would better enforce the requested alignment. I've not thought
> through whether there's trickery to actually make this work, but
> I'd hope we could at the very least detect things not being in
> sync at compile time.

There is a reason why I put in a linker assertion against the TSS being
non-aligned.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel