
Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution



On 05/01/2018 09:39, Juergen Gross wrote:
> On 05/01/18 10:26, Andrew Cooper wrote:
>> On 05/01/2018 07:48, Juergen Gross wrote:
>>> On 04/01/18 21:21, Andrew Cooper wrote:
>>>> This work was developed as an SP3 mitigation, but shelved when it became
>>>> clear that it wasn't viable to get done in the timeframe.
>>>>
>>>> To protect against SP3 attacks, most mappings need to be flushed while in
>>>> user context.  However, to protect against all cross-VM attacks, it is
>>>> necessary to ensure that the Xen stacks are not mapped in any other cpu's
>>>> address space, or an attacker can still recover at least the GPR state of
>>>> separate VMs.
>>> The above statement is too strict: it would be sufficient if no stacks
>>> of other domains are mapped.
>> Sadly not.  Having stacks shared within a domain means one vcpu can still
>> steal at least the GPR state of other vcpus belonging to the same domain.
>>
>> Whether or not a specific kernel cares, some definitely will.
>>
>>> I'm currently working on a proof of concept using dedicated per-vcpu
>>> stacks for 64 bit pv domains. Those stacks would be mapped in the
>>> per-domain region of the address space. I hope to have an RFC version of
>>> the patches ready next week.
>>>
>>> This would allow removing the per-physical-cpu mappings from the guest
>>> visible address space when doing page table isolation.
>>>
>>> In order to avoid SP3 attacks against other vcpus' stacks of the same
>>> guest, we could extend the pv ABI to mark a guest's user L4 page table as
>>> "single use", i.e. not allowed to be active on multiple vcpus at the
>>> same time (introducing that ABI modification in the Linux kernel would
>>> be simple, as the Linux kernel currently lacks protection against
>>> cross-cpu stack exploits, and when that protection is added via per-cpu
>>> L4 user page tables we could just chime in). An L4 page table marked as
>>> "single use" would map only the local vcpu's stacks.
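
For concreteness, the hypervisor side of such a "single use" marking
might look roughly like the sketch below.  Everything in it is
hypothetical rather than existing ABI; it is just the check which would
have to run whenever a vcpu installs a user L4 (i.e. at the equivalent
of MMUEXT_NEW_USER_BASEPTR):

    #include <errno.h>
    #include <stdbool.h>

    /* Hypothetical per-L4-page tracking state. */
    struct l4_info {
        bool single_use;      /* guest marked this L4 "single use"      */
        int  active_on_vcpu;  /* vcpu currently using it, or -1 if none */
    };

    /*
     * Hypothetical check when a vcpu tries to make this L4 its active
     * user page table: a "single use" L4 may only be live on one vcpu
     * at a time.
     */
    static int activate_user_l4(struct l4_info *l4, int vcpu_id)
    {
        if ( l4->single_use &&
             l4->active_on_vcpu != -1 &&
             l4->active_on_vcpu != vcpu_id )
            return -EBUSY;

        l4->active_on_vcpu = vcpu_id;
        return 0;
    }
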
>> For PV guests, it is the Xen stacks which matter, not the guest
>> kernels' per-vcpu stacks.
> Indeed. That's the reason I want to have per-vcpu Xen stacks.
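
To spell out why the Xen stacks are the interesting target: on every
entry from a PV guest, the entry path spills the guest's GPRs into a
cpu_user_regs frame at the top of the Xen stack, so any context which
still has that stack mapped can, via SP3, read another vcpu's register
state.  A heavily simplified sketch of the idea, not the real structure
layout or entry code:

    /* Heavily simplified; the real struct cpu_user_regs has more fields. */
    struct cpu_user_regs {
        unsigned long r15, r14, r13, r12, rbp, rbx, r11, r10;
        unsigned long r9, r8, rax, rcx, rdx, rsi, rdi;
        unsigned long rip, cs, rflags, rsp, ss;
    };

    /*
     * Conceptual equivalent of the entry path's register save: the
     * guest's GPRs land in a frame at the top of the Xen stack.  If
     * that stack remains mapped in another address space, SP3 can be
     * used to read the frame's contents from there.
     */
    static void spill_guest_gprs(struct cpu_user_regs *frame /* stack top */,
                                 const struct cpu_user_regs *guest)
    {
        *frame = *guest;  /* in reality a sequence of push/mov instructions */
    }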

We will have to be extra careful going along those lines (and to
forewarn you, I don't have a good gut feeling about it).

For one, livepatching safety currently depends on the per-pcpu stacks.
Also, you will have to entirely rework the IST stack handling, as those
stacks will have to become per-vcpu as well, which means modifying the
TSS and rewriting the syscall stubs on every context switch.
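
To give an idea of the shape of that work, here is a rough sketch.  The
TSS layout below is just the architectural one, and everything else is
made up for illustration rather than being Xen's actual context-switch
code.  Every IST slot and RSP0 would have to follow the incoming vcpu,
and the SYSCALL stubs (SYSCALL doesn't switch stacks, so the stub has to
supply one itself) would need repatching to match:

    #include <stdint.h>

    /* Architectural 64-bit TSS layout (simplified field names). */
    struct tss64 {
        uint32_t rsvd0;
        uint64_t rsp0, rsp1, rsp2;  /* stacks for privilege transitions */
        uint64_t rsvd1;
        uint64_t ist[7];            /* IST1..IST7 stack pointers        */
        uint64_t rsvd2;
        uint16_t rsvd3, iopb;
    } __attribute__((packed));

    /* Hypothetical per-vcpu stack bookkeeping. */
    struct vcpu_stacks {
        uint64_t primary_top;       /* top of the vcpu's main Xen stack */
        uint64_t ist_top[7];        /* tops of the vcpu's IST stacks    */
    };

    /*
     * Hypothetical extra work on context switch once stacks are
     * per-vcpu: repoint RSP0 and every IST slot at the incoming
     * vcpu's stacks.
     */
    static void switch_stacks(struct tss64 *tss,
                              const struct vcpu_stacks *next)
    {
        unsigned int i;

        tss->rsp0 = next->primary_top;
        for ( i = 0; i < 7; i++ )
            tss->ist[i] = next->ist_top[i];

        /*
         * ...plus rewrite the per-cpu SYSCALL entry stubs so that they
         * switch onto next->primary_top, and reload anything which
         * caches these pointers.
         */
    }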

At the moment, Xen's per-pcpu stacks have shielded us from some of the
SP2/RSB issues, because of the reset_stack_and_jump() used during
scheduling.  The waitqueue infrastructure is the one place where this is
violated at the moment, and in practice it is only used during
introspection.  However, for other reasons, I'm looking to delete that
code and pretend it never existed.
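
For reference, reset_stack_and_jump() conceptually abandons the current
call stack, points rsp back at the top of the per-pcpu stack, and jumps
to a function which never returns, so no stale frames survive a
reschedule.  A simplified illustration of the idea (not the actual
macro, and the names in the usage line are placeholders):

    /*
     * Simplified illustration only.  'stack_top' is the top of this
     * CPU's stack and 'fn' must never return; every frame that was on
     * the old stack is discarded.
     */
    #define reset_stack_and_jump_sketch(stack_top, fn)              \
        asm volatile ( "mov %0, %%rsp; jmp *%1"                     \
                       :: "r" (stack_top), "r" (fn) : "memory" )

    /* e.g. at the end of scheduling:
     *     reset_stack_and_jump_sketch(this_cpu_stack_top, run_next_vcpu);
     */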

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

