
Re: [Xen-devel] further post-Meltdown-bad-aid performance thoughts



>>> On 22.01.18 at 13:33, <george.dunlap@xxxxxxxxxx> wrote:
> On 01/22/2018 09:25 AM, Jan Beulich wrote:
>>>>> On 19.01.18 at 18:00, <george.dunlap@xxxxxxxxxx> wrote:
>>> On 01/19/2018 04:36 PM, Jan Beulich wrote:
>>>>>>> On 19.01.18 at 16:43, <george.dunlap@xxxxxxxxxx> wrote:
>>>>> So what if instead of trying to close the "windows", we made it so that
>>>>> there was nothing through the windows to see?  If no matter what the
>>>>> hypervisor speculatively executed, nothing sensitive was visible except
>>>>> what a vcpu was already allowed to see,
>>>>
>>>> I think you didn't finish your sentence here, but I also think I
>>>> can guess the missing part. There's a price to pay for such an
>>>> approach though - iterating over domains, or vCPU-s of a
>>>> domain (just as an example) wouldn't be simple list walks
>>>> anymore. There are certainly other things. IOW - yes, an
>>>> approach like this seems possible, but with all the lost
>>>> performance I think we shouldn't go overboard with further
>>>> hiding.
>>>
>>> Right, so the next question: what information *from other guests* are
>>> sensitive?
>>>
>>> Obviously the guest registers are sensitive.  But how much of the
>>> information in vcpu struct that we actually need to have "to hand" is
>>> actually sensitive information that we need to hide from other VMs?
>> 
>> None, I think. But that's not the main aspect here. struct vcpu
>> instances come and go, which would mean we'd have to
>> permanently update what is or is not being exposed in the page
>> tables used. This, while solvable, is going to be a significant
>> burden in terms of synchronizing page tables (if we continue to
>> use per-CPU ones) and/or TLB shootdown. Whereas if only the
>> running vCPU's structure (and its struct domain) are exposed,
>> no such synchronization is needed (things would simply be
>> updated during context switch).
> 
> I'm not sure we're actually communicating.
> 
> Correct me if I'm wrong; at the moment, under XPTI, hypercalls running
> under Xen still have access to all of host memory.  To protect against
> SP3, we remove almost all Xen memory from the address space before
> switching to the guest.
> 
> What I'm proposing is something like this:
> 
> * We have a "global" region of Xen memory that is mapped by all
> processors.  This will contain everything we consider not sensitive,
> including Xen text segments, and most domain and vcpu data.  But it will
> *not* map all of host memory, nor have access to sensitive data, such as
> vcpu register state.
> 
> * We have per-cpu "local" regions.  In this region we will map,
> on-demand, guest memory which is needed to perform current operations.
> (We can consider how strictly we need to unmap memory after using it.)
> We will also map the current vcpu's registers.
> 
> * On entry to a 64-bit PV guest, we don't change the mapping at all.
> 
> Now, no matter what the speculative attack -- SP1, SP2, or SP3 -- a vcpu
> can only access its own RAM and registers.  There's no extra overhead to
> context switching into or out of the hypervisor.

And we would open back up the SP3 variant of guest user mode
attacking its own kernel by going through the Xen mappings. I
can't exclude that variants of SP1 (less likely SP2) allowing indirect
guest-user -> guest-kernel attacks could be found.

> Given that, I don't understand what the following comments mean:
> 
> "There's a price to pay for such an approach though - iterating over
> domains, or vCPU-s of a domain (just as an example) wouldn't be simple
> list walks anymore."
> 
> If we remove sensitive information from the domain and vcpu structs,
> then any bit of hypervisor code can iterate over domain and vcpu structs
> at will; only if they actually need to read or write sensitive data will
> they have to perform an expensive map/unmap operation.  But in general,
> to read another vcpu's registers you already need to do a vcpu_pause() /
> vcpu_unpause(), which involves at least two IPIs (with one
> spin-and-wait), so it doesn't seem like that should add a lot of extra
> overhead.

Reading another vCPU's registers can't be compared with e.g.
wanting to deliver an interrupt to other than the currently running
vCPU.

> "struct vcpu instances come and go, which would mean we'd have to
> permanently update what is or is not being exposed in the page tables
> used. This, while solvable, is going to be a significant burden in terms
> of synchronizing page tables (if we continue to use per-CPU ones) and/or
> TLB shootdown."
> 
> I don't understand what this is referring to in my proposed plan above.

I had specifically said these were just examples (ones coming to
mind immediately). Of course splitting such structures in two parts
is an option, but I'm not sure it's a reasonable one (which perhaps
depends on the details of how you would envision the implementation).
If the split-off piece(s) were referred to by pointers out of the
main structure, there would be a meaningful risk of some perhaps
rarely executed piece of code de-referencing them in the wrong
context. OTOH entirely independent structures (without pointers in
either direction) would need careful management of their lifetimes,
so one doesn't go away without the other.

You mention the possibility of on-demand mapping - if data
structures aren't used frequently, that's certainly an option.
In the end there's a lot of uncertainty here as to whether the
in-theory nice outline could actually live up to the requirements
of an actual implementation. Yet the (presumably) fundamental
re-structuring of data which would be required here calls for at
least some of this uncertainty to be addressed before actually
making an attempt to switch over to such a model.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
