
Re: [Xen-devel] [PATCH v3 7/7] xen/x86: use PCID feature



On 23/03/18 16:58, Jan Beulich wrote:
>>>> On 23.03.18 at 15:11, <jgross@xxxxxxxx> wrote:
>> On 23/03/18 14:46, Jan Beulich wrote:
>>> Valid point. Looking at all present uses of ->arch.cr3, it's probably
>>> indeed better the way you have it. However, I'm now wondering
>>> about something else: make_cr3() leaves PCID as zero for HVM
>>> and idle domains, but runs Xen with PCIDs 2 and 3 for (some) PV
>>> domains. That looks like an undesirable setup though - it would
>>> seem better to run Xen (with full page tables) with PCID 0 at all
>>> times.
>>>
>>> Then we'd have e.g.
>>> PCID 0      Xen (full page tables)
>>> PCID x      PV guest priv
>>> PCID y      PV guest user
>>
>> So this would need another way to switch between guest and xen %cr3.
>> Or would you want to use different %cr3 values with the same PCID
>> without flushing the TLB in between? This seems to be a way to ask for
>> problems...
> 
> Well, a TLB flush is clearly needed in such a setup when going
> from kernel to user mode.
> 
>> In case you'd use the same %cr3 (guest kernel one, I guess) for both
>> cases: are you really sure there is no problem in any hypervisor path
>> accessing guest data which would result in using guest kernel access
>> rights when coming from user mode (BTW: that was the security note I
>> had in v2 of my series).
> 
> I'm afraid I don't understand: Same %cr3? There are separate
> kernel and user page tables, requiring different values anyway.
> I also don't understand what problems in hypervisor code paths
> you suspect, when everything looks to work fine right now
> without PCID.

When switching between different page tables with the same PCID you need
to flush the TLB. That was my point.
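
To illustrate the point, here is a minimal sketch of how a %cr3 value
carries the PCID and the no-flush bit, and when a switch leaves stale
entries behind. The bit positions are architectural (PCID in bits 11:0
with CR4.PCIDE set, no-flush in bit 63 of the value written). The helper
names cr3_sketch() and switch_needs_flush() are made up for this mail
and are not Xen's make_cr3().

```c
#include <assert.h>
#include <stdint.h>

/* Architectural layout with CR4.PCIDE=1. */
#define CR3_NOFLUSH   (UINT64_C(1) << 63)           /* keep TLB entries on write */
#define CR3_PCID_MASK UINT64_C(0xfff)               /* bits 11:0 */
#define CR3_ADDR_MASK UINT64_C(0x000ffffffffff000)  /* bits 51:12 */

/* Hypothetical helper: combine page table base, PCID and no-flush bit. */
static uint64_t cr3_sketch(uint64_t pt_base, unsigned int pcid, int noflush)
{
    assert((pt_base & ~CR3_ADDR_MASK) == 0);  /* page aligned, below 2^52 */
    assert(pcid <= CR3_PCID_MASK);
    return pt_base | pcid | (noflush ? CR3_NOFLUSH : 0);
}

/*
 * Hypothetical check: a switch between different page tables that keeps
 * the same PCID would leave stale translations unless the TLB is flushed.
 */
static int switch_needs_flush(uint64_t old_cr3, uint64_t new_cr3)
{
    int same_pcid   = (old_cr3 & CR3_PCID_MASK) == (new_cr3 & CR3_PCID_MASK);
    int same_tables = (old_cr3 & CR3_ADDR_MASK) == (new_cr3 & CR3_ADDR_MASK);

    return same_pcid && !same_tables;
}
```

With kernel and user page tables under one PCID the check asks for a
flush; giving each its own PCID is what lets the no-flush bit be used.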

> 
>>> Global pages in PCID 0 could then still be permitted, and wouldn't
>>> ever need flushing except when FLUSH_TLB_GLOBAL is requested.
>>>
>>> As to the use of two separate PCIDs for PV kernel and user modes
>>> - while this helps isolation, it prevents recovering the non-XPTI
>>> property of user mode TLB entries surviving in-guest mode switches.
>>
>> I don't get that. With PCID the guest's kernel _and_ user entries
>> will survive in-guest mode switches as there is no TLB flushing
>> involved (the no-flush bit is set in v->arch.cr3 for both modes).
>>
>> The only downside is guest kernel accesses to user pages: they will
>> need additional TLB entries as the PCID is different.
> 
> That's the point I was trying to make. This was further explained
> in my previous reply a little further down.
> 
>>> I wonder whether this is part of the reason you see PCID have a
>>> negative effect in the non-XPTI case.
>>>
>>> So in the end the question is: Why not use just two PCIDs, and
>>> allow global pages just like we do now, with the added benefit
>>> that we no longer need to flush Xen's global TLB entries just
>>> because we want to get rid of PV guest user ones.
>>
>> I can't see how that would work without either needing some more TLB
>> flushes in order to prevent stale TLB entries or losing the Meltdown
>> mitigation.
>>
>> Which %cr3/PCID combination should be used in hypervisor, guest kernel
>> and guest user mode?
> 
> Xen would run with PCID 0 (and full Xen mappings) at all times
> (except early entry and late exit code of course). The guest would
> run with PCID 1 (and minimal Xen mappings) at all times. The switch
> of PCID eliminates the need for flushes on the way out and back in.

You still need the guest kernel's TLB entries flushed when switching to
user mode, right?

> 
>> Which pages would be global?
> 
> Use of global pages would continue to be as today: Xen has some,
> and guest user mode has some. Of course it is quite possible that
> the use of global pages with a single guest PCID is still worse than
> no global pages with two guest PCIDs, but that's a separate step
> to take (and measure) imo.

But global pages of Xen would either leave it vulnerable to Meltdown, or
you'd need a TLB flush again when switching between Xen and guest, making
all the PCID stuff moot.

> 
>>>> I don't
>>>> want to use global guest user pages together with PCID as flushing
>>>> global pages from the TLB with PCID enabled requires flushing either
>>>> the complete TLB or you'd have to use INVLPG in all possible address
>>>> spaces (so you'd need to have multiple %cr3 switches).
>>>
>>> Well, yes, flushing _individual_ pages is a problem in that case.
>>> As to multiple CR3 switches - are these all that bad really with
>>> the no-flush bit set? With the reduced number of PCIDs in actual
>>> use (as discussed above) "all possible address spaces" would
>>> mean just two. And I could imagine that in a number of cases
>>> just one INVLPG (with the right PCID active) might suffice.
>>>
>>> One complicating factor is that we don't want to introduce
>>> Xen TLB entries for other than what we map in the minimal page
>>> tables into PV guest PCID space, which would happen if we
>>> simply switched PCID around an INVLPG.
>>>
>>> What I don't understand in any event is why you need separate
>>> PCIDs for Xen depending on whether the active PV guest is in
>>> kernel or user mode.
>>
>> The main reason is the different page tables anchored in %cr3.
> 
> Hmm, right, looks like we can't have the best of both worlds: We'd
> like the Xen part of the address space to be shared, but the guest
> part of it to be separate. Question then still is whether the reduced
> flushing outweighs the reduced sharing. IOW between what we
> have today (a single PCID and a lot of flushing) and what you
> introduce (four PCIDs and very little flushing) is a middle approach -
> two PCIDs plus some flushing.

As I believe the kernel touches far fewer user pages than kernel pages,
I'm pretty sure my approach is better in most cases.

And TBH I'm still not sure the "some flushing" wouldn't be too
expensive.

So let's compare the possibilities:

My approach:
- no global pages
- 4 different PCIDs
- no TLB flushes needed when switching between Xen and guest
- no TLB flushes needed when switching between guest user and kernel
- flushing of single pages (guest or Xen) rather simple (4 INVPCIDs)
- flushing of complete TLB via 1 INVPCID
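
The two flushing items above can be sketched as follows. This only
builds the INVPCID operands, it does not execute the (privileged)
instruction. The descriptor layout and type numbers are the
architectural ones; numbering the four PCIDs 0..3 and the names
flush_one_page_ops() and flush_all_op() are assumptions for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* INVPCID invalidation types as defined by the architecture. */
enum invpcid_type {
    INVPCID_INDIV_ADDR      = 0,  /* one address in one PCID */
    INVPCID_SINGLE_CTXT     = 1,  /* all non-global entries of one PCID */
    INVPCID_ALL_INCL_GLOBAL = 2,  /* everything, including global entries */
    INVPCID_ALL_NON_GLOBAL  = 3,  /* everything except global entries */
};

/* 128-bit descriptor: PCID in bits 11:0, linear address in bits 127:64. */
struct invpcid_desc {
    uint64_t pcid;
    uint64_t addr;
};

struct invpcid_op {
    enum invpcid_type type;
    struct invpcid_desc desc;
};

#define NR_PCIDS 4  /* assumption: the scheme's four PCIDs are 0..3 */

/* Hypothetical: one individual-address INVPCID per PCID for a single page. */
static size_t flush_one_page_ops(uint64_t addr, struct invpcid_op ops[NR_PCIDS])
{
    for (unsigned int pcid = 0; pcid < NR_PCIDS; pcid++) {
        ops[pcid].type = INVPCID_INDIV_ADDR;
        ops[pcid].desc.pcid = pcid;
        ops[pcid].desc.addr = addr;
    }
    return NR_PCIDS;
}

/* Hypothetical: a complete flush is a single all-context INVPCID. */
static struct invpcid_op flush_all_op(void)
{
    struct invpcid_op op = { INVPCID_ALL_INCL_GLOBAL, { 0, 0 } };

    return op;
}
```

No %cr3 write is needed for either operation, which is why the
single-page case stays simple even with four address spaces.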

2 PCIDs (Xen and guest), keeping guest user pages as global pages
- Xen can't use global pages - global bit must be handled dynamically
  for Xen pages (or do we want to drop global pages e.g. for AMD, too?)
- 2 PCIDs
- no TLB flushes needed when switching between Xen and guest
- when switching from guest kernel to guest user the kernel pages must
  be flushed from TLB
- flushing of single guest user pages needs 2 changes of %cr3 and 2
  INVLPGs, switch code must be mapped to guest page tables
- flushing of complete TLB via 1 INVPCID

So the advantage of the 2 PCID solution is a single TLB entry per guest
user page, compared to 2 entries for guest user pages accessed by the
guest kernel or Xen.

The disadvantages are the flushed guest kernel pages when executing user
code, the more complicated single user page flushing and the dynamic
Xen global bit handling.
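
A toy model of the trade-off, in case it helps the discussion. It is
not how any real TLB works (no eviction, no page sizes) and all names in
it are made up. It only tracks which (PCID, page) translations are
cached, so one can count duplicated user entries in the 4 PCID case
against lost kernel entries in the 2 PCID case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 32  /* toy size; the model never evicts */

struct tlb_entry {
    int valid;
    int global;         /* global entries match under any PCID */
    unsigned int pcid;
    uint64_t vpn;       /* virtual page number */
};

struct tlb {
    struct tlb_entry e[TLB_SLOTS];
};

/* Look up a translation and insert it on a miss. Returns 1 on a hit. */
static int tlb_access(struct tlb *t, unsigned int pcid, uint64_t vpn, int global)
{
    struct tlb_entry *free_slot = NULL;

    for (size_t i = 0; i < TLB_SLOTS; i++) {
        struct tlb_entry *e = &t->e[i];

        if (!e->valid) {
            if (!free_slot)
                free_slot = e;
            continue;
        }
        if (e->vpn == vpn && (e->global || e->pcid == pcid))
            return 1;
    }
    assert(free_slot != NULL);  /* toy model: assume the TLB never fills */
    *free_slot = (struct tlb_entry){ 1, global, pcid, vpn };
    return 0;
}

/* Drop all non-global entries tagged with the given PCID. */
static void tlb_flush_pcid_nonglobal(struct tlb *t, unsigned int pcid)
{
    for (size_t i = 0; i < TLB_SLOTS; i++)
        if (t->e[i].valid && !t->e[i].global && t->e[i].pcid == pcid)
            t->e[i].valid = 0;
}

/* Count how many entries cache the given page, under any PCID. */
static size_t tlb_entries_for(const struct tlb *t, uint64_t vpn)
{
    size_t n = 0;

    for (size_t i = 0; i < TLB_SLOTS; i++)
        if (t->e[i].valid && t->e[i].vpn == vpn)
            n++;
    return n;
}
```

Touching one user page from guest user and guest kernel mode under
different PCIDs leaves two entries in this model. With one guest PCID
and global user pages there is a single entry, but the flush on the way
to user mode makes the next kernel access miss again.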


Juergen

