
Re: [Xen-devel] Question about VPID during MOV-TO-CR3



>>> On 04.10.16 at 17:06, <tim@xxxxxxx> wrote:
> At 08:29 -0600 on 04 Oct (1475569774), Jan Beulich wrote:
>> >>> On 04.10.16 at 16:12, <tamas.lengyel@xxxxxxxxxxxx> wrote:
>> > yes, I understand that is the case when you do need to flush a guest.
>> > And yes, there seem to be paths that require bumping the tag of a
>> > specific guest for certain events (mov-to-cr4 with paging mode changes
>> > for example). What I'm poking at here is that we invalidate the
>> > guest TLBs for _all_ guests very frequently. I can't find an
>> > explanation for why _that_ is required. AFAIK having the TLB tag
>> > guarantees that no other guest or Xen will have a chance to bump into
>> > stale entries given no guests or Xen share a TLB tag with each other.
>> > So the only time I see that we would have to flush all guest TLBs is
>> > when the tag overflows and we start from 1 again. What am I missing
>> > here?
>> 
>> Oh, I see - this indeed looks to be quite a bit more flushing than is
>> desirable. So the question, as you did put it already, is why it got
>> done that way in the first place. At the very least it would look like
>> more control would need to be given to the callers of both
>> write_cr3() and flush_area_local(). Tim?
> 
> IIRC:
>  - Remote TLB flushes are used for safety, e.g. to be sure that no
>    guest has a mapping of a page before its type or owner changes.
>    The callers rely on _all_ mappings of the page being gone after
>    the remote flush.  The simplest way to do that is to flush all tags.

Ah, of course. And that means that no matter that Tamas observed
no breakage with some of the flushing removed, it can't be dropped
altogether.
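
For readers following along, here is a minimal sketch of how "flush all
tags" can be done lazily with a generation counter, and why only tag
overflow forces a real TLB flush. It is a simplified model loosely based
on how per-core ASID/VPID allocation is commonly done; all names
(core_tags, vcpu_tag, flush_all_tags, tag_for_vmentry) and the tiny
MAX_ASID are assumptions made for illustration, not Xen's actual code,
and generation wrap-around is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: names and sizes are hypothetical, not Xen's code. */
#define MAX_ASID 4u  /* tiny so that overflow is easy to trigger */

struct core_tags {
    uint64_t generation;  /* bumping this invalidates every guest tag */
    uint32_t next_asid;   /* next unused tag; 0 is left for the hypervisor */
};

struct vcpu_tag {
    uint64_t generation;  /* generation in which asid was handed out */
    uint32_t asid;
};

static void core_init(struct core_tags *c)
{
    c->generation = 1;
    c->next_asid = 1;
}

/*
 * "Flush all tags": no TLB entry is touched here. Every vCPU's tag simply
 * becomes stale and is replaced lazily on that vCPU's next VM entry.
 */
static void flush_all_tags(struct core_tags *c)
{
    c->generation++;
}

/*
 * Pick the tag for a VM entry. Returns true when the whole TLB must really
 * be flushed because tag values are about to be reused.
 */
static bool tag_for_vmentry(struct core_tags *c, struct vcpu_tag *v)
{
    if (v->generation == c->generation)
        return false;               /* tag still valid, nothing to do */

    if (c->next_asid > MAX_ASID) {  /* tag space exhausted: start over */
        c->generation++;
        c->next_asid = 1;
    }

    v->asid = c->next_asid++;
    v->generation = c->generation;
    assert(v->asid >= 1 && v->asid <= MAX_ASID);

    /* First tag of a fresh cycle (this includes the very first allocation). */
    return v->asid == 1;
}
```

In this model a flush_all_tags() call costs each vCPU one new tag on its
next entry, so frequent "flush everything" requests burn through the tag
space and bring the real, overflow-triggered flush closer.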

>  - We believed that on the then-current hardware, and with the
>    scheduling timeslice we had, there wasn't an awful lot of
>    benefit to keeping the tags of descheduled VMs around.
>  - Although it might sometimes be safe to leave some tags unflushed,
>    it wasn't clear exactly when that would be.  E.g. I don't think
>    that whether the tag is 'current' is a very useful test -- either
>    the tag might contain dangerous mappings or it might not.
> 
> Since there are cases where we already mask TLB flushes by domain
> (using the dirty-cpumask) I can see that we might pass that domain ID
> to the remote CPU and drop only that domain's tags.
> 
> And for HAP guests it may be possible to distinguish between "guest"
> flushes (e.g. emulating guest CR3 writes) and "hypervisor" flushes
> (e.g. after grant/p2m ops), and target "guest" flushes at particular
> VCPUs.

Right. Question is whether there are any such operations
occurring frequently enough that optimizing this would make
sense. I don't see HVM code paths leading to write_cr3(), and
I don't think there are a whole lot leading to flush_area_local().
Did you gain any insight in this regard, Tamas?

The thing that would really help us would be some INVLPG
equivalent allowing a size/mask to be provided along with the
address (as that other path in flush_area_local() doesn't have
all these problems). Otoh, Tim - if INVLPG was sufficient for order
zero, how come ASID based full invalidation is required on the
other path? Wouldn't this need to be accompanied by a suitable
INVVPID/INVLPGA?

Jan

> Both of those will want careful unpicking from existing safety
> mechanisms that assume that a flush is a flush.  E.g. the
> tlbflush_timestamp used on page allocation skips a shootdown if _any_
> TLB flush has happened on the remote PCPU since the page was freed.
> Partial flushes can't count towards that.  And there might be other
> gotchas that I can't think of right now.
> 
> Cheers,
> 
> Tim.
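
The timestamp concern in Tim's last paragraph can be sketched as follows.
This is a simplified model under stated assumptions, not Xen's
implementation: the names (tlbflush_clock, cpu_last_full_flush,
page_freed, full_flush, partial_flush, need_shootdown) and NR_CPUS are
hypothetical, and clock wrap-around is ignored. The point it shows is
that a partial (e.g. per-domain) flush must not advance the per-CPU
stamp, otherwise a later page allocation would wrongly skip a shootdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: all names here are hypothetical. */
#define NR_CPUS 4u

static uint64_t tlbflush_clock = 1;            /* global, only ever increases */
static uint64_t cpu_last_full_flush[NR_CPUS];  /* clock at last FULL flush    */

struct page {
    uint64_t tlbflush_timestamp;  /* clock value when the page was freed */
};

/* Record when the page was freed; stale mappings may exist from here on. */
static void page_freed(struct page *pg)
{
    pg->tlbflush_timestamp = tlbflush_clock;
}

/* A full flush drops every mapping on this CPU, so it may advance the stamp. */
static void full_flush(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    cpu_last_full_flush[cpu] = ++tlbflush_clock;
}

/*
 * A partial flush (one domain's tags only) may leave other mappings of the
 * page in the TLB, so it deliberately leaves the stamp untouched.
 */
static void partial_flush(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
}

/* On allocation: does this CPU still need a shootdown for the page? */
static bool need_shootdown(const struct page *pg, unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu_last_full_flush[cpu] <= pg->tlbflush_timestamp;
}
```

With this model, a page freed and then followed only by partial_flush()
on a CPU still reports need_shootdown() as true for that CPU; only a
later full_flush() on that CPU clears it.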



