
Re: [Xen-devel] [PATCH v3 3/3] tools: introduce parameter max_wp_ram_ranges.



Hi George,

On Fri, Feb 05, 2016 at 11:05:39AM +0000, George Dunlap wrote:
> On Fri, Feb 5, 2016 at 3:44 AM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
> >> > So as long as the currently-in-use GTT tree contains no more than
> >> > $LIMIT ranges, you can unshadow and reshadow; this will be slow, but
> >> > strictly speaking correct.
> >> >
> >> > What do you do if the guest driver switches to a GTT such that the
> >> > entire tree takes up more than $LIMIT entries?
> >>
> >> The GPU has some special properties, different from the CPU, which
> >> make things easier. The GPU page table is constructed by the CPU and
> >> used by GPU workloads; a GPU workload itself will not change the page
> >> table. Meanwhile, GPU workload submission, in a virtualized
> >> environment, is controlled by our device model. So we can reshadow the
> >> whole table every time before we submit a workload. That can reduce
> >> the total number of ranges required for write protection, but with a
> >> performance impact, because the GPU has more idle time waiting for the
> >> CPU. Hope the info helps. Thanks!
> >>
> >
> > Putting it another way: it is fully under mediation when a GPU page
> > table (GTT) will be referenced by the GPU, so there is plenty of room
> > to optimize the existing shadowing (which always shadows all recognized
> > GPU page tables), e.g. by shadowing only the active one when a VM is
> > scheduled in. It's a performance matter, not a correctness issue.
> >
> > This is why Yu asked earlier whether we can just set a default limit
> > which is good for the majority of use cases, while extending our
> > device model to drop/recreate some shadow tables when that limit is
> > hit. I think this matches how today's CPU shadow page tables are
> > implemented, which also have a limit on how many shadow pages are
> > allowed per VM.
> 
> I don't think you've understood my question (or maybe I still don't
> understand the situation properly).
> 
> So with memory pagetables, there's a "tree" rooted at a single
> top-level page, which points to other pages, and which defines one
> address space (usually corresponding to one process or thread). (This
> is often just referred to as 'cr3', since it's the value you write
> into the cr3 register on x86 processors.) I'm assuming that the
> structure is similar for your GPU translation tables -- that a single
> GTT is effectively a "tree", sort of like a process address space for
> an OS.
> 
> And it sounds like what you're saying is: suppose we have 10 different
> GTTs (i.e., 10 entire trees / gpu threads), and each one requires 1024
> ranges to shadow.  In that case, a limit of 8192 ranges means we can
> only keep 8 of the 10 actually shadowed at any one time.  This is not
> optimal, since it will occasionally mean unshadowing an entire GTT and
> re-shadowing another one, but it will work, because we can always make
> sure that the currently-active GTT is shadowed.
> 
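
A minimal, hypothetical sketch of such a drop/recreate policy might look
like the C below (none of these names are real Xen or XenGT interfaces,
and the 8192 budget is just the example figure used in this thread):

    /*
     * Illustrative sketch only -- not the real XenGT device-model code.
     * Keep the total number of write-protected ranges under a fixed
     * budget, evicting the shadow of an inactive GTT to make room for
     * the one the GPU is about to execute from.
     */
    #include <errno.h>
    #include <stdbool.h>

    #define MAX_WP_RANGES 8192      /* e.g. the 8192-range limit above  */
    #define MAX_GTTS      16        /* arbitrary for the sketch         */

    struct shadow_gtt {
        bool shadowed;              /* a shadow currently exists        */
        bool active;                /* the GPU is about to run from it  */
        unsigned int nr_ranges;     /* wp ranges its shadow needs       */
    };

    static struct shadow_gtt gtts[MAX_GTTS];
    static unsigned int total_ranges;

    /* Hypothetical helpers: tear down / build a shadow and its ranges. */
    void unshadow_gtt(struct shadow_gtt *g);
    void build_shadow(struct shadow_gtt *g);

    int shadow_gtt_for_run(struct shadow_gtt *next)
    {
        /* Evict inactive shadows until the new one fits in the budget. */
        for (int i = 0; i < MAX_GTTS; i++) {
            if (total_ranges + next->nr_ranges <= MAX_WP_RANGES)
                break;
            if (!gtts[i].shadowed || gtts[i].active)
                continue;
            unshadow_gtt(&gtts[i]);         /* drops its wp ranges */
            total_ranges -= gtts[i].nr_ranges;
            gtts[i].shadowed = false;
        }

        /* A single GTT that alone needs more than the whole budget
         * cannot be fully shadowed -- the case asked about below. */
        if (total_ranges + next->nr_ranges > MAX_WP_RANGES)
            return -ENOSPC;

        build_shadow(next);                 /* write-protects its ranges */
        total_ranges += next->nr_ranges;
        next->shadowed = true;
        next->active = true;
        return 0;
    }
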
> My question is, suppose a single GTT / gpu thread / tree has 9000
> ranges.  It would be trivial for an attacker to break into the
> operating system and *construct* such a tree, but it's also entirely
> possible that, due to a combination of memory fragmentation and very
> heavy usage, the normal driver might accidentally create such a GTT.
> In that case, the device model will not be able to write-protect all
> the pages in the single GTT, and thus will not be able to correctly
> track changes to the currently-active GTT.  What does your device
> model do in that case?

We can live with a partially write-protected tree, because the GPU's
workload execution is controlled by the device model. We still have a
chance to update the shadow page table before we submit a workload to
the GPU. The impact is on performance, not correctness. Thanks!
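
A minimal, hypothetical sketch of that submission path (these names do
not correspond to the real XenGT device-model code):

    #include <stdint.h>

    struct gtt_shadow {
        uint64_t    *guest_pages;   /* guest GTT page frames            */
        uint64_t    *shadow_pages;  /* shadow pages the GPU will walk   */
        unsigned int nr_pages;
        unsigned int nr_protected;  /* [0, nr_protected) are tracked by
                                     * write-protection; the rest aren't */
    };

    struct workload;

    /* Hypothetical helpers. */
    void sync_gtt_page(uint64_t guest_pfn, uint64_t shadow_pfn);
    void gpu_run(struct workload *wl);

    void submit_workload(struct gtt_shadow *s, struct workload *wl)
    {
        /*
         * Pages we failed to write-protect may have stale shadow
         * entries; re-sync them now.  The GPU never modifies the GTT
         * and only walks it while a workload runs, so syncing at
         * submission time is sufficient for correctness -- the cost is
         * extra GPU idle time, not corruption.
         */
        for (unsigned int i = s->nr_protected; i < s->nr_pages; i++)
            sync_gtt_page(s->guest_pages[i], s->shadow_pages[i]);

        gpu_run(wl);
    }

Correctness is restored at the last point before the GPU can walk the
table, so a page we could not track only costs extra GPU idle time.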

-Zhiyuan

> 
>  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
