
Re: [Xen-devel] [PATCH v3 3/3] tools: introduce parameter max_wp_ram_ranges.





On 2/4/2016 2:21 AM, George Dunlap wrote:
On Wed, Feb 3, 2016 at 5:41 PM, George Dunlap
<George.Dunlap@xxxxxxxxxxxxx> wrote:
I think at some point I suggested an alternate design based on marking
such gpfns with a special p2m type; I can't remember if that
suggestion was actually addressed or not.

FWIW, the thread where I suggested using p2m types was in response to

<1436163912-1506-2-git-send-email-yu.c.zhang@xxxxxxxxxxxxxxx>

Looking through it again, the main objection Paul gave[1] was:

"And it's the assertion that use of write_dm will only be relevant to
gfns, and that all such notifications only need go to a single ioreq
server, that I have a problem with. Whilst the use of io ranges to
track gfn updates is, I agree, not ideal I think the overloading of
write_dm is not a step in the right direction."

Two issues raised here, about using only p2m types to implement write_dm:
1. More than one ioreq server may want to use the write_dm functionality
2. ioreq servers may want to use write_dm for things other than individual gpfns

My answer to #1 was:
1. At the moment, we only need to support a single ioreq server using write_dm
2. It's not technically difficult to extend the number of servers
supported to something sensible, like 4 (using 4 different write_dm
p2m types)
3. The interface can be designed such that we can extend support to
multiple servers when we need to.

My answer to #2 was that there's no reason why write_dm couldn't be
used for both individual gpfns and ranges; there's no reason the
interface can't take a "start" and "count" argument, even if for the
time being "count" is almost always going to be 1.
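
To make the "start" and "count" idea concrete, here is a minimal sketch
of such an interface over a toy per-gfn type table. Everything in it
(toy_p2m, toy_set_write_dm, toy_is_write_dm, TOY_NR_GFNS) is a
hypothetical name invented for illustration; it is not Xen's p2m code
or any interface proposed in this series.

```c
#include <stdint.h>

/* Hypothetical toy model: one type byte per gfn, NOT Xen's real p2m. */
enum toy_p2m_type { TOY_RAM_RW = 0, TOY_WRITE_DM = 1 };

#define TOY_NR_GFNS 1024

static uint8_t toy_p2m[TOY_NR_GFNS];

/*
 * Mark the gfns in [start, start + count) as write_dm.
 * Returns 0 on success, -1 if the range is empty or out of bounds;
 * on failure no entry is modified.
 */
int toy_set_write_dm(uint64_t start, uint64_t count)
{
    if (count == 0 || start >= TOY_NR_GFNS || count > TOY_NR_GFNS - start)
        return -1;

    for (uint64_t i = 0; i < count; i++)
        toy_p2m[start + i] = TOY_WRITE_DM;

    return 0;
}

/* Lookup is a single array read: no tree walk is involved. */
int toy_is_write_dm(uint64_t gfn)
{
    return gfn < TOY_NR_GFNS && toy_p2m[gfn] == TOY_WRITE_DM;
}
```

A caller tracking a single page would pass count = 1; a caller tracking
a contiguous block passes the block length through the same entry point.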


Well, as for "count" almost always being 1: I doubt that. :)
Statistics from XenGT show that GPU page tables are very likely to
be allocated in contiguous gpfns.
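
As an illustration of why contiguity matters for a range-based
interface, the sketch below counts how many [start, end] ranges are
needed to cover a sorted list of gpfns. The function name count_ranges
and the sample values in the note after it are made up for this
example; they are not XenGT data.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the ranges needed to cover a list of gfns.
 * Assumes the list is sorted in ascending order with no duplicates.
 * A new range starts whenever a gfn does not directly follow its
 * predecessor.
 */
size_t count_ranges(const uint64_t *gfns, size_t n)
{
    size_t ranges = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || gfns[i] != gfns[i - 1] + 1)
            ranges++;
    }

    return ranges;
}
```

For the seven gfns 100, 101, 102, 103, 200, 300, 301 this yields three
ranges, so the number of tracked ranges can be well below the number of
tracked pages when allocations are contiguous.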

Compare this to the downsides of the approach you're proposing:
1. Using 40 bytes of hypervisor space per guest GPU pagetable page (as
opposed to using a bit in the existing p2m table)
2. Walking down an RB tree with 8000 individual nodes to find out
which server to send the message to (rather than just reading the
value from the p2m table).

8K is an upper limit for the rangeset; in many cases the RB tree will
not contain that many nodes.
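
For a rough sense of the worst case being discussed, the sketch below
works through the arithmetic using the two figures quoted in this
thread: 40 bytes per tracked range and a limit of 8K (taken here as
8192) ranges. The helper names are hypothetical, and the depth
calculation assumes a perfectly balanced binary tree; a red-black tree
can be up to roughly twice as deep.

```c
#include <stddef.h>

#define BYTES_PER_RANGE 40u    /* per-range cost quoted in the thread */
#define MAX_RANGES      8192u  /* the 8K limit, assumed to mean 8192 */

/* Hypervisor memory used by nr_ranges tracked ranges, in bytes. */
size_t worst_case_bytes(size_t nr_ranges)
{
    return nr_ranges * BYTES_PER_RANGE;
}

/*
 * Number of levels in a perfectly balanced binary tree of n nodes,
 * i.e. floor(log2(n)) + 1 for n > 0, and 0 for an empty tree.
 */
unsigned balanced_depth(size_t n)
{
    unsigned depth = 0;

    while (n) {
        depth++;
        n >>= 1;
    }

    return depth;
}
```

With these assumptions, worst_case_bytes(MAX_RANGES) is 327680 bytes
(320 KiB) per guest, and a lookup in a full tree visits on the order of
14 nodes in the balanced case, compared with a single table read when
the information is held in the p2m type.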

3. Needing to determine on a guest-by-guest basis whether to change the limit
4. Needing to have an interface to make the limit even bigger, just in
case we find workloads that have even more GTTs.


Well, as I suggested in yesterday's reply, XenGT can choose not to
change this limit even when workloads get heavy, at the cost of
tradeoffs on the device model side.

I really don't understand where you're coming from on this.  The
approach you've chosen looks to me to be slower, more difficult to
implement, and more complicated; and it's caused a lot more resistance
trying to get this series accepted.


I agree that using p2m types for this is more efficient and quite
intuitive. But I hesitate to occupy the software-available bits in EPT
PTEs (as in Andrew's reply). Although we have introduced one, we believe
it can also be used for other situations in the future, not just XenGT.

Thanks
Yu



 

