
Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server

On 4/11/2016 7:15 PM, Yu, Zhang wrote:

On 4/8/2016 7:01 PM, George Dunlap wrote:
On 08/04/16 11:10, Yu, Zhang wrote:
BTW, I noticed your reply was not CCed to the mailing list, and I also
wonder whether we should raise this last question with the community?

Oops -- that was a mistake on my part.  :-)  I appreciate the
discretion; just so you know in the future, if I'm purposely changing
the CC list (removing xen-devel and/or adding extra people), I'll almost
always say so at the top of the mail.

And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
transition -- I assume that live migration is incompatible with this
functionality?  Is there anything that prevents a live migration from
being started when there are outstanding p2m_ioreq_server entries?

Another good question, and the answer is unfortunately yes. :-)

If live migration happens during the normal emulation process, entries
marked with p2m_ioreq_server will be changed to p2m_ram_logdirty in
resolve_misconfig(), and later write operations will change them to
p2m_ram_rw; after that, accesses to these pages can no longer be
forwarded to the device model. From this point of view, this
functionality is incompatible with live migration.
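
The transition described above can be modelled in a few lines of C. This is only a toy sketch for illustration, not Xen code: the `toy_` enum and helpers are hypothetical stand-ins for the real p2m types and for the recalculation that resolve_misconfig() is described as performing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the p2m types named in the thread (NOT Xen's definitions). */
typedef enum {
    TOY_RAM_RW,
    TOY_RAM_LOGDIRTY,
    TOY_IOREQ_SERVER
} toy_p2m_type;

/*
 * Hypothetical recalculation when log-dirty mode is enabled: if
 * ioreq_server is treated as a changeable type, it is rewritten to
 * log-dirty just like ordinary writable RAM.
 */
static toy_p2m_type toy_recalc_for_logdirty(toy_p2m_type t)
{
    if (t == TOY_IOREQ_SERVER || t == TOY_RAM_RW)
        return TOY_RAM_LOGDIRTY;
    return t;
}

/* A guest write to a log-dirty page turns the entry back into plain RAM. */
static toy_p2m_type toy_guest_write(toy_p2m_type t)
{
    return t == TOY_RAM_LOGDIRTY ? TOY_RAM_RW : t;
}

/* Only entries still typed ioreq_server reach the device model. */
static bool toy_write_is_forwarded(toy_p2m_type t)
{
    return t == TOY_IOREQ_SERVER;
}
```

After one recalculation and one guest write, the entry ends up as plain RAM and the device model no longer sees writes to it, which is the loss of control discussed in this thread.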

But for XenGT, I think this is acceptable, because if live migration
is to be supported in the future, intervention from the backend device
model will be necessary. At that time, we can guarantee from the device
model side that there are no outdated p2m_ioreq_server entries, hence no
need to reset the p2m type back to p2m_ram_rw (and no need to include
p2m_ioreq_server in P2M_CHANGEABLE_TYPES). By "outdated", I mean that
entries marked with p2m_ioreq_server should be regarded as outdated
after an ioreq server is detached from p2m_ioreq_server, or before an
ioreq server is attached to this type.

Is this acceptable to you? Any suggestions?

So the question is, as of this series, what happens if someone tries to
initiate a live migration while there are outstanding p2m_ioreq_server
entries?

If the answer is "the ioreq server suddenly loses all control of the
memory", that's something that needs to be changed.

Sorry, for this patch series, I'm afraid the above description is the
answer.

Besides, I find it hard to change the current code to support both the
deferred resetting of p2m_ioreq_server and live migration at the
same time. One reason is that a page with p2m_ioreq_server behaves
differently in different situations.

My assumption for XenGT is that, for live migration to work, the device
model should guarantee there are no outstanding p2m_ioreq_server pages
in the hypervisor (no need to use the deferred recalculation), and it is
our device model that should be responsible for copying the
write-protected guest pages later.

Another solution I can think of: when unmapping the ioreq server,
we walk the p2m table and reset entries with p2m_ioreq_server back
directly, instead of deferring the reset. Of course, this has a
performance impact. But since mapping and unmapping an ioreq
server is not a frequent operation, the performance penalty may be
acceptable. What do you think of this approach?
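
A minimal sketch of the synchronous reset proposed above, assuming the p2m can be viewed as a flat array of per-gfn types. The real p2m is a multi-level table, and every `sweep_` name here is hypothetical, so this only shows the shape of the walk and why its cost is linear in the number of entries.

```c
#include <assert.h>
#include <stddef.h>

/* Toy p2m types for illustration only (NOT Xen's definitions). */
typedef enum {
    SWEEP_RAM_RW,
    SWEEP_RAM_LOGDIRTY,
    SWEEP_IOREQ_SERVER
} sweep_p2m_type;

/*
 * Hypothetical unmap path: visit every entry once and reset any
 * ioreq_server entry to plain RAM immediately, rather than deferring
 * the change. Returns the number of entries that were reset.
 */
static size_t sweep_unmap_ioreq_server(sweep_p2m_type *p2m, size_t nr_gfns)
{
    size_t nr_reset = 0;

    assert(p2m != NULL || nr_gfns == 0);

    for (size_t gfn = 0; gfn < nr_gfns; gfn++) {
        if (p2m[gfn] == SWEEP_IOREQ_SERVER) {
            p2m[gfn] = SWEEP_RAM_RW;
            nr_reset++;
        }
    }
    return nr_reset;
}
```

Because the sweep completes before the unmap returns, no stale entries remain for a later log-dirty recalculation to trip over; the price is a full walk on every unmap.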

George, sorry to bother you. Any comments on the above option? :)

Another choice might be to let live migration fail if there are
outstanding p2m_ioreq_server entries. But I'm not quite inclined to do
so, because:
1> I'd still like to keep the live migration feature for XenGT.
2> It is not easy to know whether there are outstanding p2m_ioreq_server
entries. I mean, since a p2m type change is not only triggered by
hypercall, keeping a counter of the remaining p2m_ioreq_server entries
means a lot of code changes.
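
For illustration, the counter idea could look roughly like the sketch below. It assumes, unlike the real code as described above, that every type change goes through one setter; the `cnt_` struct and functions are hypothetical and do not exist in Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy p2m types for illustration only (NOT Xen's definitions). */
typedef enum { CNT_RAM_RW, CNT_RAM_LOGDIRTY, CNT_IOREQ_SERVER } cnt_p2m_type;

/* Hypothetical flat p2m plus a count of outstanding ioreq_server entries. */
typedef struct {
    cnt_p2m_type *types;
    size_t nr_gfns;
    size_t nr_ioreq_entries;
} cnt_p2m;

/* Single choke point for type changes; keeps the counter in sync. */
static void cnt_set_type(cnt_p2m *p2m, size_t gfn, cnt_p2m_type nt)
{
    assert(gfn < p2m->nr_gfns);
    cnt_p2m_type ot = p2m->types[gfn];

    if (ot == CNT_IOREQ_SERVER && nt != CNT_IOREQ_SERVER)
        p2m->nr_ioreq_entries--;
    else if (ot != CNT_IOREQ_SERVER && nt == CNT_IOREQ_SERVER)
        p2m->nr_ioreq_entries++;
    p2m->types[gfn] = nt;
}

/* Refuse to enter log-dirty mode while ioreq_server entries remain. */
static bool cnt_enable_logdirty(cnt_p2m *p2m)
{
    if (p2m->nr_ioreq_entries != 0)
        return false;               /* migration fails, guest keeps running */

    for (size_t gfn = 0; gfn < p2m->nr_gfns; gfn++)
        if (p2m->types[gfn] == CNT_RAM_RW)
            p2m->types[gfn] = CNT_RAM_LOGDIRTY;
    return true;
}
```

The difficulty raised in point 2 is exactly the assumption this sketch makes: if types can change on paths that bypass the setter, the counter drifts and the check is no longer trustworthy.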

Besides, I wonder whether the requirement to reset p2m_ioreq_server
is indispensable; could we let the device model side be responsible
for this? The worst case I can imagine if the device model fails to do
so is that operations on a gfn might be delivered to the wrong device
model. I'm not clear what kind of damage this would cause to the
hypervisor or other VMs.

Do any other maintainers have any suggestions?
Thanks in advance! :)

If the answer is, "everything just works", that's perfect.

If the answer is, "Before logdirty mode is set, the ioreq server has the
opportunity to detach, removing the p2m_ioreq_server entries, and
operating without that functionality", that's good too.

If the answer is, "the live migration request fails and the guest
continues to run", that's also acceptable.  If you want this series to
be checked in today (the last day for 4.7), this is probably your best


