
Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server





On 4/18/2016 11:57 PM, Paul Durrant wrote:
-----Original Message-----
From: Yu, Zhang [mailto:yu.c.zhang@xxxxxxxxxxxxxxx]
Sent: 14 April 2016 11:45
To: George Dunlap; Paul Durrant; xen-devel@xxxxxxxxxxxxx
Cc: Jan Beulich; Kevin Tian; Andrew Cooper; Lv, Zhiyuan; Tim (Xen.org);
jun.nakajima@xxxxxxxxx
Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
map guest ram with p2m_ioreq_server to an ioreq server

On 4/11/2016 7:15 PM, Yu, Zhang wrote:


On 4/8/2016 7:01 PM, George Dunlap wrote:
On 08/04/16 11:10, Yu, Zhang wrote:
[snip]
BTW, I noticed your reply has not been CCed to the mailing list, and I
also wonder if we should raise this last question in the community?

Oops -- that was a mistake on my part.  :-)  I appreciate the
discretion; just so you know in the future, if I'm purposely changing
the CC list (removing xen-devel and/or adding extra people), I'll almost
always say so at the top of the mail.

And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
transition -- I assume that live migration is incompatible with this
functionality?  Is there anything that prevents a live migration from
being started when there are outstanding p2m_ioreq_server entries?


Another good question, and the answer is unfortunately yes. :-)

If live migration happens during the normal emulation process, entries
marked with p2m_ioreq_server will be changed to p2m_ram_logdirty in
resolve_misconfig(), and later write operations will change them to
p2m_ram_rw; after that, accesses to these pages can no longer be
forwarded to the device model. From this point of view, this
functionality is incompatible with live migration.
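
To illustrate the behaviour described above (a simplified, hypothetical
sketch only, not the actual resolve_misconfig() code; the helper name is
made up):

    /* Once p2m_ioreq_server is included among the "changeable" types,
     * a recalculation while log-dirty mode is active folds such entries
     * into p2m_ram_logdirty, and the write fault that follows promotes
     * them to p2m_ram_rw, so the ioreq server no longer intercepts
     * accesses to those pages. */
    static p2m_type_t recalc_changeable_type(p2m_type_t t, bool in_logdirty_range)
    {
        switch ( t )
        {
        case p2m_ram_rw:
        case p2m_ram_logdirty:
        case p2m_ioreq_server:   /* added to the changeable set by this series */
            return in_logdirty_range ? p2m_ram_logdirty : p2m_ram_rw;
        default:
            return t;            /* other types are left untouched */
        }
    }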

But for XenGT, I think this is acceptable, because, if live migration
is to be supported in the future, intervention from the backend device
model will be necessary. At that time, we can guarantee from the device
model side that there are no outdated p2m_ioreq_server entries, hence no
need to reset the p2m type back to p2m_ram_rw (and no need to include
p2m_ioreq_server in P2M_CHANGEABLE_TYPES). By "outdated", I mean that
entries marked with p2m_ioreq_server should be regarded as outdated
once an ioreq server is detached from this type, or before an ioreq
server has been attached to it.

Is this acceptable to you? Any suggestions?

So the question is, as of this series, what happens if someone tries to
initiate a live migration while there are outstanding p2m_ioreq_server
entries?

If the answer is "the ioreq server suddenly loses all control of the
memory", that's something that needs to be changed.


Sorry, for this patch series, I'm afraid the above description is the
answer.

Besides, I find it hard to change the current code to support both the
deferred resetting of p2m_ioreq_server and live migration at the same
time. One reason is that a page with p2m_ioreq_server behaves
differently in different situations.

My assumption for XenGT is that, for live migration to work, the device
model should guarantee there are no outstanding p2m_ioreq_server pages
in the hypervisor (so there is no need for the deferred recalculation),
and it is our device model that should be responsible for copying the
write-protected guest pages later.

Another solution I can think of: when unmapping the ioreq server, we
walk the p2m table and reset entries with p2m_ioreq_server back to
p2m_ram_rw directly, instead of deferring the reset. Of course, this
has a performance impact, but since mapping and unmapping an ioreq
server is not a frequent operation, the penalty may be acceptable.
What do you think about this approach? (A rough sketch follows below.)
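
A minimal sketch of what such a synchronous sweep might look like,
assuming a hypothetical helper called from the unmap path (a real patch
would also need proper locking and preemption for large guests):

    static void sweep_ioreq_server_entries(struct domain *d)
    {
        struct p2m_domain *p2m = p2m_get_hostp2m(d);
        unsigned long gfn;

        /* Walk the gfn space and reset any remaining p2m_ioreq_server
         * entries back to p2m_ram_rw right away, instead of leaving
         * them to the deferred misconfig recalculation. */
        for ( gfn = 0; gfn <= p2m->max_mapped_pfn; gfn++ )
        {
            p2m_type_t t;

            get_gfn_query_unlocked(d, gfn, &t);
            if ( t == p2m_ioreq_server )
                p2m_change_type_one(d, gfn, p2m_ioreq_server, p2m_ram_rw);
        }
    }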


George, sorry to bother you. Any comments on the above option? :)

Another choice might be to let live migration fail if there are
outstanding p2m_ioreq_server entries. But I'm not quite inclined to do
so, because:
1> I'd still like to keep the live migration feature for XenGT.
2> It's not easy to know whether there are outstanding p2m_ioreq_server
entries. Since a p2m type change is not only triggered by the hypercall,
keeping a counter of the remaining p2m_ioreq_server entries would mean
a lot of code changes (roughly sketched below);
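
Purely illustrative, to show why: a hypothetical counter, say
p2m->ioreq.entry_count, would need to be kept in sync at every place a
gfn changes to or from p2m_ioreq_server (in p2m_change_type_one(), in
the EPT misconfig recalculation, and so on), not just in the
hvmop_set_mem_type path. With ot/nt being the old and new types,
something like:

    if ( nt == p2m_ioreq_server )
        p2m->ioreq.entry_count++;   /* gfn newly given to the ioreq server */
    else if ( ot == p2m_ioreq_server )
        p2m->ioreq.entry_count--;   /* gfn reset away from the ioreq server */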

Besides, I wonder whether the requirement to reset p2m_ioreq_server is
really indispensable; could we make the device model side responsible
for this? The worst case I can imagine if the device model fails to do
so is that operations on a gfn might be delivered to the wrong device
model. I'm not clear what kind of damage this would cause to the
hypervisor or to other VMs.

Do any other maintainers have any suggestions?

Note that it is a requirement that an ioreq server be disabled before VM 
suspend. That means ioreq server pages essentially have to go back to ram_rw 
semantics.

   Paul


OK. So it should be the hypervisor's responsibility to do the resetting.
Now we probably have 2 choices:
1> We reset the p2m type synchronously when the ioreq server is
unmapped, instead of deferring it to the misconfig handling. This means
a performance impact from traversing the p2m table.

Or
2> We just disallow live migration when p2m->ioreq.server is not NULL.
This is not quite accurate, because having an ioreq server mapped to
p2m_ioreq_server does not necessarily mean there are such outstanding
entries. To be more accurate, we could add some other rough checks,
e.g. check p2m->ioreq.server against NULL and also check whether
hvmop_set_mem_type has ever been invoked for the p2m_ioreq_server type
(sketched just below).
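
Just to illustrate choice 2>, assuming the check sits wherever log-dirty
mode gets enabled; ever_set_ioreq_server is a made-up flag standing in
for the "has hvmop_set_mem_type ever been used with p2m_ioreq_server"
test:

    /* Refuse to start log-dirty mode / live migration while an ioreq
     * server claims p2m_ioreq_server, or if that type has ever been
     * handed out, since outstanding entries may still exist. */
    if ( p2m_get_hostp2m(d)->ioreq.server != NULL ||
         p2m_get_hostp2m(d)->ioreq.ever_set_ioreq_server )
        return -EOPNOTSUPP;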

Both choices seem suboptimal to me. I wonder if we have any better
solutions?

Thanks
Yu

Thanks in advance! :)
If the answer is, "everything just works", that's perfect.

If the answer is, "Before logdirty mode is set, the ioreq server has the
opportunity to detach, removing the p2m_ioreq_server entries, and
operating without that functionality", that's good too.

If the answer is, "the live migration request fails and the guest
continues to run", that's also acceptable.  If you want this series to
be checked in today (the last day for 4.7), this is probably your best
bet.

   -George





Regards
Yu
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

