
Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server



> -----Original Message-----
> From: Yu, Zhang [mailto:yu.c.zhang@xxxxxxxxxxxxxxx]
> Sent: 14 April 2016 11:45
> To: George Dunlap; Paul Durrant; xen-devel@xxxxxxxxxxxxx
> Cc: Jan Beulich; Kevin Tian; Andrew Cooper; Lv, Zhiyuan; Tim (Xen.org);
> jun.nakajima@xxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to
> map guest ram with p2m_ioreq_server to an ioreq server
> 
> On 4/11/2016 7:15 PM, Yu, Zhang wrote:
> >
> >
> > On 4/8/2016 7:01 PM, George Dunlap wrote:
> >> On 08/04/16 11:10, Yu, Zhang wrote:
> >> [snip]
> >>> BTW, I noticed your reply was not CCed to the mailing list, and I also
> >>> wonder if we should raise this last question with the community?
> >>
> >> Oops -- that was a mistake on my part.  :-)  I appreciate the
> >> discretion; just so you know in the future, if I'm purposely changing
> >> the CC list (removing xen-devel and/or adding extra people), I'll almost
> >> always say so at the top of the mail.
> >>
> >>>> And then of course there's the p2m_ioreq_server -> p2m_ram_logdirty
> >>>> transition -- I assume that live migration is incompatible with this
> >>>> functionality?  Is there anything that prevents a live migration from
> >>>> being started when there are outstanding p2m_ioreq_server entries?
> >>>>
> >>>
> >>> Another good question, and the answer is unfortunately yes. :-)
> >>>
> >>> If live migration happens during the normal emulation process, entries
> >>> marked with p2m_ioreq_server will be changed to p2m_ram_logdirty in
> >>> resolve_misconfig(), and later write operations will change them to
> >>> p2m_ram_rw; after that, accesses to these pages can no longer be
> >>> forwarded to the device model. From this point of view, this
> >>> functionality is incompatible with live migration.
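> >>>
> >>> Roughly, the transition I'm describing is the following (just an
> >>> illustrative sketch of the two steps, not the actual
> >>> resolve_misconfig() code):
> >>>
> >>> /* Illustrative sketch only -- not the actual Xen code. */
> >>> #include <stdbool.h>
> >>>
> >>> typedef enum {
> >>>     p2m_ram_rw,
> >>>     p2m_ram_logdirty,
> >>>     p2m_ioreq_server,
> >>> } p2m_type_t;
> >>>
> >>> /* When a global recalculation runs with log-dirty enabled (as at the
> >>>  * start of live migration), an outstanding p2m_ioreq_server entry is
> >>>  * recalculated to p2m_ram_logdirty... */
> >>> static p2m_type_t recalc_type(p2m_type_t old, bool logdirty_enabled)
> >>> {
> >>>     if ( old == p2m_ioreq_server && logdirty_enabled )
> >>>         return p2m_ram_logdirty;
> >>>     return old;
> >>> }
> >>>
> >>> /* ...and the first guest write then flips the entry to p2m_ram_rw, so
> >>>  * accesses to that page are no longer forwarded to the device model. */
> >>> static p2m_type_t on_guest_write(p2m_type_t cur)
> >>> {
> >>>     return cur == p2m_ram_logdirty ? p2m_ram_rw : cur;
> >>> }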
> >>>
> >>> But for XenGT, I think this is acceptable, because if live migration
> >>> is to be supported in the future, intervention from the backend device
> >>> model will be necessary. At that time, we can guarantee from the device
> >>> model side that there are no outdated p2m_ioreq_server entries, hence no
> >>> need to reset the p2m type back to p2m_ram_rw (and no need to include
> >>> p2m_ioreq_server in P2M_CHANGEABLE_TYPES). By "outdated", I mean that
> >>> when an ioreq server is detached from p2m_ioreq_server, or before an
> >>> ioreq server is attached to this type, entries marked with
> >>> p2m_ioreq_server should be regarded as outdated.
> >>>
> >>> Is this acceptable to you? Any suggestions?
> >>
> >> So the question is, as of this series, what happens if someone tries to
> >> initiate a live migration while there are outstanding p2m_ioreq_server
> >> entries?
> >>
> >> If the answer is "the ioreq server suddenly loses all control of the
> >> memory", that's something that needs to be changed.
> >>
> >
> > Sorry, for this patch series, I'm afraid the above description is the
> > answer.
> >
> > Besides, I find it hard to change the current code to support both the
> > deferred resetting of p2m_ioreq_server and live migration at the same
> > time. One reason is that a page with p2m_ioreq_server behaves
> > differently in different situations.
> >
> > My assumption for XenGT is that, for live migration to work, the device
> > model should guarantee there are no outstanding p2m_ioreq_server pages
> > in the hypervisor (so there is no need for the deferred recalculation),
> > and it is our device model that should be responsible for copying the
> > write-protected guest pages later.
> >
> > And another solution I can think of: when unmapping the ioreq server,
> > we walk the p2m table and directly reset entries with p2m_ioreq_server,
> > instead of deferring the reset. Of course, this has a performance
> > impact, but since mapping and unmapping an ioreq server is not a
> > frequent operation, the penalty may be acceptable. What do you think
> > about this approach?
> >
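> > Roughly, the sweep I have in mind is something like the following
> > (just an illustrative sketch, not a real patch, and assuming the
> > existing p2m_change_type_one() helper can be reused for this):
> >
> > /* Illustrative sketch: at unmap time, walk the p2m once and put every
> >  * remaining p2m_ioreq_server entry straight back to p2m_ram_rw,
> >  * instead of leaving it to the deferred recalculation. */
> > #include <xen/sched.h>
> > #include <asm/p2m.h>
> >
> > static void sweep_ioreq_server_entries(struct domain *d)
> > {
> >     struct p2m_domain *p2m = p2m_get_hostp2m(d);
> >     unsigned long gfn;
> >
> >     /* O(p2m size) -- this is where the one-off performance cost comes
> >      * from; entries of any other type are left untouched. */
> >     for ( gfn = 0; gfn <= p2m->max_mapped_pfn; gfn++ )
> >         p2m_change_type_one(d, gfn, p2m_ioreq_server, p2m_ram_rw);
> > }
> >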
> 
> George, sorry to bother you. Any comments on the above option? :)
> 
> Another choice might be to let live migration fail if there are
> outstanding p2m_ioreq_server entries. But I'm not inclined to do so,
> because:
> 1> I'd still like to keep the live migration feature for XenGT.
> 2> It is not easy to know whether there are outstanding p2m_ioreq_server
> entries. Since p2m type changes are not triggered only by the hypercall,
> keeping a counter of remaining p2m_ioreq_server entries would mean a lot
> of code changes (a rough sketch of what I mean is below);
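> 
> (To be concrete, by a counter I mean roughly the sketch below --
> illustrative only, the names are made up; the difficulty is keeping it
> in sync everywhere an entry's type can change, including the
> recalc/misconfig paths, not just in the hypercall handler:)
> 
> #include <asm/p2m.h>
> 
> /* Illustrative sketch only; the names are placeholders. */
> struct ioreq_server_accounting {
>     unsigned long entry_count;  /* outstanding p2m_ioreq_server gfns */
> };
> 
> /* Would have to be called from *every* place an entry changes type. */
> static void account_type_change(struct ioreq_server_accounting *acct,
>                                 p2m_type_t ot, p2m_type_t nt)
> {
>     if ( ot == p2m_ioreq_server && nt != p2m_ioreq_server )
>         acct->entry_count--;
>     else if ( ot != p2m_ioreq_server && nt == p2m_ioreq_server )
>         acct->entry_count++;
> }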
> 
> Besides, I wonder whether the requirement to reset p2m_ioreq_server
> entries is really indispensable; could we make the device model side
> responsible for this? The worst case I can imagine, if the device model
> fails to do so, is that accesses to a gfn might be delivered to the
> wrong device model. I'm not clear what kind of damage this would cause
> to the hypervisor or to other VMs.
> 
> Do any other maintainers have any suggestions?

Note that it is a requirement that an ioreq server be disabled before VM 
suspend. That means ioreq server pages essentially have to go back to ram_rw 
semantics.
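
In other words, the device model's suspend path needs to do something like 
the sketch below (the dm_* names are placeholders for illustration, not the 
exact interface added by this series):

/* Illustrative sketch of the device-model suspend path implied above;
 * the dm_* names are placeholders, not the API of this series. */
struct dm_state;

int dm_unmap_mem_type_from_ioreq_server(struct dm_state *dm); /* placeholder */
int dm_disable_ioreq_server(struct dm_state *dm);             /* placeholder */

static int dm_prepare_for_suspend(struct dm_state *dm)
{
    int rc;

    /* Stop claiming the p2m_ioreq_server type first, so remaining
     * entries can revert to plain ram_rw semantics... */
    rc = dm_unmap_mem_type_from_ioreq_server(dm);
    if ( rc )
        return rc;

    /* ...then disable the ioreq server itself before the VM is
     * suspended. */
    return dm_disable_ioreq_server(dm);
}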

  Paul

> Thanks in advance! :)
> >> If the answer is, "everything just works", that's perfect.
> >>
> >> If the answer is, "Before logdirty mode is set, the ioreq server has the
> >> opportunity to detach, removing the p2m_ioreq_server entries, and
> >> operating without that functionality", that's good too.
> >>
> >> If the answer is, "the live migration request fails and the guest
> >> continues to run", that's also acceptable.  If you want this series to
> >> be checked in today (the last day for 4.7), this is probably your best
> >> bet.
> >>
> >>   -George
> >>
> >>
> >>
> >
> 
> Regards
> Yu
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
