
Re: [Xen-devel] [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.





On 6/20/2016 9:13 PM, George Dunlap wrote:
On 20/06/16 12:28, Yu Zhang wrote:

On 6/20/2016 6:55 PM, Jan Beulich wrote:
On 20.06.16 at 12:32, <george.dunlap@xxxxxxxxxx> wrote:
On 20/06/16 11:25, Jan Beulich wrote:
On 20.06.16 at 12:10, <george.dunlap@xxxxxxxxxx> wrote:
On 20/06/16 10:03, Yu Zhang wrote:
However, there are conflicts if we take live migration into account,
i.e. if live migration is triggered by the user (perhaps
unintentionally) during the GPU emulation process, resolve_misconfig()
will set all the outstanding p2m_ioreq_server entries to p2m_log_dirty,
which is not what we expected, because our intention is only to reset
the outdated p2m_ioreq_server entries back to p2m_ram_rw.
Well the real problem in the situation you describe is that a second
"lazy" p2m_change_entry_type_global() operation is starting before the
first one is finished.  All that's needed to resolve the situation is
that if you get a second p2m_change_entry_type_global() operation while
there are outstanding entries from the first type change, you have to
finish the first operation (i.e., go "eagerly" find all the
misconfigured entries and change them to the new type) before starting
the second one.
Eager resolution of outstanding entries can't be the solution here, I
think, as that would - afaict - be as time consuming as doing the type
change synchronously right away.
But isn't it the case that p2m_change_entry_type_global() is only
implemented for EPT?
Also for NPT, we're using a similar model in p2m-pt.c (see e.g. the
uses of RECALC_FLAGS - we're utilizing the _PAGE_USER set
unconditionally leads to NPF). And since shadow sits on top of
p2m-pt, that should be covered too.

   So we've been doing the slow method for both
shadow and AMD HAP (whatever it's called these days) since the
beginning.  And in any case we'd only have to go for the "slow" case in
circumstances where the 2nd type change happened before the first one
had completed.
We can't even tell when one has fully finished.
I agree, we have no idea whether the previous type change is completely
done. Besides, IIUC, p2m_change_entry_type_global() is not really a slow
method, because it does not invalidate all the paging structure entries
at once; it just writes the upper level ones, and the others are updated
in resolve_misconfig().
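
To make the "lazy" behaviour described above concrete, here is a minimal toy model in Python. It is not Xen code: the two-level table, the `stale` flags, the `writes` counter and the `ToyP2M` class are all assumptions made for illustration, and only the two function names are borrowed from the thread. It shows that the global change touches only the top-level entries, while leaf entries are recalculated when they are next looked up.

```python
# Toy model only; names mirror the thread but nothing here is Xen's code.
RAM_RW, LOGDIRTY = "ram_rw", "logdirty"


class ToyP2M:
    def __init__(self, n_top, n_leaf):
        # leaves[i][j] holds the type of one guest frame.
        self.leaves = [[RAM_RW] * n_leaf for _ in range(n_top)]
        # One hypothetical "needs recalculation" flag per top-level entry.
        self.stale = [False] * n_top
        self.global_logdirty = False
        self.writes = 0  # number of table writes, to compare costs

    def change_entry_type_global(self, logdirty):
        """Lazy global change: record the new policy, mark top-level entries."""
        self.global_logdirty = logdirty
        for i in range(len(self.stale)):
            self.stale[i] = True
            self.writes += 1

    def resolve_misconfig(self, i):
        """Recalculate the leaf types under top-level entry i, if it is stale."""
        if not self.stale[i]:
            return
        new_type = LOGDIRTY if self.global_logdirty else RAM_RW
        for j, t in enumerate(self.leaves[i]):
            if t in (RAM_RW, LOGDIRTY):  # only "changeable" types are touched
                self.leaves[i][j] = new_type
                self.writes += 1
        self.stale[i] = False

    def lookup(self, i, j):
        # An access to a stale region triggers the deferred recalculation.
        self.resolve_misconfig(i)
        return self.leaves[i][j]
```

With `p2m = ToyP2M(4, 512)`, calling `p2m.change_entry_type_global(True)` performs 4 writes rather than 2048; the remaining work is paid per region on first access. The model also illustrates why it is hard to say when a type change has "finished": regions that are never accessed stay marked stale indefinitely.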

   p2m_change_entry_type_global(),
at least right now, can be invoked freely without prior type changes
having fully propagated. The logic resolving mis-configured entries
simply needs to be able to know the correct new type. I can't see
why this logic shouldn't therefore be extensible to this new type
which can be in flight - after all, we ought to have a way to know what
type a particular GFN is supposed to be?
Actually, come to think of it -- since the first type change is meant to
convert all ioreq_server -> ram_rw, and the second is meant to change
all ram_rw -> logdirty,  is there any case in which we *wouldn't* want
the resulting type to be logdirty?  Isn't that exactly what we'd get if
we'd done both operations synchronously?
I think Yu's concern is for pages which did not get converted back?
Or on the restore side? Otherwise - "yes" to both of your questions.

Yes. My concern is that resolve_misconfig() cannot easily be extended
to differentiate between the p2m_ioreq_server entries which need to be
reset and the normal p2m_ioreq_server entries.
Under what circumstance should resolve_misconfig() change a
misconfigured entry into a p2m_ioreq_server entry?

Oh, I did not mean that. The resolve_misconfig() routine should not
change any entry into the p2m_ioreq_server type. I hoped this routine
could be changed to reset outdated p2m_ioreq_server entries (by
"outdated" I mean entries which are no longer tracked by an ioreq
server but remain as p2m_ioreq_server) back to the p2m_ram_rw type.

Later I realized that we may also change the normal p2m_ioreq_server
entries (by "normal" I mean the gfns which are currently being
emulated) if live migration is triggered during the emulation process.
And it is hard to distinguish the outdated p2m_ioreq_server entries
from the normal ones.
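
The ambiguity can be sketched with a small toy example in Python. This is not Xen code: both resolver functions and the `tracked_gfns` set are hypothetical, introduced only to illustrate the point. A resolver that sees only the stored type has to treat outdated and normal p2m_ioreq_server entries alike, whereas one that can ask which gfns an ioreq server still claims (the kind of "know what type a GFN is supposed to be" information mentioned earlier in the thread) can tell them apart.

```python
# Toy illustration only; every name below is an assumption, not Xen's API.
RAM_RW, LOGDIRTY, IOREQ_SERVER = "ram_rw", "logdirty", "ioreq_server"


def resolve_by_type_only(stored_type, global_logdirty):
    """Resolver that sees just the stored type and the global log-dirty flag."""
    if stored_type in (RAM_RW, LOGDIRTY, IOREQ_SERVER):
        return LOGDIRTY if global_logdirty else RAM_RW
    return stored_type


def resolve_with_tracking(gfn, stored_type, global_logdirty, tracked_gfns):
    """Resolver that can also ask whether an ioreq server still claims gfn.

    tracked_gfns is a hypothetical set of frames currently under emulation.
    """
    if stored_type == IOREQ_SERVER and gfn in tracked_gfns:
        return IOREQ_SERVER  # "normal" entry: leave it alone
    return resolve_by_type_only(stored_type, global_logdirty)


# 0x100 is outdated (no longer tracked), 0x101 is still being emulated.
p2m = {0x100: IOREQ_SERVER, 0x101: IOREQ_SERVER, 0x102: RAM_RW}
tracked = {0x101}

# Live migration has turned on global log-dirty in both cases.
naive = {g: resolve_by_type_only(t, True) for g, t in p2m.items()}
aware = {g: resolve_with_tracking(g, t, True, tracked) for g, t in p2m.items()}
```

In `naive`, all three frames end up as log-dirty, including 0x101, which is the outcome I am worried about. In `aware`, only 0x101 keeps its p2m_ioreq_server type. Whether such tracking information is cheaply available inside resolve_misconfig() is exactly the open question.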

Thanks
Yu
