
Re: [Xen-devel] [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.





On 3/24/2017 6:19 PM, Jan Beulich wrote:
On 24.03.17 at 10:05, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
On 3/23/2017 4:57 PM, Jan Beulich wrote:
On 23.03.17 at 04:23, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
On 3/22/2017 10:21 PM, Jan Beulich wrote:
On 21.03.17 at 03:52, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
@@ -177,8 +178,64 @@ static int hvmemul_do_io(
            break;
        case X86EMUL_UNHANDLEABLE:
        {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        /*
+         * Xen isn't emulating the instruction internally, so see if
+         * there's an ioreq server that can handle it. Rules:
+         *
+         * - PIO and "normal" MMIO run through hvm_select_ioreq_server()
+         * to choose the ioreq server by range. If no server is found,
+         * the access is ignored.
+         *
+         * - p2m_ioreq_server accesses are handled by the designated
+         * ioreq_server for the domain, but there are some corner
+         * cases:
+         *
+         *   - If the domain ioreq_server is NULL, assume there is a
+         *   race between the unbinding of ioreq server and guest fault
+         *   so re-try the instruction.
And that retry won't come back here because of? (The answer
should not include any behavior added by subsequent patches.)
You got me. :)
In this patch, the retry will come back here. Only after patch 4 or
patch 5 will the retry be ignored (the p2m type is changed back to
p2m_ram_rw after the unbinding).
In which case I think we shouldn't insist on you to change things, but
you should spell out very clearly that this patch should not go in
without the others going in at the same time.
So maybe it would be better if we left the retry part to a later patch,
say patch 4/5 or patch 5/5,
and returned unhandleable in this patch?
I don't follow. I've specifically suggested that you don't change
the code, but simply state clearly the requirement that patches
2...5 of this series should all go in at the same time. I don't mind
you making changes, but the risk then is that further round trips
may be required because of there being new issues with the
changes you may do.

Thanks, Jan. I'll keep the code, and add a note in the commit message of this patch.
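
For reference, the two dispatch rules quoted from the patch comment above can be modelled in isolation. This is only a sketch under stated assumptions: every name below (demo_dispatch, the demo_* enums and struct) is hypothetical and is not Xen code, and it covers only the cases spelled out in the quoted comment.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not Xen's types. */
enum demo_access {
    DEMO_ACCESS_RANGE,        /* PIO or "normal" MMIO, chosen by range */
    DEMO_ACCESS_IOREQ_PAGE    /* access to a p2m_ioreq_server page */
};

enum demo_outcome {
    DEMO_SEND,      /* forward the request to an ioreq server */
    DEMO_IGNORE,    /* no server claims the range: ignore the access */
    DEMO_RETRY      /* designated server is gone: retry the instruction */
};

struct demo_server { int id; };

/*
 * by_range:   result of the range lookup (NULL if none matched).
 * designated: the domain's designated ioreq server (NULL if unbound).
 */
static enum demo_outcome demo_dispatch(enum demo_access kind,
                                       const struct demo_server *by_range,
                                       const struct demo_server *designated)
{
    if (kind == DEMO_ACCESS_IOREQ_PAGE)
        /* A NULL server is assumed to mean a race with unbinding. */
        return designated ? DEMO_SEND : DEMO_RETRY;

    return by_range ? DEMO_SEND : DEMO_IGNORE;
}
```

In this model a retry only stops recurring once something outside the function changes the access kind, which is the role the later patches play by switching the type back to p2m_ram_rw.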

--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -172,7 +172,7 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
        if ( *p2mt == p2m_mmio_direct )
            goto direct_mmio_out;
        rc = NESTEDHVM_PAGEFAULT_MMIO;
-    if ( *p2mt == p2m_mmio_dm )
+    if ( *p2mt == p2m_mmio_dm || *p2mt == p2m_ioreq_server )
Btw., how does this addition match up with the rc value being
assigned right before the if()?
Well, returning NESTEDHVM_PAGEFAULT_MMIO in such a case will trigger
handle_mmio() later in hvm_hap_nested_page_fault(). I guess that is
what we expected.
That's probably what is expected, but it's no MMIO which we're
doing in that case. And note that we've stopped abusing
handle_mmio() for non-MMIO purposes a little while ago (commit
3dd00f7b56 ["x86/HVM: restrict permitted instructions during
special purpose emulation"]).
OK. So how about we just remove this "*p2mt == p2m_ioreq_server" check?
Well, you must have had a reason to add it. To be honest, I don't
care too much about the nested code (as it's far from production
ready anyway), so leaving the code above untouched would be
fine with me, but taking care of adjustments to nested code where
they're actually needed would be even better. So the preferred
option is for you to explain why you've done the change above,
and why you think it's correct/needed. The next best option might
be to drop the change.

Got it. I now prefer to drop the change. This code was added at an early stage of this patch set, when we hoped p2m_ioreq_server could always trigger handle_mmio(), but frankly we do not use the nested case, and probably
will not in the foreseeable future.

--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
                entry->r = entry->w = entry->x = 1;
                entry->a = entry->d = !!cpu_has_vmx_ept_ad;
                break;
+        case p2m_ioreq_server:
+            entry->r = 1;
+            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
Is this effectively open coded p2m_get_ioreq_server() actually
okay? If so, why does the function need to be used elsewhere,
instead of doing direct, lock-free accesses?
Maybe your comment is about whether it is necessary to use the lock in
p2m_get_ioreq_server()?
I still believe so: it protects not only the value of the ioreq server,
but also the flags that go with it.

Besides, it is used not only in the emulation process, but also in the
hypercall to set the mem type.
So the lock can still provide some kind of protection against
p2m_set_ioreq_server(), even if it does not always do so.
The question, fundamentally, is about consistency: The same
access model should be followed universally, unless there is an
explicit reason for an exception.
Sorry, I do not quite understand. Why is the consistency broken?
Because you don't call p2m_get_ioreq_server() here (discarding
the return value, but using the flags).

Oh, I see. You are worried about p2m->ioreq.server/flags being cleared due to an unmap.
Below are some situations I can think of...

I think this lock at least protects the ioreq server and the flags. The
only exception is the one you mentioned - s could become stale, which we
agreed to let the device model check. Without this lock, things would
become more complex - more race conditions...
Sure, all understood. I wasn't really suggesting to drop the locked
accesses, but instead I was using this to illustrate the non-locked
access (and hence the inconsistency with other code) here. As
said - if there's a good reason not to call the function here, I'm all
ears.

... ept_p2m_type_to_flags() is used in 2 cases:
1> In resolve_misconfig(), to recalculate a p2m entry. In this case we won't meet a p2m_ioreq_server type in ept_p2m_type_to_flags(), because the type has already been recalculated
back to p2m_ram_rw in the caller.

2> Triggered by p2m_set_entry(), which is trying to set the mem type for some gfns. The only scenario I can imagine (and an extreme one) with any potential for a race is this: while the mem type is being set, the ioreq server unmapping is triggered on another cpu, which invalidates the value of p2m->ioreq.flags. In that case ept_p2m_type_to_flags() will produce an entry with write permission. But right after the mem type setting is done, the p2m lock will be released and the unmapping hypercall will get the opportunity to reset this p2m
entry. We do not need a write-protected entry in such a case anyway.

Besides, even if p2m_get_ioreq_server() were used here in ept_p2m_type_to_flags(), it could provide only limited protection: there is still a chance that the flags seen in ept_p2m_type_to_flags()
would be outdated in the above situation.

So, since we do not really care about the p2m entry permission in such an extreme situation, and we cannot 100% guarantee that the lock will protect it, I do not think we need to use the lock here.

I am not sure whether this explanation is convincing to you, but I'm also open to being convinced. :-)
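
To make the permission argument concrete, here is a minimal stand-alone sketch of how the write bit follows from the flags, mirroring the two assignments in the p2m-ept.c hunk quoted above. The names demo_entry, demo_type_to_flags and DEMO_ACCESS_WRITE are hypothetical, and the bit value chosen for DEMO_ACCESS_WRITE is an assumption made only for this example, not the real value of XEN_DMOP_IOREQ_MEM_ACCESS_WRITE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; the real constant's value is not assumed here. */
#define DEMO_ACCESS_WRITE (1u << 0)

/* Hypothetical stand-in for the read/write bits of an EPT entry. */
struct demo_entry {
    bool r;
    bool w;
};

/*
 * Derive permissions for a p2m_ioreq_server-typed page from a snapshot
 * of the ioreq flags, read without taking any lock.
 */
static struct demo_entry demo_type_to_flags(uint32_t ioreq_flags)
{
    struct demo_entry e;

    e.r = true;                                  /* reads never trap */
    e.w = !(ioreq_flags & DEMO_ACCESS_WRITE);    /* writes trap only if claimed */
    return e;
}
```

With DEMO_ACCESS_WRITE set in the snapshot, the entry is read-only, so guest writes fault and can be forwarded. If an unmap on another cpu has already cleared the flags, the snapshot is 0 and the entry comes out writable, which is the harmless outcome described in case 2 above.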

Thanks
Yu


Jan



