[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.





On 6/14/2016 6:45 PM, Jan Beulich wrote:
On 19.05.16 at 11:05, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
let one ioreq server claim/disclaim its responsibility for the
handling of guest pages with p2m type p2m_ioreq_server. Users
of this HVMOP can specify which kind of operation is supposed
to be emulated in a parameter named flags. Currently, this HVMOP
only support the emulation of write operations. And it can be
easily extended to support the emulation of read ones if an
ioreq server has such requirement in the future.
Didn't we determine that this isn't as easy as everyone first thought?

My understanding is that to emulate read, we need to change the definition
of is_epte_present(), and I do not think this change will cause much trouble. But since no one is using the read emulation, I am convinced the more cautious
way is to only support emulations for write operations for now.

@@ -178,8 +179,33 @@ static int hvmemul_do_io(
          break;
      case X86EMUL_UNHANDLEABLE:
      {
-        struct hvm_ioreq_server *s =
-            hvm_select_ioreq_server(curr->domain, &p);
+        struct hvm_ioreq_server *s;
+        p2m_type_t p2mt;
+
+        if ( is_mmio )
+        {
+            unsigned long gmfn = paddr_to_pfn(addr);
+
+            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
+
+            if ( p2mt == p2m_ioreq_server )
+            {
+                unsigned long flags;
+
+                s = p2m_get_ioreq_server(currd, &flags);
+
+                if ( dir == IOREQ_WRITE &&
+                     !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
Shouldn't this be

                 if ( dir != IOREQ_WRITE ||
                      !(flags & P2M_IOREQ_HANDLE_WRITE_ACCESS) )
                     s = NULL;

in which case the question is whether you wouldn't better avoid
calling p2m_get_ioreq_server() in the first place when
dir != IOREQ_WRITE.

You are right. Since dir with IOREQ_READ is not supposed to enter this code path, I'd better change above code to check value of dir first before calling p2m_get_ioreq_server().

+                    s = NULL;
+            }
+            else
+                s = hvm_select_ioreq_server(currd, &p);
+        }
+        else
+        {
+            p2mt = p2m_invalid;
What is this needed for? In fact it looks like the variable decaration
could move into the next inner scope (alongside gmfn, which is
questionable to be a local variable anyway, considering that is gets
used just once).

Got it. Thanks.

@@ -914,6 +916,45 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, 
ioservid_t id,
      return rc;
  }
+int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
+                                     uint16_t type, uint32_t flags)
I see no reason why both can't be unsigned int.

Parameter type is passed in from the type field inside struct xen_hvm_map_mem_type_to_ioreq_server, which is a uint16_t, followed with a uint16_t pad. Now I am wondering, may be we can just remove the pad
field in this structure and just define type as uint32_t.

--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -132,6 +132,12 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, 
ept_entry_t *entry,
              entry->r = entry->w = entry->x = 1;
              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
              break;
+        case p2m_ioreq_server:
+            entry->r = entry->x = 1;
Why x?

Setting entry->x to 1 is not a must. I can remove it. :)

@@ -94,8 +96,16 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t 
mfn,
      default:
          return flags | _PAGE_NX_BIT;
      case p2m_grant_map_ro:
-    case p2m_ioreq_server:
          return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
+    case p2m_ioreq_server:
+    {
+        flags |= P2M_BASE_FLAGS | _PAGE_RW;
+
+        if ( p2m->ioreq.flags & P2M_IOREQ_HANDLE_WRITE_ACCESS )
+            return flags & ~_PAGE_RW;
+        else
+            return flags;
+    }
Same here (for the missing _PAGE_NX) plus no need for braces.

I'll remove the brace. And we do not need to set the _PAGE_NX_BIT, like the p2m_ram_ro case I guess.


@@ -289,6 +291,74 @@ void p2m_memory_type_changed(struct domain *d)
      }
  }
+int p2m_set_ioreq_server(struct domain *d,
+                         unsigned long flags,
Why "long" and not just "int"?

You are right, will change to unsigned int in next version.

+                         struct hvm_ioreq_server *s)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc;
+
+    spin_lock(&p2m->ioreq.lock);
+
+    if ( flags == 0 )
+    {
+        rc = -EINVAL;
+        if ( p2m->ioreq.server != s )
+            goto out;
+
+        /* Unmap ioreq server from p2m type by passing flags with 0. */
+        p2m->ioreq.server = NULL;
+        p2m->ioreq.flags = 0;
+    }
What does "passing" refer to in the comment?

It means if this routine is called with flags=0, it will unmap the ioreq server.

+    else
+    {
+        rc = -EBUSY;
+        if ( p2m->ioreq.server != NULL )
+            goto out;
+
+        p2m->ioreq.server = s;
+        p2m->ioreq.flags = flags;
+    }
+
+    rc = 0;
+
+ out:
+    spin_unlock(&p2m->ioreq.lock);
+
+    return rc;
+}
+
+struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
+                                              unsigned long *flags)
Again  why "long" and not just "int"?

Got it.  Thanks.

+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    struct hvm_ioreq_server *s;
+
+    spin_lock(&p2m->ioreq.lock);
+
+    s = p2m->ioreq.server;
+    *flags = p2m->ioreq.flags;
+
+    spin_unlock(&p2m->ioreq.lock);
+    return s;
+}
Locking is somewhat strange here: You protect against the "set"
counterpart altering state while you retrieve it, but you don't
protect against the returned data becoming stale by the time
the caller can consume it. Is that not a problem? (The most
concerning case would seem to be a race of hvmop_set_mem_type()
with de-registration of the type.)

Yes. The case you mentioned might happen. But it's not a big deal I guess. If such case happens, the backend driver will receive an io request and can then discard it
if it has just de-registered the mem type.

+void p2m_destroy_ioreq_server(struct domain *d,
+                              struct hvm_ioreq_server *s)
const

@@ -336,6 +336,24 @@ struct p2m_domain {
          struct ept_data ept;
          /* NPT-equivalent structure could be added here. */
      };
+
+    struct {
+        spinlock_t lock;
+        /*
+         * ioreq server who's responsible for the emulation of
+         * gfns with specific p2m type(for now, p2m_ioreq_server).
+         */
+        struct hvm_ioreq_server *server;
+        /*
+         * flags specifies whether read, write or both operations
+         * are to be emulated by an ioreq server.
+         */
+        unsigned int flags;
+
+#define P2M_IOREQ_HANDLE_WRITE_ACCESS HVMOP_IOREQ_MEM_ACCESS_WRITE
+#define P2M_IOREQ_HANDLE_READ_ACCESS  HVMOP_IOREQ_MEM_ACCESS_READ
Is there anything wrong with using the HVMOP_* values directly?

--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -89,7 +89,9 @@ typedef enum {
      HVMMEM_unused,             /* Placeholder; setting memory to this type
                                    will fail for code after 4.7.0 */
  #endif
-    HVMMEM_ioreq_server
+    HVMMEM_ioreq_server        /* Memory type claimed by an ioreq server; type
+                                  changes to this value are only allowed after
+                                  an ioreq server has claimed its ownership */
Missing trailing full stop.

@@ -383,6 +385,37 @@ struct xen_hvm_set_ioreq_server_state {
  typedef struct xen_hvm_set_ioreq_server_state 
xen_hvm_set_ioreq_server_state_t;
  DEFINE_XEN_GUEST_HANDLE(xen_hvm_set_ioreq_server_state_t);
+/*
+ * HVMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
+ *                                      to specific memroy type <type>
+ *                                      for specific accesses <flags>
+ *
+ * For now, flags only accept the value of HVMOP_IOREQ_MEM_ACCESS_WRITE,
+ * which means only write operations are to be forwarded to an ioreq server.
+ * Support for the emulation of read operations can be added when an ioreq
+ * server has such requirement in future.
+ */
+#define HVMOP_map_mem_type_to_ioreq_server 26
+struct xen_hvm_map_mem_type_to_ioreq_server {
+    domid_t domid;      /* IN - domain to be serviced */
+    ioservid_t id;      /* IN - ioreq server id */
+    uint16_t type;      /* IN - memory type */
+    uint16_t pad;
This field does not appear to get checked in the handler.

I am now wondering, how about we remove this pad field and define type as uint32_t?

Thanks
Yu

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.