
Re: [PATCH RFC 07/10] domain: map/unmap GADDR based shared guest areas



Hi Jan,

I am still digesting this series, so here are some initial comments.

On 19/10/2022 09:43, Jan Beulich wrote:
The registration by virtual/linear address has downsides: At least on
x86 the access is expensive for HVM/PVH domains. Furthermore for 64-bit
PV domains the areas are inaccessible (and hence cannot be updated by
Xen) when in guest-user mode.

In preparation of the introduction of new vCPU operations allowing to
register the respective areas (one of the two is x86-specific) by
guest-physical address, flesh out the map/unmap functions.

Noteworthy differences from map_vcpu_info():
- areas can be registered more than once (and de-registered),
- remote vCPU-s are paused rather than checked for being down (which in
   principle can change right after the check),
- the domain lock is taken for a much smaller region.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
RFC: By using global domain page mappings the demand on the underlying
      VA range may increase significantly. I did consider to use per-
      domain mappings instead, but they exist for x86 only. Of course we
      could have arch_{,un}map_guest_area() aliasing global domain page
      mapping functions on Arm and using per-domain mappings on x86. Yet
      then again map_vcpu_info() doesn't do so either (albeit that's
      likely to be converted subsequently to use map_vcpu_area() anyway).

RFC: In map_guest_area() I'm not checking the P2M type, instead - just
      like map_vcpu_info() - solely relying on the type ref acquisition.
      Checking for p2m_ram_rw alone would be wrong, as at least
      p2m_ram_logdirty ought to also be okay to use here (and in similar
      cases, e.g. in Argo's find_ring_mfn()). p2m_is_pageable() could be
      used here (like altp2m_vcpu_enable_ve() does) as well as in
      map_vcpu_info(), yet then again the P2M type is stale by the time
      it is being looked at anyway without the P2M lock held.

--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1563,7 +1563,82 @@ int map_guest_area(struct vcpu *v, paddr
                     struct guest_area *area,
                     void (*populate)(void *dst, struct vcpu *v))
  {
-    return -EOPNOTSUPP;
+    struct domain *currd = v->domain;
+    void *map = NULL;
+    struct page_info *pg = NULL;
+    int rc = 0;
+
+    if ( gaddr )

0 is technically a valid (guest) physical address on Arm.

+    {
+        unsigned long gfn = PFN_DOWN(gaddr);

This could be gfn_t for adding some type safety.

+        unsigned int align;
+        p2m_type_t p2mt;
+
+        if ( gfn != PFN_DOWN(gaddr + size - 1) )
+            return -ENXIO;
+
+#ifdef CONFIG_COMPAT
+        if ( has_32bit_shinfo(currd) )
+            align = alignof(compat_ulong_t);
+        else
+#endif
+            align = alignof(xen_ulong_t);
+        if ( gaddr & (align - 1) )
+            return -ENXIO;
+
+        rc = check_get_page_from_gfn(currd, _gfn(gfn), false, &p2mt, &pg);
+        if ( rc )
+            return rc;
+
+        if ( !get_page_type(pg, PGT_writable_page) )
+        {
+            put_page(pg);
+            return -EACCES;
+        }
+
+        map = __map_domain_page_global(pg);
+        if ( !map )
+        {
+            put_page_and_type(pg);
+            return -ENOMEM;
+        }
+        map += PAGE_OFFSET(gaddr);
+    }
+
+    if ( v != current )
+    {
+        if ( !spin_trylock(&currd->hypercall_deadlock_mutex) )
+        {
+            rc = -ERESTART;
+            goto unmap;
+        }
+
+        vcpu_pause(v);

AFAIU, the goal of vcpu_pause() here is to guarantee that the "area" will not be touched by another pCPU. However, looking at the function context_switch() we have:

sched_context_switched(prev, next);
_update_runstate_area();

The first function will set v->is_running to false (see vcpu_context_saved()). So I think the "area" could still be touched even after vcpu_pause() has returned.

Therefore, I think we will need _update_runstate_area() (or update_runstate_area()) to be called first.
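A hedged sketch of that reordering against the context_switch() excerpt quoted above (untested, function names as in the mail, not a real patch):

```diff
-    sched_context_switched(prev, next);
-    _update_runstate_area();
+    _update_runstate_area();            /* last write to the area while
+                                           v->is_running is still true  */
+    sched_context_switched(prev, next); /* now clears v->is_running     */
```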

+
+        spin_unlock(&currd->hypercall_deadlock_mutex);
+    }
+
+    domain_lock(currd);
+
+    if ( map )
+        populate(map, v);
+
+    SWAP(area->pg, pg);
+    SWAP(area->map, map);
+
+    domain_unlock(currd);
+
+    if ( v != current )
+        vcpu_unpause(v);
+
+ unmap:
+    if ( pg )
+    {
+        unmap_domain_page_global(map);
+        put_page_and_type(pg);
+    }
+
+    return rc;
  }
/*
@@ -1573,6 +1648,22 @@ int map_guest_area(struct vcpu *v, paddr
   */
  void unmap_guest_area(struct vcpu *v, struct guest_area *area)
  {
+    struct domain *d = v->domain;
+    void *map;
+    struct page_info *pg;

AFAIU, the assumption is that the vCPU will be paused here. Should we add an ASSERT()?

+
+    domain_lock(d);
+    map = area->map;
+    area->map = NULL;
+    pg = area->pg;
+    area->pg = NULL;
+    domain_unlock(d);
+
+    if ( pg )
+    {
+        unmap_domain_page_global(map);
+        put_page_and_type(pg);
+    }
  }
int default_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)


Cheers,

--
Julien Grall



 

