
Re: [Xen-devel] [PATCH v4 02/21] xen: make two memory hypercalls vNUMA-aware



On Fri, Jan 23, 2015 at 03:43:07PM +0000, Wei Liu wrote:
> On Fri, Jan 23, 2015 at 03:37:51PM +0000, Jan Beulich wrote:
> > >>> On 23.01.15 at 15:46, <wei.liu2@xxxxxxxxxx> wrote:
> > > On Fri, Jan 23, 2015 at 01:16:19PM +0000, Jan Beulich wrote:
> > >> >>> On 23.01.15 at 12:13, <wei.liu2@xxxxxxxxxx> wrote:
> > >> > Make XENMEM_increase_reservation and XENMEM_populate_physmap
> > >> > vNUMA-aware.
> > >> > 
> > >> > That is, if guest requests Xen to allocate memory for specific vnode,
> > >> > Xen can translate vnode to pnode using vNUMA information of that guest.
> > >> > 
> > >> > XENMEMF_vnode is introduced for the guest to mark the node number is in
> > >> > fact virtual node number and should be translated by Xen.
> > >> > 
> > >> > XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is
> > >> > able to translate virtual node to physical node.
> > >> > 
> > >> > Signed-off-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> > >> > Acked-by: Jan Beulich <JBeulich@xxxxxxxx>
> > >> 
> > >> I'm afraid there's another change needed for this to hold:
> > >> 
> > >> > --- a/xen/common/memory.c
> > >> > +++ b/xen/common/memory.c
> > >> > @@ -692,6 +692,50 @@ out:
> > >> >      return rc;
> > >> >  }
> > >> >  
> > >> > +static int translate_vnode_to_pnode(struct domain *d,
> > >> > +                                    struct xen_memory_reservation *r,
> > >> > +                                    struct memop_args *a)
> > >> > +{
> > >> > +    int rc = 0;
> > >> > +    unsigned int vnode, pnode;
> > >> > +
> > >> > +    /*
> > >> > +     * Note: we don't strictly require non-supported bits set to zero,
> > >> > +     * so we may have exact_vnode bit set for old guests that don't
> > >> > +     * support vNUMA.
> > >> > +     *
> > >> > +     * To distinguish spurious vnode request v.s. real one, check if
> > >> > +     * d->vnuma exists.
> > >> > +     */
> > >> > +    if ( r->mem_flags & XENMEMF_vnode )
> > >> > +    {
> > >> > +        read_lock(&d->vnuma_rwlock);
> > >> > +        if ( d->vnuma )
> > >> 
> > >> if r->mem_flags has XENMEMF_vnode set but d->vnuma is NULL,
> > >> you need to clear the node from the flags.
> > >> 
> > > 
> > > As said in the comment, we don't seem to enforce non-supported bits set
> > > to zero (IIRC you told me that). So an old guest that sets XENMEMF_vnode
> > > by accident will get its other flags cleared if I follow your suggestion.
> > 
> > Which is an acceptable thing to do I think - they called for
> > undefined behavior, and they now get unexpected behavior.
> > Mistaking the virtual node specified for a physical one is certainly
> > less desirable.
> > 
> 
> OK, thanks for clarification.
> 

So the translation logic now looks like this. Taking into account the
second point about how the exact_node flag should be enforced, I think
that flag should be preserved if it was requested in the first place:

+static int translate_vnode_to_pnode(struct domain *d,
+                                    struct xen_memory_reservation *r,
+                                    struct memop_args *a)
+{
+    int rc = 0;
+    unsigned int vnode, pnode;
+
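+    if ( r->mem_flags & XENMEMF_vnode )
+    {
+        /*
+         * The node in r->mem_flags is a virtual one: drop it and the
+         * exact-node request. Both are set again below only if the
+         * translation yields a physical node.
+         */
+        a->memflags &= ~MEMF_node(XENMEMF_get_node(r->mem_flags));
+        a->memflags &= ~MEMF_exact_node;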
+
+        read_lock(&d->vnuma_rwlock);
+        if ( d->vnuma )
+        {
+            vnode = XENMEMF_get_node(r->mem_flags);
+
+            if ( vnode < d->vnuma->nr_vnodes )
+            {
+                pnode = d->vnuma->vnode_to_pnode[vnode];
+
+                if ( pnode != NUMA_NO_NODE )
+                {
+                    a->memflags |= MEMF_node(pnode);
+                    if ( r->mem_flags & XENMEMF_exact_node_request )
+                        a->memflags |= MEMF_exact_node;
+                }
+            }
+            else
+                rc = -EINVAL;
+        }
+        read_unlock(&d->vnuma_rwlock);
+    }
+
+    return rc;
+}
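
For anyone who wants to poke at the control flow outside the hypervisor,
here is a minimal standalone sketch of the same decision logic. It is not
the patch itself: every SK_* name, the sk_vnuma struct, the flag encodings
and the demo table are stand-ins invented for this illustration, the
vnuma_rwlock is left out, and the node field is cleared with a whole-field
mask rather than with ~MEMF_node(...) as in the patch above.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in request-flag encodings (assumptions made for this sketch). */
#define SK_NO_NODE        0xFFu
#define SK_F_EXACT        (1u << 17)
#define SK_F_VNODE        (1u << 18)
#define SK_F_NODE(n)      ((((n) + 1u) & 0xFFu) << 8)
#define SK_F_GET_NODE(f)  ((((f) >> 8) - 1u) & 0xFFu)

/* Stand-in allocator-flag encodings (also assumptions). */
#define SK_M_EXACT        (1u << 4)
#define SK_M_NODE(n)      ((((n) + 1u) & 0xFFu) << 8)
#define SK_M_NODE_MASK    (0xFFu << 8)

/* Hypothetical, stripped-down view of a domain's vNUMA information. */
struct sk_vnuma {
    unsigned int nr_vnodes;
    const unsigned int *vnode_to_pnode;
};

/* Demo topology: vnode 0 maps to pnode 1, vnode 1 has no physical node. */
static const unsigned int sk_demo_map[] = { 1u, SK_NO_NODE };
static const struct sk_vnuma sk_demo_vnuma = { 2u, sk_demo_map };

/*
 * Translate the virtual node in req_flags into a physical node in
 * *memflags.  Returns 0 on success and -EINVAL for an out-of-range vnode.
 */
static int sk_translate(const struct sk_vnuma *vnuma,
                        unsigned int req_flags, unsigned int *memflags)
{
    unsigned int vnode, pnode;

    if (!(req_flags & SK_F_VNODE))
        return 0;               /* physical node request: leave untouched */

    /* Never let a virtual node number be mistaken for a physical one. */
    *memflags &= ~(SK_M_NODE_MASK | SK_M_EXACT);

    if (vnuma == NULL)
        return 0;               /* spurious flag from a non-vNUMA guest */

    vnode = SK_F_GET_NODE(req_flags);
    if (vnode >= vnuma->nr_vnodes)
        return -EINVAL;

    pnode = vnuma->vnode_to_pnode[vnode];
    if (pnode != SK_NO_NODE) {
        *memflags |= SK_M_NODE(pnode);
        /* Keep the exact-node requirement only if it was asked for. */
        if (req_flags & SK_F_EXACT)
            *memflags |= SK_M_EXACT;
    }

    return 0;
}

/*
 * Convenience wrapper for experiments: builds a vnode request, seeds
 * memflags as if the caller had already copied the raw node number, and
 * returns the resulting memflags, or a negative errno value on failure.
 */
static long sk_try(const struct sk_vnuma *vnuma, unsigned int vnode, int exact)
{
    unsigned int req = SK_F_VNODE | SK_F_NODE(vnode) |
                       (exact ? SK_F_EXACT : 0u);
    unsigned int memflags = SK_M_NODE(vnode) | (exact ? SK_M_EXACT : 0u);
    int rc = sk_translate(vnuma, req, &memflags);

    return rc ? (long)rc : (long)memflags;
}
```

With the demo table, sk_try(&sk_demo_vnuma, 0, 1) ends up with physical
node 1 plus the exact flag, while passing NULL for the vnuma pointer
models the "old guest set the bit by accident" case and leaves both the
node and the exact flag cleared.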

> Wei.
> 
> > Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

