
Re: [Xen-devel] [PATCH RFC v2 1/2] linux/vnuma: vNUMA for PV domu guest



On 18/09/13 07:16, Elena Ufimtseva wrote:
> On Tue, Sep 17, 2013 at 10:10 AM, David Vrabel <david.vrabel@xxxxxxxxxx> 
> wrote:
>> On 17/09/13 09:34, Elena Ufimtseva wrote:
>>> Requests NUMA topology info from Xen by issuing a subop
>>> hypercall. Initializes NUMA nodes, sets the number of CPUs,
>>> the distance table and the NUMA nodes' memory ranges during boot.
>>> The vNUMA topology is defined by the user in the VM config file.
>>> Memory ranges are represented by struct vnuma_topology_info,
>>> where the start and end of each memory area are given as guest
>>> pfns, constructed and aligned according to the domain's e820 map.
>>> If the received structure has errors, fall back to the dummy
>>> NUMA init.
>>> Requires Xen with the vnuma patchset applied.
>>>
>>> Changes since v1:
>>> - moved the test for xen_pv_domain() into xen_numa_init;
>>> - replaced memory block search/allocation by single memblock_alloc;
>>> - moved xen_numa_init to vnuma.c from enlighten.c;
>>> - moved memblock structure to public interface memory.h;
>>> - specified signedness of vnuma topology structure members;
>>> - removed excessive debug output;
>>>
>>> TODO:
>>> - consider common interface for Dom0, HVM and PV guests to provide
>>> vNUMA topology;
>>> - dynamic NUMA balancing as of this patch (kernel 3.11,
>>> 6e4664525b1db28f8c4e1130957f70a94c19213e, with boot parameter
>>> numa_balancing=true, which is the default) crashes a NUMA-enabled
>>> guest. Investigate further.
>>
>>> --- a/arch/x86/mm/numa.c
>>> +++ b/arch/x86/mm/numa.c
>>> @@ -19,6 +19,7 @@
>>>  #include <asm/amd_nb.h>
>>
>> #include <asm/xen/vnuma.h> here...
>>
>>>  #include "numa_internal.h"
>>> +#include "asm/xen/vnuma.h"
>>
>> ... not here.
>>
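To spell that out, something like the following -- just a sketch of the
suggested include ordering, not a tested diff:

	 #include <asm/amd_nb.h>
	+#include <asm/xen/vnuma.h>

	 #include "numa_internal.h"

i.e. the angle-bracket form grouped with the other <asm/...> includes,
and nothing added after numa_internal.h.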
>>> --- /dev/null
>>> +++ b/arch/x86/xen/vnuma.c
>>> @@ -0,0 +1,92 @@
>>> +#include <linux/err.h>
>>> +#include <linux/memblock.h>
>>> +#include <xen/interface/xen.h>
>>> +#include <xen/interface/memory.h>
>>> +#include <asm/xen/interface.h>
>>> +#include <asm/xen/hypercall.h>
>>> +#include <asm/xen/vnuma.h>
>>> +#ifdef CONFIG_NUMA
>>> +/* Xen PV NUMA topology initialization */
>>> +static unsigned int xen_vnuma_init = 0;
>>> +int xen_vnuma_support()
>>> +{
>>> +     return xen_vnuma_init;
>>> +}
>>
>> I'm not sure how this and the usage in the next patch actually work.
>> xen_vnuma_init is only set after the test of numa_off prior to calling
>> xen_numa_init() which will set xen_vnuma_init.
> 
> David, it's obscure and the naming is not self-explanatory. Will fix it.
> But the idea was to make sure
> that NUMA can be safely turned on (for a domU domain, and only if the
> xen_numa_init call was successful).

I understand what it's for, I just don't see how it works.

The code path looks like (I think):

xen_vnuma_init = 0;

if (!xen_vnuma_init)
    numa_off = 1

if (!numa_off)
    xen_numa_init()

However, if you go with the idea of calling the dummy init in
xen_numa_init()'s error path, you don't need this.
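Something along these lines is what I mean -- only a rough sketch to
show the shape of the error path; xen_get_vnuma_info() is a placeholder
for however the series issues the subop hypercall, not tested code:

	static int __init xen_numa_init(void)
	{
		int rc;

		/* Placeholder for the XENMEM subop that fetches the topology. */
		rc = xen_get_vnuma_info(&numa_topo);
		if (rc)
			goto fail;

		/* ... numa_add_memblk(), node_set(), distance table ... */

		return 0;

	fail:
		/*
		 * No usable vNUMA info: fall back right here rather than
		 * returning an error, so no xen_vnuma_init flag is needed
		 * and x86_numa_init() never tries the other backends.
		 * (numa_dummy_init here stands for whatever the dummy init
		 * symbol ends up being called.)
		 */
		return numa_dummy_init();
	}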


>>> +     for (i = 0; i < numa_topo.nr_nodes; i++) {
>>> +             if (numa_add_memblk(i, varea[i].start, varea[i].end))
>>> +                     /* pass to numa_dummy_init */
>>> +                     goto vnumaout;
>>
>> If there's a failure here, numa may be partially setup.  Do you need to
>> undo any of the bits that have already setup?
> 
> Konrad asked me the same and I was under the impression it is safe. But
> that was based on assumptions
> that I would rather avoid making.  I will add bits to unset NUMA in
> case of failure.

I looked at the other users of this and none of them undo anything on
failure, so I think it is fine as is.

>>> +     if (phys)
>>> +             memblock_free(__pa(phys), mem_size);
>>> +     if (physd)
>>> +             memblock_free(__pa(physd), dist_size);
>>> +     if (physc)
>>> +             memblock_free(__pa(physc), cpu_to_node_size);
>>> +     return rc;
>>
>> If you return an error, x86_numa_init() will try to call the setup for
>> the other NUMA systems.  Consider calling numa_dummy_init() directly
>> instead and then returning success.
> 
> David, isn't that what x86_numa_init() is supposed to do? Try every
> *numa_init until one succeeds?
> Will adding an explicit call to the dummy numa init from xen_numa_init break this logic?

Yes, but if we know we're a PV guest we do not want to try any other
one; we want to fall back to the dummy init immediately.
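
For reference, the 3.11-era x86_numa_init() looks roughly like this
(from memory, and ignoring wherever the series hooks in xen_numa_init()):

	void __init x86_numa_init(void)
	{
		if (!numa_off) {
	#ifdef CONFIG_ACPI_NUMA
			if (!numa_init(x86_acpi_numa_init))
				return;
	#endif
	#ifdef CONFIG_AMD_NUMA
			if (!numa_init(amd_numa_init))
				return;
	#endif
		}
		numa_init(dummy_numa_init);	/* last resort */
	}

so a failing xen_numa_init() would let the ACPI/AMD paths run on a PV
guest, which is exactly what we want to avoid.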

David
