[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/2] xen/common: Add NUMA node id bounds check to page_alloc.c/node_to_scrub



On 9/26/23 3:54 PM, Andrew Cooper wrote:
> On 26/09/2023 7:54 pm, Shawn Anastasio wrote:
>> When building for Power with CONFIG_DEBUG unset, a compiler error gets
>> raised inside page_alloc.c's node_to_scrub function:
>>
>> common/page_alloc.c: In function 'node_to_scrub.part.0':
>> common/page_alloc.c:1217:29: error: array subscript 1 is above array
>>             bounds of 'long unsigned int[1]' [-Werror=array-bounds]
>>  1217 |         if ( node_need_scrub[node] )
>>
>> It appears that this is a false positive, given that in practice
>> cycle_node should never return a node ID >= MAX_NUMNODES as long as the
>> architecture's node_online_map is properly defined and initialized, so
>> this additional bounds check is only to satisfy GCC.
> 
> If GCC thinks it's got an index of 1 here, then it thinks it's proved
> that a 1 gets passed.  But the fact that cycle_node() really can return
> 1 if one variable gets tweaked in memory means that logic derived on
> this property is broken.
> 
> But we've got even more basic problems to fix first.
> 
> xen$ git grep -B1 '__read_mostly node_online_map'
> arch/arm/smpboot.c-45-/* Fake one node for now. See also asm/numa.h */
> arch/arm/smpboot.c:46:nodemask_t __read_mostly node_online_map = { { [0]
> = 1UL } };
> arch/ppc/stubs.c-25-
> arch/ppc/stubs.c:26:nodemask_t __read_mostly node_online_map = { { [0] =
> 1UL } };
> common/numa.c-44-
> common/numa.c:45:nodemask_t __read_mostly node_online_map = { { [0] =
> 1UL } };
> 
> There are 3 identical definitions of node_online_map, one for the
> architecture which really supports NUMA, and two for the stubs which don't.
> 
> And the bug here is that code outside of CONFIG_NUMA assumes NUMA is
> valid, including several places in page_alloc.c,
> domain_set_node_affinity(), credit1 and sysctl.  XEN_SYSCTL_numainfo
> even has a bigger sin of using a static MAX_NUMNODES array when it
> doesn't need an array in the first place.
> 

Agreed. It seems like the current state of affairs resulted from hacks
in Arm to work around NUMA assumptions in common instead of properly
making common support !CONFIG_NUMA. And now ppc is inheriting some of
that baggage.

> It's rude for xen/nodemask.h to short circuit some of the operations on
> node_online_map based on MAX_NUMNODES but not others.
> 
> If nothing else, the fallback for "not really NUMA" needs providing by
> the common code and not by the arch stubs.  When that is sorted, we
> might have more consistent behaviour to investigate.
>

I'll leave these more sweeping changes up to someone with a bit more
familiarity with this code for now. In the meantime, this patch at
should introduce no functional change to the status quo and allows ppc
to build in the interim.

> ~Andrew

Thanks,
Shawn



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.