|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v2 1/3] xen/device-tree: Parse 'cpu-map' node for CPU topology exploration
Hello,
> >> As to the size of the array, it's not quite clear to me whether by doing it
> >> this way (instead of using nr_cpu_ids) we're not setting ourselves up for
> >> trouble.
> >
> > On ARM64 Xen, nr_cpu_ids represents the total number of populated/available
> > CPUs, but unfortunately it cannot be relied upon as the maximum CPU ID.
> >
> > For instance, if a CPU node in the Device Tree has an invalid
> > 'enable-method'
> > property, that CPU ID slot is still consumed during the initial parsing, but
> > the CPU is not counted towards nr_cpu_ids. This can result in a sparse CPU
> > ID
> > allocation where the maximum CPU ID actually exceeds.
> >
> > If we were to use nr_cpu_ids as the array size here, we would risk an
> > out-of-bounds access under such faulty Device Tree configurations. This is
> > why I used "cpumask_last(&cpu_possible_map) + 1U" to ensure the array is
> > large enough to cover the highest allocated CPU ID.
> >
> > Consequently, there might actually be potential bugs in other parts of Xen
> > where nr_cpu_ids is incorrectly assumed to be the upper bound for CPU ID
> > indexing on ARM.
>
> Specifically cpumask_var_t allocations are dimensioned by nr_cpu_ids, and
> all cpumask{,_var}_t accesses (including the cpumask_last() you use above)
> also have bounds checks against nr_cpu_ids (sometimes only in debug builds).
> IOW if there is an issue as you describe it, and if that can happen in
> practice, then this urgently needs fixing on the Arm side. This cannot be an
> excuse to not do the sane thing here.
Okay, I will use 'nr_cpu_ids' as the array size.
And I have to fix the following code.
ARM Xen may possibly create sparse 'cpu_possible_map' and calculates
nr_cpu_ids from the number of bits in it.
xen/common/cpu.c:
unsigned int __read_mostly nr_cpu_ids = NR_CPUS;
xen/arch/arm/setup.c:
void asmlinkage __init noreturn start_xen(unsigned long fdt_paddr)
{
:
smp_init_cpus();
nr_cpu_ids = smp_get_max_cpus();
printk(XENLOG_INFO "SMP: Allowing %u CPUs\n", nr_cpu_ids);
:
}
xen/arch/arm/smpboot.c:
/* maxcpus: maximum number of CPUs to activate. */
static unsigned int __initdata max_cpus;
integer_param("maxcpus", max_cpus);
unsigned int __init smp_get_max_cpus(void)
{
unsigned int i, cpus = 0;
if ( ( !max_cpus ) || ( max_cpus > nr_cpu_ids ) )
max_cpus = nr_cpu_ids;
for ( i = 0; i < max_cpus; i++ )
if ( cpu_possible(i) )
cpus++;
return cpus;
}
static void __init dt_smp_init_cpus(void)
{
:
:
for ( i = 0; i < cpuidx; i++ )
{
if ( tmp_map[i] == MPIDR_INVALID )
continue;
cpumask_set_cpu(i, &cpu_possible_map);
cpu_logical_map(i) = tmp_map[i];
}
}
> >>> + if ( !cpu_topology )
> >>> + panic("Failed to allocate memory for cpu_topology array\n");
> >>
> >> I question such uses of panic(): Surely we can do without any NUMA info,
> >> it's only performance which is going to suffer.
> >
> > Okay, I will replace the panic() with a XENLOG_WARNING printk.
>
> Which of course you understand isn't all that needs changing then.
Yes, I will ensure that the rest of the code handles a NULL 'cpu_topology'
Pointer.
Hirokazu Takahashi.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |