
Re: [Xen-devel] Crash in set_cpu_sibling_map() booting Xen 4.6.0 on Fusion



On Mon, Nov 23, 2015 at 09:10:08AM +0800, Chao Peng wrote:
> On Fri, Nov 20, 2015 at 05:21:11PM -0800, Ed Swierk wrote:
> > The problem is that the index of the socket_cpumask array is derived via
> > cpu_to_socket() from the APIC ID of the processor in a given socket, but
> > the size of the array is computed based on nr_sockets, which is not
> > necessarily equal to the maximum APIC ID.
> > 
> > Sizing the socket_cpumask to MAX_APICS rather than nr_sockets seems safer,
> > though a bit wasteful. I verified that this change fixes the boot crash
> > with 4 or 8 CPUs on VMware Fusion.
> > 
> > --- a/xen/arch/x86/smpboot.c
> > +++ b/xen/arch/x86/smpboot.c
> > @@ -819,7 +819,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> > 
> >      set_nr_sockets();
> > 
> > -    socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets);
> > +    socket_cpumask = xzalloc_array(cpumask_t *, MAX_APICS);
> 
> Just replacing nr_sockets with MAX_APICS does not really solve the problem.
> socket_cpumask should always stay in sync with nr_sockets; otherwise
> at least some functionality will be missing, if it does not cause a panic
> in another place.
> 
> If possible, I'd suggest debugging set_nr_sockets(); in particular,
> inspect the following two values in the panic case:
> boot_cpu_data.x86_max_cores
> boot_cpu_data.x86_num_siblings

After carefully checking the log, it looks like nr_sockets is computed
correctly in your case; it is phys_proc_id that is wrong. This could
again be caused by bad CPUID information, so you need to debug the CPU
detection code that sets phys_proc_id.
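
To illustrate the mismatch described above, here is a minimal standalone
sketch of an array sized by nr_sockets that is indexed by a socket ID
derived from phys_proc_id. It is not Xen code: struct socket_table,
socket_table_init(), socket_slot() and socket_table_free() are hypothetical
names, and calloc() only stands in for xzalloc_array(). The point is just
that a range check on the index shows whether the socket ID is sane before
the array is touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the socket_cpumask array; not Xen code. */
struct socket_table {
    unsigned int nr_sockets;  /* number of slots actually allocated */
    unsigned long **masks;    /* one pointer per socket, like cpumask_t * */
};

/* Allocate nr_sockets zeroed slots (calloc stands in for xzalloc_array). */
static bool socket_table_init(struct socket_table *t, unsigned int nr_sockets)
{
    assert(t != NULL);
    t->masks = calloc(nr_sockets, sizeof(*t->masks));
    t->nr_sockets = t->masks ? nr_sockets : 0;
    return t->masks != NULL;
}

/*
 * Return the slot for socket_id, or NULL when the ID lies outside the
 * allocated range. A NULL result here corresponds to the case where the
 * socket ID reported for a CPU does not fit the computed socket count.
 */
static unsigned long **socket_slot(struct socket_table *t,
                                   unsigned int socket_id)
{
    assert(t != NULL);
    if (socket_id >= t->nr_sockets)
        return NULL;
    return &t->masks[socket_id];
}

/* Release the table and reset its size. */
static void socket_table_free(struct socket_table *t)
{
    assert(t != NULL);
    free(t->masks);
    t->masks = NULL;
    t->nr_sockets = 0;
}
```

With two sockets allocated, socket_slot(&t, 1) returns a valid slot while
socket_slot(&t, 4) returns NULL instead of reading past the end of the
array. A similar temporary check (or a debug print of the socket ID next to
nr_sockets) at the point of the crash would confirm whether phys_proc_id is
the value that is out of range.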

Thanks,
Chao



 

