
Re: [Xen-devel] Crash in set_cpu_sibling_map() booting Xen 4.6.0 on Fusion

>>> On 23.11.15 at 17:36, <eswierk@xxxxxxxxxxxxxxxxxx> wrote:
> I instrumented detect_extended_topology() and ran again with 4 CPUs.
> (XEN) smp_store_cpu_info id=3
> (XEN) detect_extended_topology cpuid_count op=0xb count=0 eax=0x0 ebx=0x1 ecx=0x100 edx=0x6
> (XEN) detect_extended_topology initial_apicid=6 core_plus_mask_width=0 core_level_siblings=1
> (XEN) detect_extended_topology cpuid_count op=0xb count=1 eax=0x0 ebx=0x1 ecx=0x201 edx=0x6
> (XEN) detect_extended_topology ht_mask_width=0 core_plus_mask_width=0 core_select_mask=0x0 core_level_siblings=1
> If cpuid 0xb returned 1 rather than 0 in eax[4:0], we would get
> consecutively-numbered physical processor IDs.
> But the only requirement I see in the IA SDM (vol 2A, table 3-17) is that
> the eax[4:0] value yield unique IDs, not necessarily consecutive. Likewise
> while the examples in vol 3A sec 8.9 show physical IDs numbered
> consecutively, the algorithms do not assume this is the case.

Indeed, and I think I had said so. The algorithm does, however, tell
us that with the above output CPU 3 (APIC ID 6) is on socket 6 (both
shifts being zero), which for the whole system results in sockets 1,
3, and 5 unused. While this is not explicitly excluded, I'm not sure how
far we should go in accommodating all kinds of odd configurations (along
those lines we have, for example, a limit on the largest APIC ID we
allow: MAX_APICS / MAX_LOCAL_APIC, which for big systems is 4 times the
number of CPUs we support).
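
To make the arithmetic concrete, here is a minimal sketch (not the Xen
code; demo_socket_from_apicid() is a made-up name, and APIC IDs 0, 2, 4,
6 for the four CPUs are assumed from the "sockets 1, 3, and 5 unused"
observation). It only shows that the socket number is the APIC ID with
the core-plus-thread bits shifted out:

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: derive the socket
 * (package) number by shifting the SMT and core bits out of the
 * APIC ID, as the SDM algorithm describes.
 */
uint32_t demo_socket_from_apicid(uint32_t apicid,
                                 unsigned int core_plus_mask_width)
{
    /* A shift by 32 or more would be undefined; treat it as "no package bits". */
    if (core_plus_mask_width >= 32)
        return 0;

    return apicid >> core_plus_mask_width;
}
```

With a width of 0 (what the log shows) APIC ID 6 maps to socket 6;
with a width of 1 it would map to socket 3, and the assumed IDs
0, 2, 4, 6 would yield the consecutive sockets 0 to 3.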

Taking it to set_nr_sockets(), a pretty basic assumption is broken by
the above way of presenting topology: We would have to have more
sockets than there are CPUs. I would have wanted to check what
e.g. Linux does here, but there doesn't seem to be any support for
CAT (and hence any need for per-socket data) there.

(I am, btw, now also confused by you saying that e.g. for a 3-CPU
config things work. If the topology data gets presented in similar
ways in that case, I can't see why you wouldn't run into the same
problem. Unless memory corruption occurs silently in one case, but
"loudly" in the other.)

Bottom line - for the moment I do not see a reasonable way of
dealing with that situation. The closest I can see are three options:
bring back what we iirc had temporarily during the review cycles of the
initial CAT series, namely a command line option to specify the number
of sockets; make all accesses to socket_cpumask[] conditional upon PSR
being enabled (which would have the bad side effect of making future
uses for other purposes more cumbersome); or go through and range
check the socket number on all of those accesses.
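
As a rough illustration of the range checking idea only (a standalone
sketch, not a patch; demo_socket_mask[], DEMO_NR_SOCKETS and
demo_socket_add_cpu() are invented stand-ins for socket_cpumask[],
nr_sockets and the real accessors, and a plain unsigned long stands in
for a cpumask):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a fixed number of sockets, one bitmask each. */
#define DEMO_NR_SOCKETS 4u
unsigned long demo_socket_mask[DEMO_NR_SOCKETS];

/*
 * Record that "cpu" belongs to "socket", refusing socket numbers the
 * array was not sized for instead of writing past its end.
 */
bool demo_socket_add_cpu(unsigned int socket, unsigned int cpu)
{
    if (socket >= DEMO_NR_SOCKETS)
        return false;               /* out-of-range socket: reject */

    if (cpu >= sizeof(unsigned long) * CHAR_BIT)
        return false;               /* does not fit the demo bitmask */

    demo_socket_mask[socket] |= 1UL << cpu;
    return true;
}
```

With four CPUs and an array of four entries, a CPU reporting socket 6
would be rejected here rather than corrupting memory; what the caller
should then do with such a CPU is a separate question.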

Chao, could you - inside Intel - please check whether there are
any assumptions on the respective CPUID leaf output that aren't
explicitly stated in the SDM right now (like the values resulting in
contiguous socket numbers), and, if there are any, ask for them to be
made explicit? Alternatively, the SDM could state explicitly that no
assumptions at all are to be made on the presented values (in which
case we'd have to consume MADT parsing data in set_nr_sockets(), e.g.
by replacing num_processors there with one more than the maximum APIC
ID of any non-disabled CPU).
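
A sketch of that MADT-based alternative, under stated assumptions:
struct demo_madt_cpu, demo_cpus[] and demo_nr_sockets_from_madt() are
invented for illustration, and I am assuming the socket count is
obtained by dividing a CPU count by the number of threads per socket,
rounding up. This is not what set_nr_sockets() currently looks like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one MADT local APIC entry. */
struct demo_madt_cpu {
    uint32_t apic_id;
    bool enabled;
};

/* Assumed example table: four enabled CPUs plus one disabled entry. */
const struct demo_madt_cpu demo_cpus[] = {
    { 0, true }, { 2, true }, { 4, true }, { 6, true }, { 8, false },
};

/*
 * Size the socket count from "maximum APIC ID of any non-disabled CPU,
 * plus one" instead of from the number of processors found.
 */
unsigned int demo_nr_sockets_from_madt(const struct demo_madt_cpu *cpus,
                                       size_t n,
                                       unsigned int threads_per_socket)
{
    uint64_t max_id = 0;
    bool any = false;

    if (threads_per_socket == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].enabled)
            continue;               /* disabled entries do not count */
        if (!any || cpus[i].apic_id > max_id)
            max_id = cpus[i].apic_id;
        any = true;
    }

    if (!any)
        return 0;

    /* Round up (max_id + 1) / threads_per_socket. */
    return (unsigned int)((max_id + 1 + threads_per_socket - 1) /
                          threads_per_socket);
}
```

For the assumed table and one thread per socket this gives 7, so socket
6 would be in range, at the price of allocating for sockets that do not
exist.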

