
Re: [Xen-devel] Xen 4.12.0-rc Hangs Around masked ExtINT on CPU#



>>> On 16.04.19 at 18:23, <jlpoole56@xxxxxxxxx> wrote:

> On 3/27/2019 7:21 AM, Jan Beulich wrote:
>>>>> On 27.03.19 at 14:25, <jlpoole56@xxxxxxxxx> wrote:
>>> On 3/27/2019 1:14 AM, Jan Beulich wrote:
>>>>>>> On 26.03.19 at 18:21, <jlpoole56@xxxxxxxxx> wrote:
>>>>> zeta /usr/local/src/xen # cat xen/.config |grep CONFIG_HVM
>>>>> # CONFIG_HVM is not set
>>>>> zeta /usr/local/src/xen #
>>>>>
>>>>> # tried 2 boot attempts
>>>>> log at: https://pastebin.com/nL4BWJ6Y 
>>>>>
>>>>> Hang points at lines:
>>>> Thanks for trying anyway; one further possibility eliminated. Looking
>>>> at the logs I've had another thought (wild guess again, so not really
>>>> much hope): Could you try "mwait-idle=no"?
>>>>
>>> I modified man_xen.cfg by adding at the end the kernel parameter:
>>>
>>> mwait-idle=no
>>>
>>> Rebooted.
>>> Result: hung:
>> Thanks. I'm afraid I'm out of ideas for the moment.
>>
>> Jan
>>
>>
> Jan,
> 
> Recall that the Xen kernel launched successfully in 2017, when I first
> built Xen on Gentoo; that was about version 4.7.1.  I had to launch it
> from an EFI console.  I've tried to revert to 4.7.1 and
> build a kernel, but I have found it too difficult because certain
> dependencies have since been removed from Gentoo.
> 
> I've been studying apic.c and the differences between 4.7.1
> and HEAD.  Here's a DIFF:
> 
> http://quickdiff.net/?unique_id=948598C4-31A2-D028-CE95-F04632C1871A 
> 
> I see that currently there is a structure:
> 
> static const struct x86_cpu_id __initconstrel deadline_match[] = {
> 
> which identifies the microarchitecture, e.g. Haswell, Skylake.
> 
> https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/apic.c;h=2a2432619e3edce2cdbc275abbd4e80ffcdcd9f0;hb=HEAD#l1146
> 
> 
> Line 1176 has a return if there is a failure to match, yet further down,
> if there is a version mismatch, there is a XENLOG_WARNING:
> 
> TSC_DEADLINE disabled due to Errata;
> please update microcode to version %#x (or later)
> 
> 
> My server is Atom-based, a Supermicro A1SAi-2750F
> https://www.supermicro.com/products/motherboard/Atom/X10/A1SAi-2750F.cfm 
> which has an Intel® Atom™ Processor C2750.
> 
> https://ark.intel.com/content/www/us/en/ark/products/77987/intel-atom-processor-c2750-4m-cache-2-40-ghz.html
> 
> I believe my CPU chip is from a 22 nanometer fabrication process and
> Wikipedia tells me that accordingly, the microarchitecture is Silvermont.
> https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures#Atom_Lines
> 
> Moreover, the Intel documentation confirms this:
> 
> "2.1.15 The Intel® Atom™ Processor Family Based on Silvermont 
> Microarchitecture (2013)
> Intel Atom Processor C2xxx, E3xxx, S1xxx series are based on the
> Silvermont microarchitecture. Processors based on the Silvermont
> microarchitecture supports instruction set extensions up to and
> including SSE4.2, AESNI, and PCLMULQDQ."  from page 2-5 of the
> Software Developer’s Manual (below).
> 
> I'm wondering whether the fact that I was able to boot a kernel under
> Xen 4.7.1, and the unexplained hanging at boot for 4.12, is related to
> the fact that the Silvermont architecture is not accounted for in the
> deadline_match structure.
> 
> Sheet 784-5 lists bit 24, "TSC-Deadline", with this description:
> "A value of 1 indicates that the processor’s local APIC timer supports
> one-shot operation using a TSC deadline value."
> 
> from the Intel® 64 and IA-32 Architectures
> Software Developer’s Manual
> Combined Volumes:
> 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4
> 
> at:
> https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
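
As an illustration of the bit quoted above: the SDM reports TSC-Deadline
in CPUID leaf 01H, register ECX, bit 24. A minimal sketch of testing it
follows. The helper names are hypothetical, and the host query assumes
GCC or Clang on x86, where <cpuid.h> provides __get_cpuid(). A guest or
dom0 kernel may see a filtered CPUID, so the result there need not match
what the hypervisor itself sees.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 24 is the TSC-Deadline capability flag. */
#define CPUID1_ECX_TSC_DEADLINE (UINT32_C(1) << 24)

/* Pure helper (hypothetical name): test the flag in a given ECX value. */
static bool ecx_has_tsc_deadline(uint32_t ecx)
{
    return (ecx & CPUID1_ECX_TSC_DEADLINE) != 0;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU: 1 if the flag is set, 0 if it is clear,
 * -1 if CPUID leaf 1 is not available. */
static int host_has_tsc_deadline(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return ecx_has_tsc_deadline(ecx) ? 1 : 0;
}
#endif
```

The flag only says that the local APIC timer supports TSC-deadline mode;
it says nothing about whether a particular erratum applies to the part.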
> 
> Is any of the above bird-dogging helpful, or does it cause you to have
> an "Aha!" moment?

Not really, no. It's not clear to me where the TSC deadline
connection comes from, but if you suspect something then it
would be helpful if you pointed out the respective erratum
for the specific CPU model you use, or if you simply
suppressed use of the deadline timer by using the respective
command line option ("tdt=0").

Beyond that I'm afraid it'll take someone else to have an idea,
or someone to be able to actually debug the issue on a system
where the issue surfaces.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

