
Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs

On 07/02/18 15:06, Jan Beulich wrote:
>>>> On 07.02.18 at 14:24, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 07/02/18 13:08, Jan Beulich wrote:
>>>>>> On 07.02.18 at 14:01, <igor.druzhinin@xxxxxxxxxx> wrote:
>>>> So far the issue is confirmed on:
>>>> Dell PowerEdge R740, Huawei systems based on Xeon Gold 6152 (the one
>>>> that it was tested on), Intel S2600XX, etc.
>>>> Also see:
>>>> https://bugs.xenserver.org/browse/XSO-774 
>>>> Well, no-watchdog is what we currently recommend in that case but we
>>>> hoped there is a general solution here from Xen side. You have your
>>>> point that they should fix this on their side because it's their fault
>>>> indeed. But the user experience is also important for us I think.
>>> Of course, hence the suggestion of possible alternative workarounds.
>>> Impacting everyone is, as said, not a desirable approach in a case
>>> like this one. I also continue to dislike the seemingly random division
>>> by 10.
>> Xen's usability is crap, which is in large part due to attitude like
>> this.  It is not ok to expect the end user to know how to diagnose/debug issues
>> like this, and it is entirely unreasonable to expect the end user to
>> have to manually work around it.
> Excuse me? The watchdog is off by default. Anyone turning it on
> ought to know what they do. You (iirc) turning it on unilaterally in
> XenServer puts the burden of avoiding users to have to diagnose
> the issue on you.

And we have taken the burden of diagnosing the issue, as well as
proposing a fix.

>> This particular issue does want feeding back to Intel so they can try
>> and fix it, but whatever is wrong is present in a large amount of
>> Skylake systems in the field.  Xen needs to be able to cope.
> But in a reasonable way.
>> Finally, as to boot times, your argument is backwards seeing as you care
>> about elapsed boot time.  Slowing the frequency will speed everything
>> up, as we aren't executing a large chunk of the BSP boot path with 100Hz
>> NMI constantly interrupting.
> How long does handling a single NMI take? Microseconds, I assume.
> Contrast this with waiting for two NMIs to arrive in wait_for_nmis(),
> which goes up from 20ms to 200ms with this change.

So your argument is to not change the frequency because an
off-by-default option will *in the best case* add a few hundred
milliseconds extra to the boot time?  Times to boot computers are
measured in minutes, not milliseconds.
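
For reference, the 20ms and 200ms figures quoted above follow directly from
the NMI period. This is a minimal sketch of that arithmetic, not Xen code:
it assumes the watchdog fires at a fixed rate and that observing two NMIs
takes two full periods. The function names are made up for illustration.

```python
def nmi_period_ms(freq_hz):
    """Interval between watchdog NMIs, in milliseconds."""
    return 1000.0 / freq_hz


def wait_for_two_nmis_ms(freq_hz):
    """Time to observe two NMIs, assuming two full periods must elapse."""
    return 2 * nmi_period_ms(freq_hz)


for hz in (100, 10):
    # 100 Hz gives a 10 ms period, 10 Hz gives a 100 ms period.
    print(f"{hz} Hz: {wait_for_two_nmis_ms(hz):.0f} ms")
```

With the current 100Hz rate this gives 20 ms, and with the proposed 10Hz
rate 200 ms, i.e. a difference of 180 ms.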

I don't know how long servicing an NMI takes, but with a minimum of a rdmsr,
a wrmsr and then a further MMIO write or wrmsr, I doubt it is that quick.

> Also you completely ignore my argument against the seemingly
> random division by 10, including the resulting question of what you
> mean to do once 10Hz also turns out too high a frequency.

We've got to pick a frequency.  The current 100Hz is just as arbitrary
as the proposed new 10Hz.

> I wouldn't, btw, mind an attempt to avoid the high rate NMIs
> during early boot (if those occur in the first place, which from
> two successive replies by Igor yesterday I wasn't sure anymore
> is an actual fact), but that's independent of the issue at hand.

The 100Hz NMI is active from BSP APIC init, IO-APIC, deadline timer
calibration, mwait idle, the entirety of HVM setup and full AP bringup. 
On one of my fastest boxes, it is about 1 second of wallclock time.
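
To put rough numbers on that window, here is a small sketch counting how
many NMIs land in it at each rate. The 1 second window is the figure given
above; the per-NMI cost is a placeholder assumption only, since nobody in
this thread has measured it, and the function names are hypothetical.

```python
BOOT_WINDOW_S = 1.0   # approximate wallclock time quoted above
PER_NMI_US = 5.0      # ASSUMPTION: placeholder cost, not a measurement


def nmis_in_window(freq_hz, window_s):
    """Number of watchdog NMIs delivered during the window."""
    return round(freq_hz * window_s)


def total_overhead_ms(freq_hz, window_s, per_nmi_us):
    """Total time spent servicing NMIs in the window, in milliseconds."""
    return nmis_in_window(freq_hz, window_s) * per_nmi_us / 1000.0


for hz in (100, 10):
    count = nmis_in_window(hz, BOOT_WINDOW_S)
    cost = total_overhead_ms(hz, BOOT_WINDOW_S, PER_NMI_US)
    print(f"{hz} Hz: {count} NMIs, about {cost} ms at {PER_NMI_US} us each")
```

Plugging in a measured per-NMI cost in place of the placeholder would show
how the servicing overhead compares with the longer wait for two NMIs.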


Xen-devel mailing list


