Re: [Xen-devel] [PATCH] x86/nmi: lower initial watchdog frequency to avoid boot hangs
>>>> If the actual SMI source is not related to some place in the NMI
>>>> handler code but was eg. due to some SMI timer, lowering NMI
>>>> watchdog frequency might not fix the issue completely, but lower
>>>> its reproducibility (perhaps to some very rare occurrences). So
>>>> it's better be sure what was the real source of SMI.
>>>>
>>>
>>> This *is* related to this instruction - it was confirmed
>>> empirically. Removing this instruction stops SMIs from occurring
>>> and effectively removes the issue leaving the frequency unchanged.
>>
>> Hmm, it would be interesting to know for what evil purpose does it
>> need to trap I/O port 61h.
>> BTW, on which motherboard model the issue was reproduced?
>>
>
>The issue has been reported for some Dell/Huawei Skylake platforms (one
>of them PowerEdge R740 to be precise) but I don't think the others are
>unaffected (the issue supposedly originates from Intel's reference
>code)
>- the default BIOS setup indeed matters.

Here is a bit of info you might find useful.

I did some quick research on my test system (Gigabyte GA-H270M-D3H) to
confirm whether the BIOS traps I/O port 61h (NMI status) and for what
purpose.

Well, it turns out it really does. Moreover, it's actually the only
fixed I/O port location trapped by SMI I/O traps on this system. A few
others are simply 'allocated' ones, meaning the real I/O port address
being trapped is chosen dynamically by supplying Address=0 to the
corresponding call to the EFI I/O Trap interface function. Such I/O
traps may be used as interfaces to an SMI handler in a manner similar
to the SW SMI interface.

The EFI module responsible for installing the port 61h SMI I/O trap is
PchInitSmm in my case. The related code is:

    ...
    mov     eax, 61h
    lea     r9, qword_5778
    mov     [rsp+98h+io_trap_ctx.io_address], ax
    mov     rax, cs:pIoTrapIF
    lea     r8, [rsp+98h+io_trap_ctx]
    lea     rdx, Port61h_IoTrapHandler
    mov     rcx, rax
    mov     [rsp+98h+io_trap_ctx.trap_type], ebp ; trap reads
    mov     [rsp+98h+io_trap_ctx.io_len], bp     ; ebp=1
    call    qword ptr [rax]
    ...

The actual handler (named Port61h_IoTrapHandler in the above code) is
fairly lightweight and does a bit of useless black magic.

First, there is a loop over all CPUs that finds the CPU whose read of
the NMI status port triggered the trap. Then the handler reads the
original port 61h value and sets NMI_SC bit 4 to the inverse of the
previous state recorded for that CPU's bit. The updated AL register
value is then returned to the code that read NMI_SC (by patching the
RAX register value in the SMRAM saved state):

    ; ebp = 61h, rbx = CPU index
    ...
    mov     edx, ebp
    in      al, dx
    mov     r8, cs:bmNmiRefTogglesForCpus
    mov     rcx, rbx
    mov     edx, 1
    shl     edx, cl                        ; edx = 1 << CPU index
    mov     r9, rbx
    movsxd  rcx, edx
    mov     dl, al
    and     al, 0EFh                       ; al = value with bit 4 cleared
    xor     r8, rcx                        ; flip this CPU's bit
    or      dl, 10h                        ; dl = value with bit 4 set
    mov     cs:bmNmiRefTogglesForCpus, r8
    and     r8, rcx
    movzx   ecx, al
    movzx   eax, dl
    test    r8, r8
    mov     edx, 1
    cmovnz  ecx, eax                       ; CPU's bit now set -> return bit 4 set
    lea     rax, [rsp+58h+al_to_return]
    lea     r8d, [rdx+25h] ; EFI_SMM_SAVE_STATE_REGISTER_RAX
    mov     [rsp+58h+func_arg0], rax
    mov     rax, cs:pEFI_SMM_CPU_PROTOCOL_GUID_IF
    mov     [rsp+58h+al_to_return], cl
    mov     rcx, rax
    call    qword ptr [rax+8] ; WriteSaveState
    ...

So, the only purpose of this stuff is to emulate the REF_TOGGLE bit
toggling logic (simply by alternating ones and zeros on each NMI_SC
read), nothing more. It is a sort of workaround for some legacy code
which depends on REF_TOGGLE toggling (the bit is now marked Reserved in
the docs).

On this particular system the SMI I/O trap for port 61h neither does
anything time-consuming nor anything really useful. That Dell system
must have something similar (thanks to the common EFI reference code
from Intel that Igor mentioned), which leaves the question of why
reading port 61h is so slow there.
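For those who would rather not trace the disassembly, the handler logic described above can be sketched in C roughly as follows. This is only an illustration, not the firmware source: the names emulate_nmi_sc_read, ref_toggle_state and NMI_SC_REF_TOGGLE are made up, the raw port value and the CPU index are passed in as arguments instead of coming from the IN instruction and the CPU loop, and the result is simply returned instead of being written through WriteSaveState. The sketch also uses a 64-bit mask, whereas the original shifts a 32-bit register and sign-extends it.

```c
#include <stdint.h>

/* Bit 4 of port 61h (NMI_SC); hypothetical macro name. */
#define NMI_SC_REF_TOGGLE 0x10u

/* One bit per CPU, playing the role of bmNmiRefTogglesForCpus. */
static uint64_t ref_toggle_state;

/*
 * raw: value actually read from port 61h
 * cpu: index of the CPU whose read was trapped (assumed < 64 here)
 *
 * Flips the per-CPU bit, then reports the new state of that bit in
 * bit 4 of the returned value; all other bits of raw pass through.
 */
static uint8_t emulate_nmi_sc_read(uint8_t raw, unsigned int cpu)
{
    uint64_t mask = (uint64_t)1 << cpu;

    ref_toggle_state ^= mask;

    if (ref_toggle_state & mask)
        return (uint8_t)(raw | NMI_SC_REF_TOGGLE);

    return (uint8_t)(raw & (uint8_t)~NMI_SC_REF_TOGGLE);
}
```

Calling emulate_nmi_sc_read() repeatedly for the same CPU gives bit 4 alternating 1, 0, 1, ... regardless of what the hardware reports in that bit, and each CPU keeps its own sequence.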
_______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel