|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Minios-devel] [UNIKRAFT PATCH v2 4/9] plat: check for and enable extended CPU features
Hi Yuri, On 12/17/18 5:02 PM, Yuri Volchkov wrote: Florian Schmidt <florian.schmidt@xxxxxxxxx> writes:But only do this if code is compiled with SSE/AVX. Signed-off-by: Florian Schmidt <florian.schmidt@xxxxxxxxx> --- plat/common/include/x86/cpu_defs.h | 22 ++++++++++ plat/kvm/x86/entry64.S | 58 +++++++++++++++++++++---- plat/kvm/x86/setup.c | 15 ------- plat/xen/x86/entry64.S | 68 +++++++++++++++++++++++++++--- plat/xen/x86/setup.c | 15 ------- 5 files changed, 135 insertions(+), 43 deletions(-) diff --git a/plat/common/include/x86/cpu_defs.h b/plat/common/include/x86/cpu_defs.h index 9ecec967..78821b52 100644 --- a/plat/common/include/x86/cpu_defs.h +++ b/plat/common/include/x86/cpu_defs.h @@ -58,6 +58,7 @@ */ #define X86_CR0_MP (1 << 1) /* Monitor Coprocessor */ #define X86_CR0_EM (1 << 2) /* Emulation */ +#define X86_CR0_TS (1 << 2) /* Task Switched */Did you mean 1 << 3? Uh... right. It didn't surface as a bug immediately because I only use it once to reset a field (to be safe) that comes out of the initial CPU state after booting as 0 already. Gonna fix that. #define X86_CR0_NE (1 << 5) /* Numeric Exception */ #define X86_CR0_PG (1 << 31) /* Paging */@@ -67,10 +68,31 @@#define X86_CR4_PAE (1 << 5) /* enable PAE */ #define X86_CR4_OSFXSR (1 << 9) /* OS support for FXSAVE/FXRSTOR */ #define X86_CR4_OSXMMEXCPT (1 << 10) /* OS support for FP exceptions */ +#define X86_CR4_FSGSBASE (1 << 16) /* enable FSGSBASE*/ +#define X86_CR4_OSXSAVE (1 << 18) /* enable XSAVE, extended states *//** Intel CPU features in EFER */ #define X86_EFER_LME (1 << 8) /* Long mode enable (R/W) */+/* CPUID feature bits in ECX and EDX when EAX=1 */+#define X86_CPUID1_ECX_XSAVE (1 << 26) +#define X86_CPUID1_ECX_OSXSAVE (1 << 27) +#define X86_CPUID1_ECX_AVX (1 << 28) +#define X86_CPUID1_EDX_FPU (1 << 0) +#define X86_CPUID1_EDX_FXSR (1 << 24) +#define X86_CPUID1_EDX_SSE (1 << 25) +/* CPUID feature bits in EBX and ECX when EAX=7 */ECX is in charge of the sub-leaf, maybe add this in comment as you did for the next one? Will do. +#define X86_CPUID7_EBX_FSGSBASE (1 << 0) +/* CPUID feature bits when EAX=0xd, EXC=1 */Typo EXC instead of ECX Yep. +#define X86_CPUIDD1_EAX_XSAVEOPT (1<<0) + +/* + * Extended Control Register 0 (XCR0) + */ +#define X86_XCR0_X87 (1 << 0) +#define X86_XCR0_XMM (1 << 1) +#define X86_XCR0_YMM (1 << 2)As far as I understand, the last two are enabling SSE and AVX. Maybe we name them respectively, instead of XMM and YMM? I originally named them like that because they govern whether XMM and YMM registers are available and can be saved by XSAVE. But now that I double-checked, I noticed the Intel documentation names them XCR0.SSE and XCR0.AVX, so I agree we should just go with that.
We could do that and add a configuration option to unikraft that globally disables all floating-point instructions, similar to how the Linux kernel does it. In that case, we could indeed skip pretty much all this code. However, I'm a bit worried about the implications of this, because that most likely means we'll run into problems with full-fledged libc implementations such as newlib or musl. (or at least with the libm part). I think in general, this is an interesting idea, and we could check adding such a mode to unikraft, because I could think of situations where we might not need any floating point operations. However, I would rather make this a future feature patch, so that we can continue with this series that is more of a bug fix (though a pretty large one). However, now that I think about it, maybe instead of just putting a warning in the trap handler (in the last patch of the series), it would make sense to compile trap.c and the time.c files that contain the timer interrupt handler with how Linux does it: "-mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -mno-fp-ret-in-387 -mskip-rax-setup" (looking at arch/x86/Makefile). As an aside, does anyone know why it lists -mno-sse2 on top of -mno-sse? Does -mno-sse not imply -mno-sse2? And if so, why do they not also put -mno-sse-3, -mno-ssse3, ... in their CFLAGS?
Ah, these flags only set information in the extended capabilities register XCR0 (via the XSETBV that follows). These flags just signal which information is available and should be saved via the XSAVE instruction. That instruction itself was only added well after SSE, so the assumption here is that, if you have XSAVE available, then you can save SSE's XMM registers. As far as I remember (but to be honest, I can't find the information right now), XSAVE was only introduced with AVX (before that, there was FXSAVE for SSE support on CPUs with XMM registers, but no YMM registers), so setting those two flags means "XSAVE, please save both SSE and AVX registers for me", and there's no point in ever using xsave for SSE only or for AVX only. With later extensions (AVX512 especially), we get ZMM registers, and then we need to set another flag in XCR0 so that XSAVE also covers those. We don't have support for AVX512 in unikraft yet, though. One main reason is that I don't have a machine with AVX512 to test the code on. ;-) Anyways, it probably a good idea to add a compilation error in case __AVX__ is defined, but __SSE__ is not. I guess we could... However: these are flags set by the compiler based on your -mcpu / -mtune choices for CPU types, and there are no machines that have AVX but not SSE, so this seems like a very synthetic corner case where someone maliciously undefined __AVX__ and defined __SSE__ manually. + xsetbv +noavx: +noxsave: +#endif /* __AVX__ */ + /* Now, check for extended features. */ + movl $0x7, %eax + movl $0x1, %ecx + cpuid + /* ebx, ecx, edx now contain extended capabilties information. */ + /* check for and enable FS/GSBASE */ + testl $(X86_CPUID7_EBX_FSGSBASE), %ebx + jz nofsgsbase + orl $(X86_CR4_FSGSBASE), %edi + movq %rdi, %cr4 +nofsgsbase: + /* done setting up CPU capabilities *//* read multiboot info pointer */ -- Dr. Florian Schmidt フローリアン・シュミット Research Scientist, Systems and Machine Learning Group NEC Laboratories Europe Kurfürsten-Anlage 36, D-69115 Heidelberg Tel. +49 (0)6221 4342-265 Fax: +49 (0)6221 4342-155 e-mail: florian.schmidt@xxxxxxxxx ============================================================ Registered at Amtsgericht Mannheim, Germany, HRB728558 _______________________________________________ Minios-devel mailing list Minios-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/minios-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |