[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86: drop REX64_PREFIX
On 17.07.2024 15:05, Andrew Cooper wrote: > On 17/07/2024 1:40 pm, Jan Beulich wrote: >> While we didn't copy the full Linux commentary, Linux commit >> 7180d4fb8308 ("x86_64: Fix 64bit FXSAVE encoding") is quite explicit >> about gas 2.16 supporting FXSAVEQ / FXRSTORQ. As that's presently our >> minimal required version, we can drop the workaround that was needed for >> yet for older gas. >> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > > Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> Thanks. > It's especially nice to get rid of the __sun__ block, although having > looked through this, ... > >> --- a/xen/arch/x86/i387.c >> +++ b/xen/arch/x86/i387.c >> @@ -64,13 +64,12 @@ static inline void fpu_fxrstor(struct vc >> { >> default: >> asm volatile ( >> - /* See below for why the operands/constraints are this way. */ >> - "1: " REX64_PREFIX "fxrstor (%2)\n" >> + "1: fxrstorq %0\n" >> ".section .fixup,\"ax\" \n" >> "2: push %%"__OP"ax \n" >> " push %%"__OP"cx \n" >> " push %%"__OP"di \n" >> - " mov %2,%%"__OP"di \n" >> + " lea %0,%%"__OP"di \n" >> " mov %1,%%ecx \n" >> " xor %%eax,%%eax \n" >> " rep ; stosl \n" >> @@ -81,7 +80,7 @@ static inline void fpu_fxrstor(struct vc >> ".previous \n" >> _ASM_EXTABLE(1b, 2b) > > ... the internals of the fixup section leave a lot to be desired. > > My build happens to have emitted lea (%rdi), %rdi for this. Yeah, that was supposed to be happening somewhere. I saw %rax and %rdx once each, and using LEA there is still kind of a waste. > A better option than opencoding memset() would be to simply return > -EIO/-EFAULT like we do from *msr_safe(), writing the error path in C, > and getting the system optimised memset() rather than using a form which > is definitely sub-optimal on all 64bit processors. I think the reason for having done this in assembly is to be able to wire back to the faulting instruction. On top of what you say we could do we'd then further need to put the whole thing in a loop, or add a 3rd FXSTOR. Which isn't to say that the overall result then isn't going to be neater. What I'm not concerned of with this fallback code is performance, though. That fixup path better wouldn't be taken anyway. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |