Re: [Xen-devel] [PATCH 4/4] SVM: streamline entry.S code
>>> On 26.08.13 at 23:47, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> On 23/08/2013 15:04, Jan Beulich wrote:
>> --- a/xen/arch/x86/hvm/svm/entry.S
>> +++ b/xen/arch/x86/hvm/svm/entry.S
>> @@ -32,28 +32,34 @@
>>  #define CLGI   .byte 0x0F,0x01,0xDD
>>
>>  ENTRY(svm_asm_do_resume)
>> +        GET_CURRENT(%rbx)
>> +.Lsvm_do_resume:
>>          call svm_intr_assist
>>          mov  %rsp,%rdi
>>          call nsvm_vcpu_switch
>>          ASSERT_NOT_IN_ATOMIC
>>
>> -        GET_CURRENT(%rbx)
>> -        CLGI
>> -
>>          mov  VCPU_processor(%rbx),%eax
>> -        shl  $IRQSTAT_shift,%eax
>>          lea  irq_stat+IRQSTAT_softirq_pending(%rip),%rdx
>> -        cmpl $0,(%rdx,%rax,1)
>> +        xor  %ecx,%ecx
>> +        shl  $IRQSTAT_shift,%eax
>> +        CLGI
>> +        cmp  %ecx,(%rdx,%rax,1)
>>          jne  .Lsvm_process_softirqs
>>
>> -        testb $0, VCPU_nsvm_hap_enabled(%rbx)
>> -UNLIKELY_START(nz, nsvm_hap)
>
> Isn't this as much a bugfix as an optimisation?  test $0, <anything>
> should unconditionally set the zero flag, making the UNLIKELY_START()
> unreachable code.

Indeed, I shall add a note about this to the description.

>>          call svm_asid_handle_vmrun
>
> The next instruction after this hunk is "cmpb $0,tl_init_done(%rip)",
> where %cl could be used, in the same style of changes as above.

Yeah, but the call clobbers %rcx.

>> @@ -72,13 +78,12 @@ UNLIKELY_END(svm_trace)
>>          mov  UREGS_eflags(%rsp),%rax
>>          mov  %rax,VMCB_rflags(%rcx)
>
> This mov could be moved down as well, which will break the data hazard
> on %rax.  It might however want a comment with it, as that would
> certainly make the "mov UREGS_eflags(%rsp),%rax; mov
> %rax,VMCB_rflags(%rcx)" pair less obvious.

That would be applicable to the whole series of moves here. However, I
don't think there's a dependency issue here on modern CPUs (sadly
neither Intel's nor AMD's optimization docs go into enough detail about
this), largely thanks to register renaming.

>> @@ -92,25 +97,26 @@ UNLIKELY_END(svm_trace)
>>
>>          VMRUN
>>
>> +        GET_CURRENT(%rax)
>>          push %rdi
>>          push %rsi
>>          push %rdx
>>          push %rcx
>> +        mov  VCPU_svm_vmcb(%rax),%rcx
>>          push %rax
>
> As regs->rax is overwritten just below, this could be replaced with
> "sub $8,%rsp", which mirrors the add before VMRUN and would also help
> to break up the queue of pushes in the pipeline.

But whether the CPU can better handle successive pushes, or pushes
(producing and consuming %rsp) interleaved with an ALU adjustment to
%rsp, is not clear.

> All of those points were just nitpicking, and the changes are mainly
> along a similar line to the VMX changes.  I still have the same
> concern about jumping back to the top of the function, but I shall
> investigate that and start a new thread.
>
> Therefore,
> Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Thanks.

Jan
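For reference, the "test $0" point is easy to demonstrate in isolation:
TEST ANDs its operands and sets ZF from the result, so an immediate of
zero yields zero regardless of the other operand, ZF is always set, and
a following jnz (the "nz" in UNLIKELY_START) can never be taken. Below
is a minimal sketch using GCC inline assembly, not code from the patch;
the "flag" variable is just a hypothetical stand-in for the
VCPU_nsvm_hap_enabled field:

    /* test_zero.c: build with "gcc -O2 test_zero.c" (x86-64 only). */
    #include <stdio.h>

    int main(void)
    {
        unsigned char flag = 1;   /* nonzero, yet ZF still ends up set */
        unsigned char zf;

        asm("testb $0, %1\n\t"    /* computes 0 & flag == 0 */
            "sete  %0"            /* zf = 1 iff ZF is set */
            : "=q" (zf)
            : "m" (flag)
            : "cc");

        printf("ZF after 'testb $0, flag': %u\n", zf);   /* prints 1 */
        return 0;
    }

The natural fix is to compare the byte against zero instead (e.g.
"cmpb $0, VCPU_nsvm_hap_enabled(%rbx)"), which sets ZF according to the
field's actual value.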