
Re: [Xen-devel] performace issue when turn on apicv



I will set up a test environment for upstream Xen, but that will take some time.
Meanwhile, can you give me some idea of what may cause this problem?

I have been using xentrace to trace the problem. From what I see, the apicv
feature itself works:

apicv=1
             2583096 VMEXIT          19027420184 TSC WRMSR
              459708 VMEXIT           6924749392 TSC External interrupt
              293818 VMEXIT            451974088 TSC Virtualized EOI
                 843 VMEXIT             54729244 TSC I/O instruction
                3260 VMEXIT             15979024 TSC Control-register accesses
                1345 VMEXIT              2199736 TSC Exception or non-maskable interrupt (NMI)
                  39 VMEXIT              1516768 TSC EPT violation
                  54 VMEXIT               891712 TSC VMCALL
                 205 VMEXIT               370864 TSC CPUID

apicv=0
             3416159 VMEXIT          20929093044 TSC WRMSR
             1098428 VMEXIT          11029334704 TSC External interrupt
               41128 VMEXIT             64360924 TSC Interrupt window
                 664 VMEXIT             49245372 TSC I/O instruction
                3221 VMEXIT             20116036 TSC Control-register accesses
                1401 VMEXIT              2280412 TSC Exception or non-maskable interrupt (NMI)
                  39 VMEXIT              1581428 TSC EPT violation
                  53 VMEXIT               749588 TSC VMCALL
                 205 VMEXIT               355500 TSC CPUID
                 113 VMEXIT               298568 TSC RDMSR

The RDMSR exits are gone, so "APIC Register Virtualization" works.
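
To compare the two runs per exit rather than in total, the counts and TSC
totals above can be divided out. The following is a minimal Python sketch, not
part of xentrace; the dictionary names and the helper mean_cycles are made up
for illustration, and only the three most frequent exit reasons of each
summary are copied in.

```python
# (exit count, total TSC cycles) per exit reason, copied from the summaries above.
APICV_ON = {
    "WRMSR": (2583096, 19027420184),
    "External interrupt": (459708, 6924749392),
    "Virtualized EOI": (293818, 451974088),
}
APICV_OFF = {
    "WRMSR": (3416159, 20929093044),
    "External interrupt": (1098428, 11029334704),
    "Interrupt window": (41128, 64360924),
}


def mean_cycles(table):
    """Return the average TSC cycles spent per exit for each exit reason."""
    return {reason: total / count for reason, (count, total) in table.items()}


if __name__ == "__main__":
    on, off = mean_cycles(APICV_ON), mean_cycles(APICV_OFF)
    for reason in ("WRMSR", "External interrupt"):
        print(f"{reason:20s} apicv=1: {on[reason]:10.0f}  apicv=0: {off[reason]:10.0f}")
```

On these figures a WRMSR exit costs roughly 7400 cycles on average with
apicv=1 and roughly 6100 with apicv=0, so the exit count drops while the cost
per exit does not.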

apicv=1
IRQ                  IRQ_MOVE_CLEANUP_VECTOR(  32):                    21
IRQ                       LOCAL_TIMER_VECTOR( 249):                423804
IRQ                     CALL_FUNCTION_VECTOR( 251):                  1171
IRQ                       EVENT_CHECK_VECTOR( 252):                  1130
IRQ                    INVALIDATE_TLB_VECTOR( 253):                     1
apicv=0
IRQ                  IRQ_MOVE_CLEANUP_VECTOR(  32):                    22
IRQ                      LAST_DYNAMIC_VECTOR( 223):                    27
IRQ                       LOCAL_TIMER_VECTOR( 249):                448057
IRQ                     CALL_FUNCTION_VECTOR( 251):                  1173
IRQ                       EVENT_CHECK_VECTOR( 252):                608024

VM exits caused by external interrupts: EVENT_CHECK_VECTOR is reduced a lot,
so I guess "Virtual Interrupt Delivery" and "Posted Interrupt Processing" work
as well.
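
The per-vector change between the two runs can be tabulated directly from the
IRQ counts above. This is only an illustrative Python sketch; the names
IRQ_ON, IRQ_OFF and irq_delta are hypothetical, and a vector missing from one
run is treated as a count of zero.

```python
# Interrupt counts per vector, copied from the two IRQ listings above.
IRQ_ON = {
    "IRQ_MOVE_CLEANUP_VECTOR": 21,
    "LOCAL_TIMER_VECTOR": 423804,
    "CALL_FUNCTION_VECTOR": 1171,
    "EVENT_CHECK_VECTOR": 1130,
    "INVALIDATE_TLB_VECTOR": 1,
}
IRQ_OFF = {
    "IRQ_MOVE_CLEANUP_VECTOR": 22,
    "LAST_DYNAMIC_VECTOR": 27,
    "LOCAL_TIMER_VECTOR": 448057,
    "CALL_FUNCTION_VECTOR": 1173,
    "EVENT_CHECK_VECTOR": 608024,
}


def irq_delta(off, on):
    """Return count(apicv=0) - count(apicv=1) for every vector seen in either run."""
    vectors = sorted(set(off) | set(on))
    return {v: off.get(v, 0) - on.get(v, 0) for v in vectors}


if __name__ == "__main__":
    for vector, delta in irq_delta(IRQ_OFF, IRQ_ON).items():
        print(f"{vector:26s} {delta:+d}")
```

EVENT_CHECK_VECTOR accounts for a drop of 606894 interrupts, far more than any
other vector.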

I think the problem is not caused by apicv itself; maybe some other logic
conflicts with apicv.

On 2015/6/11 15:35, Zhang, Yang Z wrote:
Liuqiming (John) wrote on 2015-06-11:
> Hi,
>
> Recently I encountered a strange performance problem with APIC virtualization.
>
> My host has an Intel(R) Xeon(R) CPU E7-4890 v2 installed, which supports
> APIC virtualization and x2apic, and there are 4 sockets * 15
> cores_per_socket = 60 cores available for the VM. There is also an SSD on
> the host and the host supports VT-d, so I can pass this SSD through to the VM.
>
> The VM was created with 60 vcpus, 400G memory and the SSD assigned.
> I pin these vcpus 1:1 to physical cpus, and keep only 15 vcpus online
> in the VM.
> The problem is: when apicv is turned on, a significant performance decrease
> can be observed, and it seems related to cpu topology.
>
> I have tested the following cases:
> apicv=1:
> 1)  ONLINE VCPU {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}
>           PIN TO
>              PCPU {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}
> 2)  ONLINE VCPU {0,5,9,13,17,21,25,29,33,37,41,45,49,53,57}
>           PIN TO
>              PCPU {0,5,9,13,17,21,25,29,33,37,41,45,49,53,57}
> apicv=0:
> 3)  ONLINE VCPU {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}
>           PIN TO
>              PCPU {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}
> 4)  ONLINE VCPU {0,5,9,13,17,21,25,29,33,37,41,45,49,53,57}
>           PIN TO
>              PCPU {0,5,9,13,17,21,25,29,33,37,41,45,49,53,57}
> The results are (lower is better):
> 1) 989
> 2) 2209
> 3) 978
> 4) 1130
>
> It is a database test case running on a suse11sp3 system in the VM, and I
> have traced that the "read" and "write" syscalls get much slower in case 2).
>
> I have disabled NUMA in the BIOS, so it seems apicv causes this bad
> performance when using cpus on different nodes.
>
> Can anyone shed some light on this?
>
> Btw, I am using Xen 4.1.5 with apicv backported, so I am not
> sure whether something broke when backporting or apicv just behaves
> this way.

Can you retest it based on upstream Xen? Just as you suspected, your 
backporting may be the culprit.
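
For reference, the four results quoted above can be reduced to a "spread
versus contiguous" penalty per apicv setting, together with the number of
sockets each pinning touches. This is a rough Python sketch: the names are
hypothetical, and the socket mapping assumes pcpus are numbered socket by
socket with 15 cores each (socket = pcpu // 15), which the original report
does not state.

```python
# Benchmark results from the quoted report (lower is better).
RESULTS = {
    ("apicv=1", "contiguous"): 989,
    ("apicv=1", "spread"): 2209,
    ("apicv=0", "contiguous"): 978,
    ("apicv=0", "spread"): 1130,
}

# pCPU sets used for pinning, copied from cases 1)/3) and 2)/4).
CONTIGUOUS = list(range(15))
SPREAD = [0, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57]

CORES_PER_SOCKET = 15  # from the report: 4 sockets * 15 cores


def spread_penalty(apicv):
    """Ratio of the spread result to the contiguous result for one apicv setting."""
    return RESULTS[(apicv, "spread")] / RESULTS[(apicv, "contiguous")]


def sockets_touched(pcpus):
    """Sockets used by a pinning, ASSUMING socket = pcpu // CORES_PER_SOCKET."""
    return sorted({p // CORES_PER_SOCKET for p in pcpus})


if __name__ == "__main__":
    for setting in ("apicv=1", "apicv=0"):
        print(f"{setting}: spread/contiguous = {spread_penalty(setting):.2f}")
    print("contiguous sockets:", sockets_touched(CONTIGUOUS))
    print("spread sockets:    ", sockets_touched(SPREAD))
```

Under that numbering assumption the contiguous pinning stays on one socket
while the spread pinning covers all four, and the spread penalty is a bit over
2.2x with apicv=1 against about 1.16x with apicv=0.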



Best regards,
Yang




