
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue

On August 30, 2016 1:58 PM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>> From: Xuquan (Euler) [mailto:xuquan8@xxxxxxxxxx]
>> Sent: Friday, August 19, 2016 8:59 PM
>> When Xen apicv is enabled, wall clock time is faster on a Windows7-32
>> guest under high load (with 2 vCPUs; captured from xentrace: under high
>> load, the count of IPI interrupts between these vCPUs increases rapidly).
>> If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
>> 0xd1) are both pending (bits set in VIRR), the IPI interrupt
>> unfortunately has higher priority than the periodic timer interrupt. Xen
>> updates the IPI interrupt bit set in VIRR into the guest interrupt
>> status (RVI) as the higher priority, and apicv (virtual-interrupt
>> delivery) delivers the IPI interrupt within VMX non-root operation
>> without a VM exit. Within VMX non-root operation, if the periodic timer
>> interrupt bit is set in VIRR and is the highest, apicv delivers the
>> periodic timer interrupt within VMX non-root operation as well.
>> But in the current code, if Xen doesn't update the periodic timer
>> interrupt bit set in VIRR into the guest interrupt status (RVI)
>> directly, Xen is not aware of this case and does not decrease the count
>> (pending_intr_nr) of pending periodic timer interrupts, so Xen will
>> deliver the periodic timer interrupt again; the guest then receives too
>> many periodic timer interrupts.
>> If the periodic timer interrupt is delivered but is not the highest
>> priority, make Xen aware of this case so it decreases the count of
>> pending periodic timer interrupts.
>> Signed-off-by: Yifei Jiang <jiangyifei@xxxxxxxxxx>
>> Signed-off-by: Rongguang He <herongguang.he@xxxxxxxxxx>
>> Signed-off-by: Quan Xu <xuquan8@xxxxxxxxxx>
>> ---
>> Why RFC:
>> 1. I am not quite sure about other cases, such as the nested case.
>> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including
>>    interrupts, non-maskable interrupts, system-management interrupts,
>>    and VM exits) may occur before delivery of a periodic timer interrupt,
>>    so a timer interrupt may be lost when the next periodic timer
>>    interrupt arrives. The current code behaves the same way.
>> ---
>>  xen/arch/x86/hvm/vmx/intr.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
>> index 8fca08c..d3a034e 100644
>> --- a/xen/arch/x86/hvm/vmx/intr.c
>> +++ b/xen/arch/x86/hvm/vmx/intr.c
>> @@ -334,7 +334,21 @@ void vmx_intr_assist(void)
>>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>>          }
>> -        pt_intr_post(v, intack);
>> +        /*
>> +         * If the periodic timer interrupt is delivered but is not the
>> +         * highest priority, make Xen aware of this case so it decreases
>> +         * the count of pending periodic timer interrupts.
>> +         */
>> +        if ( pt_vector != -1 && intack.vector > pt_vector )
>> +        {
>> +            struct hvm_intack pt_intack;
>> +
>> +            pt_intack.vector = pt_vector;
>> +            pt_intack.source = hvm_intsrc_lapic;
>> +            pt_intr_post(v, pt_intack);
>> +        }
>> +        else
>> +            pt_intr_post(v, intack);
>>      }
>Here is my thought. w/o the above patch, one pending pt interrupt may result
>in multiple injections of the guest timer interrupt, as you identified, when
>pt_vector happens not to be the highest pending vector. Then w/ your patch,
>multiple pending pt interrupt instances may be combined into one injection
>of the guest timer interrupt. Especially, intack.vector > pt_vector doesn't
>mean pt_intr is eligible for injection, which might be blocked due to other
>reasons. As Jan pointed out, this path is very fragile, so better we have a
>fix that sustains the original behavior, with one pending pt instance
>causing one actual injection.
>What about doing pt_intr_post in the EOI-induced VM-exit instead? The EOI
>exit bit for pt_intr is already set to 1, which means every injection would
>incur an EOI-induced VM-exit:
>
>        /*
>         * Set eoi_exit_bitmap for periodic timer interrupt to cause an
>         * EOI-induced exit, then pending periodic timer interrupts have
>         * the chance to be injected for compensation.
>         */
>        if ( pt_vector != -1 )
>            vmx_set_eoi_exit_bitmap(v, pt_vector);
>I don't think delaying the post makes much difference here. Even if we post
>earlier as your patch does, further pending instances have to wait until the
>current instance is completed. So as long as the post happens before EOI
>completion, it should be fine.

Kevin, I verified your suggestion with the modification below. Wall clock
time is __still__ faster. I hope this modification correctly implements your
suggestion.

I __guess__ that VM entries are more frequent than PT interrupts.
Specifically, if there is a PT interrupt pending, the count (pending_intr_nr)
is 1.

Before the next PT interrupt arrives, a PT interrupt injection may not incur
an EOI-induced VM-exit immediately, so multiple injections can happen for one
pending PT interrupt. Multiple EOI-induced VM-exits then occur, each calling
pt_intr_post() to decrease the count (pending_intr_nr); but since the count
is only 1 before the next PT interrupt arrives, only one pt_intr_post() is
effective.
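
As a self-contained illustration of this counting effect (made-up names,
not the Xen code; compiles with gcc):

#include <stdio.h>

static int pending_intr_nr;   /* pending PT interrupt instances          */
static int injected;          /* timer interrupts the guest has received */

static void pt_tick(void)     /* the periodic timer fires once           */
{
    pending_intr_nr++;
}

static void inject(void)      /* Xen injects the timer vector            */
{
    injected++;
}

static void eoi_exit(void)    /* EOI-induced VM-exit -> pt_intr_post()   */
{
    if ( pending_intr_nr > 0 )
        pending_intr_nr--;    /* only effective while the count is > 0   */
}

int main(void)
{
    pt_tick();                /* one instance pending: count = 1         */
    inject();                 /* first injection                         */
    inject();                 /* re-injected before any EOI VM-exit      */
    eoi_exit();               /* first post: 1 -> 0 (effective)          */
    eoi_exit();               /* second post: no-op, cannot compensate   */
    printf("%d timer interrupts injected for 1 tick\n", injected);
    return 0;
}

With two injections but only one effective post per tick, the guest sees
extra ticks, which matches the fast wall clock.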

Thanks for your time!!

======= modification ========

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 1d5d287..cc247c3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic)
 void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
     struct domain *d = vlapic_domain(vlapic);
+    struct hvm_intack pt_intack;
+
+    pt_intack.vector = vector;
+    pt_intack.source = hvm_intsrc_lapic;
+    pt_intr_post(vlapic_vcpu(vlapic), pt_intack);
 
     if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
         vioapic_update_EOI(d, vector);
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 8fca08c..29d9bbf 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -333,8 +333,6 @@ void vmx_intr_assist(void)
             clear_bit(i, &v->arch.hvm_vmx.eoi_exitmap_changed);
             __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
         }
-
-        pt_intr_post(v, intack);

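Note that with this modification pt_intr_post() runs on every EOI, not only
for the timer vector; I rely on pt_intr_post() returning early when the
EOI'd vector does not belong to a periodic timer. A self-contained model of
that guard (made-up names, not the actual vpt.c code):

#include <stdio.h>
#include <stddef.h>

struct periodic_time { int pending_intr_nr; unsigned char vector; };

/* Model of the lookup done by is_pt_irq(): return the timer whose
 * vector matches the EOI'd vector, or NULL if there is none. */
static struct periodic_time *find_pt(struct periodic_time *pt,
                                     unsigned char eoi_vector)
{
    return (pt != NULL && pt->vector == eoi_vector) ? pt : NULL;
}

/* Model of pt_intr_post() as called from vlapic_handle_EOI(). */
static void post_on_eoi(struct periodic_time *pt, unsigned char eoi_vector)
{
    struct periodic_time *hit = find_pt(pt, eoi_vector);

    if ( hit == NULL )          /* EOI of a non-timer vector: no-op */
        return;
    if ( hit->pending_intr_nr > 0 )
        hit->pending_intr_nr--; /* compensate one injected instance */
}

int main(void)
{
    struct periodic_time pt = { .pending_intr_nr = 1, .vector = 0xd1 };

    post_on_eoi(&pt, 0xe1);     /* IPI EOI: ignored        */
    post_on_eoi(&pt, 0xd1);     /* timer EOI: count 1 -> 0 */
    printf("pending_intr_nr = %d\n", pt.pending_intr_nr);
    return 0;
}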