
Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue

On August 30, 2016 1:58 PM, Tian, Kevin <kevin.tian@xxxxxxxxx> wrote:
>> From: Xuquan (Euler) [mailto:xuquan8@xxxxxxxxxx]
>> Sent: Friday, August 19, 2016 8:59 PM
>> When Xen apicv is enabled, wall clock time is faster on a Windows7-32
>> guest under high load (with 2 vCPUs; captured from xentrace: under high
>> load, the count of IPI interrupts between these vCPUs increases rapidly).
>> If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
>> 0xd1) are both pending (bits set in VIRR), the IPI interrupt
>> unfortunately has higher priority than the periodic timer interrupt. Xen
>> updates the IPI interrupt bit set in VIRR into the guest interrupt
>> status (RVI) as the higher priority, and apicv (virtual-interrupt
>> delivery) delivers the IPI interrupt within VMX non-root operation
>> without a VM exit. Within VMX non-root operation, if the periodic timer
>> interrupt bit is set in VIRR and is the highest, apicv delivers the
>> periodic timer interrupt within VMX non-root operation as well.
>> But in the current code, if Xen doesn't update the periodic timer
>> interrupt bit set in VIRR into the guest interrupt status (RVI)
>> directly, Xen is not aware of this case and does not decrease the count
>> (pending_intr_nr) of pending periodic timer interrupts, so Xen will
>> deliver the periodic timer interrupt again; the guest then receives too
>> many periodic timer interrupts.
>> If the periodic timer interrupt is delivered but is not the highest
>> priority, make Xen aware of this case so it decreases the count of
>> pending periodic timer interrupts.
>> Signed-off-by: Yifei Jiang <jiangyifei@xxxxxxxxxx>
>> Signed-off-by: Rongguang He <herongguang.he@xxxxxxxxxx>
>> Signed-off-by: Quan Xu <xuquan8@xxxxxxxxxx>
>> ---
>> Why RFC:
>> 1. I am not quite sure about other cases, such as the nested case.
>> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including
>>    interrupts, non-maskable interrupts, system-management interrupts,
>>    and VM exits) may occur before delivery of a periodic timer interrupt,
>>    so a timer interrupt may be lost when the next periodic timer
>>    interrupt arrives. The current code behaves the same way.
>> ---
>>  xen/arch/x86/hvm/vmx/intr.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
>> index 8fca08c..d3a034e 100644
>> --- a/xen/arch/x86/hvm/vmx/intr.c
>> +++ b/xen/arch/x86/hvm/vmx/intr.c
>> @@ -334,7 +334,21 @@ void vmx_intr_assist(void)
>>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>>          }
>> -        pt_intr_post(v, intack);
>> +        /*
>> +         * If the periodic timer interrupt is delivered but is not the
>> +         * highest priority, make Xen aware of this case so it decreases
>> +         * the count of pending periodic timer interrupts.
>> +         */
>> +        if ( pt_vector != -1 && intack.vector > pt_vector )
>> +        {
>> +            struct hvm_intack pt_intack;
>> +
>> +            pt_intack.vector = pt_vector;
>> +            pt_intack.source = hvm_intsrc_lapic;
>> +            pt_intr_post(v, pt_intack);
>> +        }
>> +        else
>> +            pt_intr_post(v, intack);
>>      }
>Here is my thought. w/o the above patch, one pending pt interrupt may result
>in multiple injections of the guest timer interrupt, as you identified, when
>pt_vector happens not to be the highest pending vector. Then w/ your patch,
>multiple pending pt interrupt instances may be combined into one injection
>of the guest timer interrupt. Especially, intack.vector > pt_vector doesn't
>mean pt_intr is eligible for injection, which might be blocked due to other
>reasons. As Jan pointed out, this path is very fragile, so better we have a
>fix that sustains the original behavior, with one pending pt instance
>causing one actual injection.
>What about doing pt_intr_post in the EOI-induced VM-exit instead? The EOI
>exit bit for pt_intr is already set to 1, which means every injection would
>incur an EOI-induced VM-exit:
>
>        /*
>         * Set eoi_exit_bitmap for periodic timer interrupt to cause an
>         * EOI-induced exit, then pending periodic timer interrupts have
>         * the chance to be injected for compensation.
>         */
>        if ( pt_vector != -1 )
>            vmx_set_eoi_exit_bitmap(v, pt_vector);
>I don't think delaying the post makes much difference here. Even if we post
>earlier as your patch does, further pending instances have to wait until the
>current instance is completed. So as long as the post happens before EOI
>completion, it should be fine.

Kevin, I verified your suggestion with the modification below. Wall clock
time is __still__ faster. I hope this modification correctly implements your
suggestion.

I __guess__ that VM entries are more frequent than PT interrupts.
Specifically, if there is a PT interrupt pending, the count (pending_intr_nr)
is 1.

Before the next PT interrupt arrives, a PT interrupt injection may not incur
an EOI-induced VM-exit immediately, so multiple injections can happen for one
pending PT interrupt. Multiple EOI-induced VM-exits then occur, each calling
pt_intr_post() to decrease the count (pending_intr_nr); but since the count
is only 1 before the next PT interrupt arrives, only one pt_intr_post() is
effective.
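
As a self-contained illustration of this counting effect (made-up names,
not the Xen code; compiles with gcc):

#include <stdio.h>

static int pending_intr_nr;   /* pending PT interrupt instances          */
static int injected;          /* timer interrupts the guest has received */

static void pt_tick(void)     /* the periodic timer fires once           */
{
    pending_intr_nr++;
}

static void inject(void)      /* Xen injects the timer vector            */
{
    injected++;
}

static void eoi_exit(void)    /* EOI-induced VM-exit -> pt_intr_post()   */
{
    if ( pending_intr_nr > 0 )
        pending_intr_nr--;    /* only effective while the count is > 0   */
}

int main(void)
{
    pt_tick();                /* one instance pending: count = 1         */
    inject();                 /* first injection                         */
    inject();                 /* re-injected before any EOI VM-exit      */
    eoi_exit();               /* first post: 1 -> 0 (effective)          */
    eoi_exit();               /* second post: no-op, cannot compensate   */
    printf("%d timer interrupts injected for 1 tick\n", injected);
    return 0;
}

With two injections but only one effective post per tick, the guest sees
extra ticks, which matches the fast wall clock.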

Thanks for your time!!

======= modification ========

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 1d5d287..cc247c3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic)
 void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
     struct domain *d = vlapic_domain(vlapic);
+    struct hvm_intack pt_intack;
+
+    pt_intack.vector = vector;
+    pt_intack.source = hvm_intsrc_lapic;
+    pt_intr_post(vlapic_vcpu(vlapic), pt_intack);
 
     if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
         vioapic_update_EOI(d, vector);
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 8fca08c..29d9bbf 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -333,8 +333,6 @@ void vmx_intr_assist(void)
             clear_bit(i, &v->arch.hvm_vmx.eoi_exitmap_changed);
             __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
         }
-
-        pt_intr_post(v, intack);

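Note that with this modification pt_intr_post() runs on every EOI, not only
for the timer vector; I rely on pt_intr_post() returning early when the
EOI'd vector does not belong to a periodic timer. A self-contained model of
that guard (made-up names, not the actual vpt.c code):

#include <stdio.h>
#include <stddef.h>

struct periodic_time { int pending_intr_nr; unsigned char vector; };

/* Model of the lookup done by is_pt_irq(): return the timer whose
 * vector matches the EOI'd vector, or NULL if there is none. */
static struct periodic_time *find_pt(struct periodic_time *pt,
                                     unsigned char eoi_vector)
{
    return (pt != NULL && pt->vector == eoi_vector) ? pt : NULL;
}

/* Model of pt_intr_post() as called from vlapic_handle_EOI(). */
static void post_on_eoi(struct periodic_time *pt, unsigned char eoi_vector)
{
    struct periodic_time *hit = find_pt(pt, eoi_vector);

    if ( hit == NULL )          /* EOI of a non-timer vector: no-op */
        return;
    if ( hit->pending_intr_nr > 0 )
        hit->pending_intr_nr--; /* compensate one injected instance */
}

int main(void)
{
    struct periodic_time pt = { .pending_intr_nr = 1, .vector = 0xd1 };

    post_on_eoi(&pt, 0xe1);     /* IPI EOI: ignored        */
    post_on_eoi(&pt, 0xd1);     /* timer EOI: count 1 -> 0 */
    printf("pending_intr_nr = %d\n", pt.pending_intr_nr);
    return 0;
}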