
Re: [Xen-devel] [PATCH v4] x86/apicv: fix RTC periodic timer and apicv issue



Hi Chao,
To me, it is sufficient. Thanks for your verification!


Quan

On December 22, 2016 4:02 AM, Chao Gao wrote:
>Hi, xuquan.
>I have tested it on my Skylake server. W/o this patch, the inaccurate wall
>clock time issue only exists in the Win7-32 guest. Win7-64, Win8-32, Win8-64,
>Win10-32, Win10-64 and linux-4.8.0+ guests don't have this issue.
>W/ this v4 patch, the issue disappears in the Win7-32 guest and no wall clock
>time related regression is found on Win7-64, Win8-32, Win8-64, Win10-32,
>Win10-64 and linux-4.8.0+ guests.
>
>In the Windows guest, the test procedure is:
>1. Create a Windows guest with 2 vCPUs.
>2. Run the following .bat in the guest:
>    :abcd
>    echo 111111
>    goto abcd
>
>3. Start a stop-watch outside the guest and monitor the clock at the lower
>   right corner in the guest. After 120 seconds according to the guest clock,
>   stop the stop-watch. If the time shown on the stop-watch is about 120
>   seconds, then I think the guest does not have the above issue. Otherwise,
>   the guest time is inaccurate.
>
>In the Win7-32 case, the stop-watch time is about 70 seconds, so the clock
>in the guest is obviously inaccurate.
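
To put a number on the stop-watch comparison above, here is a minimal sketch. The helper name guest_clock_ratio is hypothetical and not part of the patch or the test; it only divides the guest-observed time by the stop-watch time.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): ratio of the time elapsed
 * on the guest clock to the time elapsed on the external stop-watch.
 * 1.0 means the guest keeps pace; above 1.0 means the guest clock runs fast.
 */
static double guest_clock_ratio(double guest_secs, double host_secs)
{
    assert(host_secs > 0.0);   /* a zero-length measurement is meaningless */
    return guest_secs / host_secs;
}
```

With the Win7-32 numbers, guest_clock_ratio(120.0, 70.0) is roughly 1.71, so the guest clock ran about 71% fast. A result close to 1.0 is what the other guests showed.
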
>
>In the Linux guest, the test procedure is:
>1. Create a Linux guest with 4 vCPUs.
>2. insmod the following Linux module
>   (according to /proc/interrupts, about 850000 IPIs in 13 seconds).
>3. Use the date command to get the guest time; the other steps are the same
>   as in the Windows guest test.
>
>#include <linux/init.h>
>#include <linux/module.h>
>#include <linux/kthread.h>
>#include <linux/sched.h>
>#include <linux/delay.h>   /* msleep() */
>#include <linux/smp.h>     /* smp_call_function() */
>#include <linux/err.h>     /* IS_ERR(), PTR_ERR() */
>
>MODULE_LICENSE("GPL");
>
>/* Trivial function run on every other online CPU via IPI. */
>static void workload(void *info)
>{
>        asm volatile("nop");
>}
>
>/* Kernel thread that floods the other vCPUs with IPIs until stopped. */
>static int ipi_generator(void *info)
>{
>        int i;
>
>        while (!kthread_should_stop()) {
>                for (i = 0; i < 5 * 10000; i++)
>                        smp_call_function(workload, NULL, 1); /* wait for completion */
>                msleep(1);
>        }
>        return 0;
>}
>
>static struct task_struct *thread;
>
>static int __init ipi_init(void)
>{
>        thread = kthread_run(ipi_generator, NULL, "IPI");
>        if (IS_ERR(thread))
>                return PTR_ERR(thread);
>        return 0;
>}
>
>static void __exit ipi_exit(void)
>{
>        kthread_stop(thread);
>}
>
>module_init(ipi_init);
>module_exit(ipi_exit);
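
The IPI rate quoted in step 2 can be checked from two snapshots of /proc/interrupts. The following is only a sketch: the helper names sum_irq_row and irq_rate are hypothetical, and it assumes the function-call IPIs are counted on the row labelled "CAL" with one decimal counter per CPU followed by a text description.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: sum the per-CPU counters on one /proc/interrupts
 * row, e.g. " CAL:   1200   3400   Function call interrupts".
 * Returns 0 if the row does not start with "<label>:".
 */
static unsigned long long sum_irq_row(const char *row, const char *label)
{
    unsigned long long total = 0;
    size_t n = strlen(label);

    while (isspace((unsigned char)*row))
        row++;
    if (strncmp(row, label, n) != 0 || row[n] != ':')
        return 0;
    row += n + 1;

    for (;;) {
        char *end;

        while (*row == ' ' || *row == '\t')
            row++;
        if (!isdigit((unsigned char)*row))
            break;                      /* reached the description text */
        total += strtoull(row, &end, 10);
        row = end;
    }
    return total;
}

/* Interrupts per second between two snapshots taken secs apart. */
static double irq_rate(unsigned long long before,
                       unsigned long long after, double secs)
{
    assert(secs > 0.0 && after >= before);
    return (double)(after - before) / secs;
}
```

For the figures above, irq_rate(0, 850000, 13.0) comes to roughly 65,000 IPIs per second across the guest.
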
>
>Are these tests sufficient? Please let me know if you have any other
>thoughts.
>
>On Wed, Dec 21, 2016 at 05:44:08AM +0000, Xuquan (Quan Xu) wrote:
>>When Xen apicv is enabled, wall clock time is faster on Windows7-32
>>guest with high payload (with 2vCPU, captured from xentrace, in high
>>payload, the count of IPI interrupt increases rapidly between these
>>vCPUs).
>>
>>If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
>>0xd1) are both pending (both bits set in vIRR), the IPI interrupt
>>unfortunately has higher priority than the periodic timer interrupt. Xen
>>updates the guest interrupt status (RVI) with the IPI interrupt bit set
>>in vIRR, as the highest priority, and apicv (Virtual-Interrupt Delivery)
>>delivers the IPI interrupt within VMX non-root operation without a
>>VM-Exit. Within VMX non-root operation, if the periodic timer interrupt
>>bit is set in vIRR and is the highest, apicv delivers the periodic timer
>>interrupt within VMX non-root operation as well.
>>
>>But in the current code, if Xen doesn't update the guest interrupt status
>>(RVI) with the periodic timer interrupt bit set in vIRR directly, Xen is
>>not aware of this case and does not decrease the count (pending_intr_nr)
>>of pending periodic timer interrupts, so Xen will deliver a periodic
>>timer interrupt again.
>>
>>Also, since we update the periodic timer interrupt on every VM-entry,
>>there is a chance that an already-injected instance (before the
>>EOI-induced exit happens) will incur another pending IRR setting if a
>>VM-exit happens between virtual interrupt injection (vIRR->0, vISR->1)
>>and the EOI-induced exit (vISR->0), since pt_intr_post hasn't been
>>invoked yet; the guest then receives extra periodic timer interrupts.
>>
>>So we set eoi_exit_bitmap for intack.vector when it's higher than the
>>pending periodic time interrupts. This way we can guarantee there's
>>always a chance to post periodic time interrupts when periodic time
>>interrupts become the highest one.
>>
>>Signed-off-by: Quan Xu <xuquan8@xxxxxxxxxx>
>>---
>> xen/arch/x86/hvm/vmx/intr.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>>diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
>>index 639a705..0cf26b4 100644
>>--- a/xen/arch/x86/hvm/vmx/intr.c
>>+++ b/xen/arch/x86/hvm/vmx/intr.c
>>@@ -315,9 +315,13 @@ void vmx_intr_assist(void)
>>         * Set eoi_exit_bitmap for periodic timer interrup to cause EOI-induced VM
>>         * exit, then pending periodic time interrups have the chance to be injected
>>         * for compensation
>>+        * Set eoi_exit_bitmap for intack.vector when it's higher than pending
>>+        * periodic time interrupts. This way we can guarantee there's always a chance
>>+        * to post periodic time interrupts when periodic time interrupts becomes the
>>+        * highest one
>>         */
>>         if (pt_vector != -1)
>>-            vmx_set_eoi_exit_bitmap(v, pt_vector);
>>+            vmx_set_eoi_exit_bitmap(v, intack.vector);
>>
>>         /* we need update the RVI field */
>>         __vmread(GUEST_INTR_STATUS, &status);
>>@@ -334,7 +338,8 @@ void vmx_intr_assist(void)
>>             __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>>         }
>>
>>-        pt_intr_post(v, intack);
>>+        if ( intack.vector == pt_vector )
>>+            pt_intr_post(v, intack);
>>     }
>>     else
>>     {
>>--
>>1.8.3.4
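
For reference, the two conditions the patch changes can be modelled outside Xen. This is a sketch only: struct intr_decision and decide() are hypothetical names, and the model covers just the two patched checks in the quoted hunks, not the rest of vmx_intr_assist().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: what the patched branch would do. */
struct intr_decision {
    int  eoi_exit_vector;  /* vector set in eoi_exit_bitmap, -1 if none */
    bool post_pt;          /* whether pt_intr_post() would be called */
};

/*
 * Model of the two patched checks.
 * intack_vector: the highest-priority pending vector being delivered.
 * pt_vector:     the pending periodic timer vector, or -1 if none.
 */
static struct intr_decision decide(int intack_vector, int pt_vector)
{
    struct intr_decision d = { -1, false };

    /* Force an EOI-induced exit on whatever vector is delivered now,
     * so the pending timer interrupt gets a chance afterwards. */
    if (pt_vector != -1)
        d.eoi_exit_vector = intack_vector;

    /* Only account for the timer interrupt when it is the one delivered. */
    if (intack_vector == pt_vector)
        d.post_pt = true;

    return d;
}
```

With the vectors from the description, decide(0xe1, 0xd1) marks 0xe1 for an EOI-induced exit and does not post the timer interrupt, while decide(0xd1, 0xd1) marks 0xd1 and posts it.
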

