[xen master] x86/hap: be more selective with assisted TLB flush

commit 17b997aa1edb9eb8d9bd1958457ff50927f46832
Author:     Roger Pau Monné <roger.pau@xxxxxxxxxx>
AuthorDate: Mon May 4 11:53:01 2020 +0200
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Mon May 4 11:53:01 2020 +0200

    x86/hap: be more selective with assisted TLB flush

    When doing an assisted flush on HAP, the purpose of the
    on_selected_cpus call is just to trigger a vmexit on remote CPUs that
    are in guest context. Using only is_vcpu_dirty_cpu is therefore too
    lax: also check that the vCPU is running. Due to the lazy context
    switching done by Xen, dirty_cpu won't always be cleared when the
    guest vCPU is not running, so checking is_running gives a more
    accurate indication of whether the vCPU is actually running.

    I've measured the time of the non-local branch of flush_area_mask
    inside the shim running with 32 vCPUs over 100000 executions and
    averaged the result on a large Westmere system (80 ways total). The
    figures were collected during the boot of a SLES 11 PV guest. The
    results are as follows (lower is better):

    Non assisted flush with x2APIC:      112406ns
    Assisted flush without this patch:   820450ns
    Assisted flush with this patch:        8330ns

    While there, also pass NULL as the data parameter of on_selected_cpus,
    since the dummy handler doesn't consume the data in any way.

    Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
    Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
 xen/arch/x86/mm/hap/hap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 580d1c2164..0275cdf5c8 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -719,7 +719,7 @@ static bool flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
         cpu = read_atomic(&v->dirty_cpu);
-        if ( cpu != this_cpu && is_vcpu_dirty_cpu(cpu) )
+        if ( cpu != this_cpu && is_vcpu_dirty_cpu(cpu) && v->is_running )
             __cpumask_set_cpu(cpu, mask);
@@ -729,7 +729,7 @@ static bool flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
      * not currently running will already be flushed when scheduled because of
      * the ASID tickle done in the loop above.
-    on_selected_cpus(mask, dummy_flush, mask, 0);
+    on_selected_cpus(mask, dummy_flush, NULL, 0);
     return true;
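
The effect of the extra is_running check on the IPI target mask can be
illustrated with a small standalone model. This is only a sketch and not
Xen code: struct model_vcpu, MODEL_CPU_CLEAN, model_is_dirty_cpu and
build_ipi_mask are hypothetical stand-ins for the real vcpu structure,
the "no dirty CPU" marker, is_vcpu_dirty_cpu and the mask-building loop
in flush_tlb, and a plain unsigned long replaces the cpumask.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker meaning "vCPU state is not loaded on any CPU". */
#define MODEL_CPU_CLEAN (~0u)

/* Hypothetical, heavily reduced stand-in for a vCPU. */
struct model_vcpu {
    unsigned int dirty_cpu;  /* CPU still holding this vCPU's state */
    bool is_running;         /* true only while actually scheduled */
};

/* Models the dirty-CPU test: any value other than the clean marker. */
static bool model_is_dirty_cpu(unsigned int cpu)
{
    return cpu != MODEL_CPU_CLEAN;
}

/*
 * Build the set of remote CPUs to interrupt, as a bitmask.
 * check_running == false models the condition before the patch,
 * check_running == true models the condition after it.
 */
static unsigned long build_ipi_mask(const struct model_vcpu *vcpus, size_t n,
                                    unsigned int this_cpu, bool check_running)
{
    unsigned long mask = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int cpu = vcpus[i].dirty_cpu;

        if (cpu == this_cpu || !model_is_dirty_cpu(cpu))
            continue;
        if (check_running && !vcpus[i].is_running)
            continue;  /* lazily switched out: no vmexit needed */
        if (cpu < sizeof(mask) * CHAR_BIT)
            mask |= 1UL << cpu;
    }

    return mask;
}
```

With a vCPU whose state is still lazily held on a remote CPU but which is
not running, the model without the is_running check includes that CPU in
the mask, while the model with the check leaves it out, so fewer remote
CPUs are interrupted.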
generated by git-patchbot for /home/xen/git/xen.git#master



