[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 for-xen-4.5 16/20] x86/VPMU: Handle PMU interrupts for PV guests

On 09/30/2014 11:44 AM, Jan Beulich wrote:

+static struct vcpu *choose_hwdom_vcpu(void)
+    struct vcpu *v;
+    unsigned idx = smp_processor_id() % hardware_domain->max_vcpus;
+    if ( hardware_domain->vcpu == NULL )
+        return NULL;
+    v = hardware_domain->vcpu[idx];
+    /*
+     * If index is not populated search downwards the vcpu array until
+     * a valid vcpu can be found
+     */
+    while ( !v && idx-- )
+        v = hardware_domain->vcpu[idx];
Each time I get here I wonder what case this is good for.
I thought we can have a case when first hardware_domain->vcpu[idx] is
NULL so we walk the array down until we find the first non-NULL vcpu.
Hot unplug, for example (we may be calling this from NMI context).
Hot unplug of a vCPU is a guest thing - this doesn't destroy the
vCPU in the hypervisor.

OK, I don't need this loop then.

   int vpmu_do_interrupt(struct cpu_user_regs *regs)
-    struct vcpu *v = current;
-    struct vpmu_struct *vpmu = vcpu_vpmu(v);
+    struct vcpu *sampled = current, *sampling;
+    struct vpmu_struct *vpmu;
+    /* dom0 will handle interrupt for special domains (e.g. idle domain) */
+    if ( sampled->domain->domain_id >= DOMID_FIRST_RESERVED )
+    {
+        sampling = choose_hwdom_vcpu();
+        if ( !sampling )
+            return 0;
+    }
+    else
+        sampling = sampled;
+    vpmu = vcpu_vpmu(sampling);
+    if ( !is_hvm_domain(sampling->domain) )
+    {
+        /* PV(H) guest */
+        const struct cpu_user_regs *cur_regs;
+        if ( !vpmu->xenpmu_data )
+            return 0;
+        if ( vpmu->xenpmu_data->pmu_flags & PMU_CACHED )
+            return 1;
+        if ( is_pvh_domain(sampled->domain) &&
Here and below - is this really the right condition? I.e. is the
opposite case (doing nothing here, but the one further down
having an else) really meant to cover both HVM and PV? The outer
!is_hvm_() doesn't count here as that acts on sampling, not
This is test for an error in do_interrupt() --- if it reported a failure
then there is no reason to proceed further.
That's not the question. Why is this done only for PVH?

This should be sampling, i.e. the guest who is managing the HW PMU MSR. Not sampled.

+            {
+                r->cs = cur_regs->cs;
+                if ( sampled->arch.flags & TF_kernel_mode )
+                    r->cs &= ~3;
And once again I wonder how the consumer of this data is to tell
apart guest kernel and hypervisor addresses.
Based on the RIP --- perf, for example, searches through various symbol
That doesn't help when profiling HVM/PVH guests - addresses are
ambiguous in that case.

Hypervisor traces are only sent to dom0, which is currently PV only. The key here, of course, is the word 'currently'.

I suppose I can set xenpmu_data->domain_id below to either DOMID_SELF
for guest and DOMID_XEN for the hypervisor.
That's an option, but I'm really having reservations against simulating
ring-0 execution in PV guests here. It would certainly be better if we
could report reality here, but I can see reservations on the consumer
(perf) side against us doing so.

Yes, perf will probably not like it --- as I mentioned in an earlier message, it calls user_mode(regs) which is essentially !!(regs->cs & 3).


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.