[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface for setting PMU mode and flags

On 10/29/2014 04:14 AM, Jan Beulich wrote:
On 28.10.14 at 17:56, <boris.ostrovsky@xxxxxxxxxx> wrote:
On 10/28/2014 04:29 AM, Jan Beulich wrote:
Also "I can set a bit in ..." is too vague to say whether that would end
up being an acceptable approach. The rationale behind the final
behavior we gave the XSA-97 fix was that if the operation is privileged
enough, it is okay for any vCPU of the originating domain to continue
the current one (including the non-determinism of which of them will
see the final successful completion of the hypercall, should more than
one of them race). I think you ought to follow that model here and
store/check the domain rather than the vCPU, in which case I don't
think you'll need any extra bit(s).
I am not sure just keeping domainID is sufficient in this case. True, it
doesn't matter which VCPU completes the operation but what we want to
avoid is to have two simultaneous (and possibly different) requests from
the same domain. If we keep it as some sort of a static variable (like I
do now with sync_vcpu) then it will be difficult to distinguish which
request is the continuation and which is a new one.
A static variable may indeed be insufficient here. Did you look at
the XSA-97 change at all, trying to mirror its logic here?

You mean storing this in domain structure? I don't want to add new fields to such a common structure for an operation that is exceedingly inferquent.

What I was suggesting is keeping some sort of state in the hypercall
argument making it unique to the call. I said "a bit" but it can be, for
example, setting the pad value in xen_pmu_params to some cookie
(although that's probably not a particularly good idea since then the
caller/domain would have to clear it before making the hypercall). So,
if we set, say, the upper bit in xen_pmu_params.val before creating
continuation then when we come back we will know for sure that this is
indeed the continuation and not a new call.
Whatever state in the hypercall arguments you alter, a malicious or
buggy caller could do the same to an original request.

However, I wonder whether a model without continuations (and
hence not along the lines of what we did for XSA-97) might not be
better here after all:

1) Considering that you don't need access to the hypercall
arguments after initial evaluation, continue_hypercall_on_cpu()
would seem usable here: Once you visited all CPUs, you can be
certain a context switch occurred everywhere.

2) You could pause the current vCPU after scheduling all tasklets
and have the last one unpause it and do the necessary cleanup.

This sounds simpler than what I have now.

I don't think I will need the tasklets with this approach: they are all part of continue_hypercall_on_cpu()?

As for pausing the VCPU? Won't the continue_hypercall_on_cpu() keep it asleep until everyone has completed?

+ cont_wait:
+    /*
+     * Note that we may fail here if a CPU is hot-plugged while we are
+     * waiting. We will then time out.
+     */
And I continue to miss the handling of the hot-unplug case (or at the
very least a note on this being unimplemented [and going to crash],
to at least clarify matters to the curious reader).
Where would we crash? I have no interest in that.
per_cpu() accesses are invalid for offline CPUs.

How about I get/put_cpu_maps() to prevent hotplug/unplug while we are
doing this?
That's more the last resort solution. I'd prefer if you made your loops
simply a little more careful. Remember that hot-unplug can't happen
while your code is executing, it can only hit while you are awaiting a
continuation to occur.

I didn't realize that. But let me try what you suggested above.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.