|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v14 for-xen-4.5 11/21] x86/VPMU: Interface for setting PMU mode and flags
>>> On 27.10.14 at 19:52, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 10/27/2014 12:24 PM, Jan Beulich wrote:
>>>>> On 17.10.14 at 23:17, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> + mycpu = smp_processor_id();
>>> +
>>> + if ( sync_vcpu != NULL ) /* if set, we may be in hypercall
>>> continuation */
>>> + {
>>> + if (sync_vcpu != curr_vcpu )
>>> + /* We are not the original caller */
>>> + return -EAGAIN;
>> That would seem to be the wrong return value then. Also, the HAP
>> side fix for XSA-97 taught us that identifying a caller by vCPU is
>> problematic - in the course of the retries the kernel's scheduler
>> may move the calling process to a different vCPU, yet it's still the
>> legitimate original caller.
>
> If the process is rescheduled then we will time out operation.
And potentially never be able to complete it. Not an acceptable
model imo.
> I suppose I can set a bit in the argument's val to mark that particular
> argument as pending a continuation completion (I don't think we need to
> worry about malicious domain here since this is a privileged operation).
Privileged in the sense that it (conceptually) will always be restricted
to the hardware domain (or one granted equivalent privileges)?
Please don't forget about disaggregation - newly added code will not
get exceptions granted along the lines of XSA-77.
Also "I can set a bit in ..." is too vague to say whether that would end
up being an acceptable approach. The rationale behind the final
behavior we gave the XSA-97 fix was that if the operation is privileged
enough, it is okay for any vCPU of the originating domain to continue
the current one (including the non-determinism of which of them will
see the final successful completion of the hypercall, should more than
one of them race). I think you ought to follow that model here and
store/check the domain rather than the vCPU, in which case I don't
think you'll need any extra bit(s).
>>> + goto cont_wait;
>>> + }
>>> +
>>> + for_each_online_cpu ( i )
>>> + {
>>> + if ( i == mycpu )
>>> + continue;
>>> +
>>> + per_cpu(sync_task, i) = xmalloc(struct tasklet);
>>> + if ( per_cpu(sync_task, i) == NULL )
>>> + {
>>> + printk(XENLOG_WARNING "vpmu_force_context_switch: out of
>>> memory\n");
>>> + ret = -ENOMEM;
>>> + goto out;
>>> + }
>>> + tasklet_init(per_cpu(sync_task, i), vpmu_sched_checkin, 0);
>>> + }
>>> +
>>> + /* First count is for self */
>>> + atomic_set(&vpmu_sched_counter, 1);
>>> +
>>> + for_each_online_cpu ( i )
>>> + {
>>> + if ( i != mycpu )
>>> + tasklet_schedule_on_cpu(per_cpu(sync_task, i), i);
>>> + }
>>> +
>>> + vpmu_save(current);
>>> +
>>> + sync_vcpu = curr_vcpu;
>>> + start = NOW();
>>> +
>>> + cont_wait:
>>> + /*
>>> + * Note that we may fail here if a CPU is hot-plugged while we are
>>> + * waiting. We will then time out.
>>> + */
>> And I continue to miss the handling of the hot-unplug case (or at the
>> very least a note on this being unimplemented [and going to crash],
>> to at least clarify matters to the curious reader).
>
> Where would we crash? I have no interest in that.
per_cpu() accesses are invalid for offline CPUs.
>>> + while ( atomic_read(&vpmu_sched_counter) != numcpus )
>>> + {
>>> + s_time_t now;
>>> +
>>> + cpu_relax();
>>> +
>>> + now = NOW();
>>> +
>>> + /* Give up after (arbitrarily chosen) 5 seconds */
>>> + if ( now > start + SECONDS(5) )
>>> + {
>>> + printk(XENLOG_WARNING
>>> + "vpmu_force_context_switch: failed to sync\n");
>>> + ret = -EBUSY;
>>> + break;
>>> + }
>>> +
>>> + if ( hypercall_preempt_check() )
>>> + return hypercall_create_continuation(
>>> + __HYPERVISOR_xenpmu_op, "i", XENPMU_mode_set);
>> Did you test this code path? I don't see how with the missing second
>> hypercall argument the continuation could reliably succeed.
>
> I did test it (and retested it now) and it works. I guess it may be
> picking the same value from the stack during continuation which is why
> it does not fail.
Oh, right, hypercall argument clobbering (in debug builds) gets
skipped for continuations (and no clobbering is being done for
HVM/PVH at all). But I don't think you should rely on this, i.e. the
invocation above should get fixed in any event.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |