
Re: [Xen-devel] [PATCH v15 11/21] x86/VPMU: Interface for setting PMU mode and flags



>>> On 26.11.14 at 15:26, <boris.ostrovsky@xxxxxxxxxx> wrote:

> On 11/25/2014 08:36 AM, Jan Beulich wrote:
>>
>>> +static long vpmu_sched_checkin(void *arg)
>>> +{
>>> +    int cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>> unsigned int.
>>
>>> +    int ret;
>>> +
>>> +    /* Mode change shouldn't take more than a few (say, 5) seconds. */
>>> +    if ( NOW() > vpmu_start_ctxt_switch + SECONDS(5) )
>>> +    {
>>> +        ret = -EBUSY;
>>> +        goto fail;
>>> +    }
>> So what does this guard against? Holding xenpmu_mode_lock for
>> 5 seconds is not a problem, plus I don't think you actually need a
>> lock at all. Just using a global, atomically updated flag ought to be
>> sufficient (the way you use the lock is really nothing else afaict).
> 
> I didn't put it here because of the lock. I just wanted to terminate
> this operation (mode change) if it takes an unreasonable amount of time,
> and I thought anything beyond 5 seconds would be unreasonable.

But the question you need to ask yourself is _why_ it could be
taking that long. Since all you do is repeatedly schedule a
tasklet, it taking arbitrarily long can only be the effect of (a) a
huge system (extremely many CPUs) or (b) a hypervisor bug.
Neither of which justifies an arbitrary timeout like the one you
put in.
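
As for replacing the lock - something as simple as this is all I had in
mind (sketch only, with a made-up flag name, not meant as the final
code):

static unsigned long vpmu_mode_change_pending;

static int vpmu_begin_mode_change(void)
{
    /* test_and_set_bit() already serialises concurrent callers, so
     * neither a spinlock nor a time-based bail-out is needed for that. */
    if ( test_and_set_bit(0, &vpmu_mode_change_pending) )
        return -EBUSY;

    return 0;
}

static void vpmu_end_mode_change(void)
{
    clear_bit(0, &vpmu_mode_change_pending);
}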

>>> +{
>>> +    int ret;
>>> +
>>> +    vpmu_start_ctxt_switch = NOW();
>>> +
>>> +    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
>>> +                                    vpmu_sched_checkin, (void *)old_mode);
>>> +    if ( ret )
>>> +        vpmu_start_ctxt_switch = 0;
>>> +
>>> +    return ret;
>>> +}
>> I don't think it is guaranteed (independent of implementation details
>> of continue_hypercall_on_cpu()) that this way you will have gone
>> through the context switch path once on each CPU if
>> cpumask_first(&cpu_online_map) == smp_processor_id() here. In
>> particular there looks to be a problem calling vcpu_sleep_sync()
>> in the tasklet handler when v == current (which would be the case
>> if you started on the "correct" CPU and the tasklet softirq got
>> handled before the scheduler one). I think you instead want to use
>> cpumask_cycle() here and above, and finish the whole operation
>> once you're back on the CPU you started on (the single-pCPU case
>> may need some extra consideration).
> 
> So your concern is only about the initiating CPU? I can do a
> force_save() on it, so I don't really need a context switch here. This
> will take care of the single-pCPU case too.

Good.
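
Just to be explicit about the walking I'm thinking of - a rough sketch
only (vpmu_first_cpu is a made-up variable recording the CPU the
operation started on; the initiating CPU would then get its state saved
directly via the force_save() you mention rather than through a context
switch):

static unsigned int vpmu_first_cpu;

static long vpmu_sched_checkin(void *arg)
{
    unsigned int cpu = cpumask_cycle(smp_processor_id(), &cpu_online_map);

    /* Once we've cycled through all online CPUs we're back on the CPU
     * the operation started on, and the walk is complete. */
    if ( cpu == vpmu_first_cpu )
        return 0;

    return continue_hypercall_on_cpu(cpu, vpmu_sched_checkin, arg);
}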

>> I realize that you simply follow what microcode_update() does, but
>> I think we should rather correct that one than copy the latent issue
>> it has elsewhere.
>>
>>> +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
>>> +{
>>> +    int ret;
>>> +    struct xen_pmu_params pmu_params;
>>> +
>>> +    ret = xsm_pmu_op(XSM_OTHER, current->domain, op);
>>> +    if ( ret )
>>> +        return ret;
>>> +
>>> +    switch ( op )
>>> +    {
>>> +    case XENPMU_mode_set:
>>> +    {
>>> +        uint64_t old_mode;
>> Irrespective of the earlier comments I don't see why this isn't just
>> unsigned int (or typeof(vpmu_mode)).
> 
> This was because vpmu_force_context_switch() takes uint64_t as an 
> argument. Now that it will become unsigned long this should too, no? 
> Yes, the compiler will promote it if it is an int but I thought this 
> would look cleaner.

Imo "cleaner" is when the variable is typed according to the value(s)
it is to hold, not according to the parameter types of the functions
it's going to be passed to. And since operations on 32-bit types
(statistically) produce slightly smaller code, one could also argue
for using the smaller of the widest data source and the widest
consumer.
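
Purely as an illustration of what I mean (made-up declarations,
assuming vpmu_mode stays 32 bits wide and the helper ends up taking
unsigned long):

extern unsigned int vpmu_mode;                        /* data source */
extern long vpmu_force_context_switch(unsigned long); /* consumer */

static long mode_set_example(unsigned int new_mode)
{
    unsigned int old_mode = vpmu_mode; /* typed after the value it holds */

    vpmu_mode = new_mode;

    /* The implicit promotion at the call site covers the wider
     * parameter type just fine. */
    return vpmu_force_context_switch(old_mode);
}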

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

