[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 08/24] x86: refactor psr: set value: implement framework.



On 17-03-13 06:35:33, Jan Beulich wrote:
> >>> On 13.03.17 at 03:36, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > On 17-03-10 02:09:55, Jan Beulich wrote:
> >> >>> On 10.03.17 at 03:54, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> >> > On 17-03-08 09:07:10, Jan Beulich wrote:
> >> >> >>> On 15.02.17 at 09:49, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> >> >> > +    ref[old_cos]--;
> >> >> > +    spin_unlock(&info->ref_lock);
> >> >> > +
> >> >> > +    /*
> >> >> > +     * Step 6:
> >> >> > +     * Save the COS ID into current domain's psr_cos_ids[] so that 
> >> >> > we can know
> >> >> > +     * which COS the domain is using on the socket. One domain can 
> >> >> > only use
> >> >> > +     * one COS ID at same time on each socket.
> >> >> > +     */
> >> >> > +    d->arch.psr_cos_ids[socket] = cos;
> >> >> 
> >> >> So the domain has not been paused, i.e. some of its vCPU-s may
> >> >> be running on other pCPU-s (including ones on the socket in
> >> >> question). How come it is safe to update this value here?
> >> >> 
> >> > This is a domctl operation. It is protected by domctl_lock which is 
> >> > locked in
> >> > do_domctl().
> >> 
> >> But that lock doesn't keep the subject domain's vCPU-s from
> >> running on other pCPU-s at the same time.
> >> 
> > Yes, you are right. But only 'psr_ctxt_switch_to()' can access
> > psr_cos_ids[socket] when psr_cos_ids[socket] is being set. 
> > 'psr_ctxt_switch_to()'
> > read the cos and set it to ASSOC register. Context switch is short so that 
> > the
> > correct cos can be set to ASSOC register in a short time.
> 
> That's a reply you should never give: No matter how short a time
> window, eventually it'll be hit. A very good example of this is the
> VMCS race we've recently fixed (commit 2f4d2198a9), and which
> had been there for years until it was first observed (and after
> that it took another year or so until we've actually managed to
> figure out what's going on).
> 
Sorry for that. Let me explain in details.

There are three scenarios. E.g.
1. User calls domctl interface on Dom0 to set a COS ID 1 for Dom1 into its
   psr_cos_ids[]. Then, Dom1 is scheduled so that 'psr_ctxt_switch_to()' is
   called which makes COS ID 1 work. For this case, we do not any action.

2. Dom1 runs on CPU 1 and COS ID 1 is working. At same time, user calls domctl
   interface on Dom0 to set a new COS ID 2 for Dom1 into psr_cos_ids[]. After
   time slice ends, the Dom1 is scheduled again, the new COS ID 2 will work.
   For this case, we don't need any action either.

3. When a new COS ID is being set to psr_cos_ids[], 'psr_ctxt_switch_to()'
   is called to access the same psr_cos_ids[] member through 'psr_assoc_cos'.
   The COS ID is constrained by cos_mask so that it cannot exceeds the cos_max.
   So even the COS ID got here is wrong, it is still a workable ID (within
   cos_max). The functionality is still workable but of course the COS ID may
   not be the one that user intends to use.

If you think scenario 3 is not acceptable, I suggest to add read write lock as
below. How do you think? Thanks!

static void psr_assoc_cos()
{
    read_lock(&rwlock);
    *reg = (*reg & ~cos_mask) |
            (((uint64_t)cos << PSR_ASSOC_REG_SHIFT) & cos_mask);
    read_unlock(&rwlock);
}

int psr_set_val()
{
    ......
    write_lock(&rwlock);
    d->arch.psr_cos_ids[socket] = cos;
    write_unlock(&rwlock);
    ......
}

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.