[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] dom_ids array implementation.



On 17-04-26 04:04:15, Jan Beulich wrote:
> >>> On 20.04.17 at 07:38, <yi.y.sun@xxxxxxxxxxxxxxx> wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -125,6 +125,8 @@ struct feat_node {
> >      uint32_t cos_reg_val[MAX_COS_REG_CNT];
> >  };
> >  
> > +#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / sizeof(uint32_t))
> 
> Instead of this, please use ...
> 
> > @@ -134,9 +136,11 @@ struct feat_node {
> >   *             COS ID. Every entry of cos_ref corresponds to one COS ID.
> >   */
> >  struct psr_socket_info {
> > -    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
> >      spinlock_t ref_lock;
> > +    spinlock_t dom_ids_lock;
> > +    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
> >      unsigned int cos_ref[MAX_COS_REG_CNT];
> > +    uint32_t dom_ids[PSR_DOM_IDS_NUM];
> 
> ... DECLARE_BITMAP() here.
> 
> Also please try to space apart the two locks, to avoid false cacheline
> conflicts (e.g. the new lock may well go immediately before the array
> it pairs with).
> 
Got it, thanks a lot for the suggestions!

> > @@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
> >       */
> >      for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> >      {
> > -        if ( !info->features[i] )
> > -            continue;
> > -
> >          xfree(info->features[i]);
> >          info->features[i] = NULL;
> >      }
> > +
> > +    spin_lock(&info->ref_lock);
> > +    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
> > +    spin_unlock(&info->ref_lock);
> > +
> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> > +    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
> 
> bitmap_clear()
> 
> I'm also not convinced you need to acquire either of the two locks
> here - you're cleaning up the socket after all, so nothing can be
> running on it anymore.
> 
Can domain destroy happens at the same time when a socket is offline?

> > @@ -682,9 +676,37 @@ void psr_ctxt_switch_to(struct domain *d)
> >          psr_assoc_rmid(&reg, d->arch.psr_rmid);
> >  
> >      if ( psra->cos_mask )
> > -        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
> > -                      
> > d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
> > -                      0, psra->cos_mask);
> > +    {
> > +        unsigned int socket = cpu_to_socket(smp_processor_id());
> > +        struct psr_socket_info *info = socket_info + socket;
> > +
> > +        if ( test_bit(d->domain_id, info->dom_ids) )
> 
> likely()
> 
Ok, thanks!

> > +            goto set_assoc;
> 
> I'm not convinced "goto" is reasonable to use here - this is not an
> error path. If you're afraid of the extra indentation level, make a
> helper function.
> 
Then, it seems a helper function is needed.

> > +        spin_lock(&info->dom_ids_lock);
> > +
> > +        int old_bit = test_and_set_bit(d->domain_id, info->dom_ids);
> 
> Please don't mix declarations and statements. Also bool please,
> but then again the variable isn't really needed anyway.
> 
Got it. Thanks!

> > +        /*
> > +         * If old_bit is 0, that means this is the first time the domain is
> > +         * switched to this socket or domain's COS ID has not been set 
> > since
> > +         * the socket is online. So, the domain's COS ID on this socket 
> > should
> > +         * be default value, 0. If not, that means this socket has been 
> > offline
> > +         * and the domain's COS ID has been set when the socket was 
> > online. So,
> > +         * this COS ID is invalid and we have to restore it to 0.
> > +         */
> > +        if ( d->arch.psr_cos_ids &&
> > +             old_bit == 0 &&
> > +             d->arch.psr_cos_ids[socket] != 0 )
> 
> Why don't you replicate the other two conditions in the if() trying to
> avoid taking the lock? (Especially if above you indeed intend to use
> a helper function, abstracting the full condition into another one
> would be very desirable.)
> 
Ok, will move the two conditions to above 'if()', like below.

if ( likely(test_bit(d->domain_id, info->dom_ids)) ||
     !d->arch.psr_cos_ids ||
     !d->arch.psr_cos_ids[socket] )

Accordingly, the later codes should be:

spin_lock(&info->dom_ids_lock);
set_bit(d->domain_id, info->dom_ids);
d->arch.psr_cos_ids[socket] = 0;
spin_unlock(&info->dom_ids_lock);

> > +            d->arch.psr_cos_ids[socket] = 0;
> > +
> > +        spin_unlock(&info->dom_ids_lock);
> 
> And then, as a whole: As indicated before, ideally you'd keep this
> out of the context switch path altogether. What are the alternatives?
> 
To restore domains' "psr_cos_ids[socket]" to default when socket offline
happens, we have three time windows:
1. When socket is offline, in "free_socket_resources()";
2. When socket is online, in "psr_cpu_init()";
3. When context switch happens, in "psr_ctxt_switch_to()".

Option 1 and 2 have same effect and option 1 is more natural than 2. So, we can
do this restore action at "1" or "3".

I have two alternatives below. Please help to check which you think is better:
1. The first version of the patch iterates valid domain list to restore them one
by one. Per your comments, it may take much time. That is the reason I submitted
this patch to spread out the restore action of all domains. If you think
"psr_cos_ids[socket]" restore action happens in context switch path is not good,
can we use a tasklet in "free_socket_resources()" to iterate the domain list and
restore their "psr_cos_ids"?

2. Or, can we use a tasklet in "psr_ctxt_switch_to()" to do above work? The side
effect is that the domain's COS ID used in this switch is not right. The valid
COS ID may be set in next context switch.

> > @@ -1310,7 +1283,10 @@ int psr_set_val(struct domain *d, unsigned int 
> > socket,
> >       * which COS the domain is using on the socket. One domain can only use
> >       * one COS ID at same time on each socket.
> >       */
> > +    spin_lock_irqsave(&info->dom_ids_lock, flag);
> >      d->arch.psr_cos_ids[socket] = cos;
> > +    test_and_set_bit(d->domain_id, info->dom_ids);
> 
> Why test_and_ when you don't use the result?
> 
Sorry, set_bit() should be enough here.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.