[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

On Mon, May 11, 2015 at 8:11 AM, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
On Thu, 2015-05-07 at 12:05 -0500, Chong Li wrote:
> Add two hypercalls(XEN_DOMCTL_SCHEDOP_getvcpuinfo/putvcpuinfo) to get/set a domain's
> per-VCPU parameters. Hypercalls are handled in function rt_dom_cntl.
And that is because, right now, only code in sched_rt.c is able to deal
with per-vcpu parameters getting and setting.

That's of course true, but these two new hypercalls are, potentially,
generic, i.e., other schedulers may want to use them at some point. So,
why not just put them in good shape for that from the beginning?

To do so, you could with the new DOMCTLs in a similar way as
XEN_DOMCTL_SCHEDOP_{get,put}info are handled, and add a
new .adjust_vcpu(s?) hook in the scheduler interface.

> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 7c39a9e..9add5a4 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1085,6 +1085,9 @@ rt_dom_cntl(
>Â Â Â struct list_head *iter;
>Â Â Â unsigned long flags;
>Â Â Â int rc = 0;
> +Â Â xen_domctl_sched_rtds_params_t *local_sched;
> +Â Â int vcpu_index=0;
So, what's this vcpu_index intended meaning/usage?

The vcpu_index here equals vcpu_id. Â

> @@ -1110,6 +1113,67 @@ rt_dom_cntl(
>Â Â Â Â Â }
>Â Â Â Â Â spin_unlock_irqrestore(&prv->lock, flags);
>Â Â Â Â Â break;
> +Â Â case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
> +Â Â Â Â op->u.rtds.nr_vcpus = 0;
> +Â Â Â Â spin_lock_irqsave(&prv->lock, flags);
> +Â Â Â Â list_for_each( iter, &sdom->vcpu )
> +Â Â Â Â Â Â vcpu_index++;
> +Â Â Â Â spin_unlock_irqrestore(&prv->lock, flags);
This gives you the number of vcpus of sdom, doesn't it? It feels rather
nasty (especially the lock being dropped and taken again below!).

Aren't there other ways to get the same information that suits your
needs (e.g., d->max_vcpus)? If not, I think you should consider adding a
'nr_vcpu' field in rt_dom, exactly as Credit2 is doing in csched2_dom.

> +Â Â Â Â spin_lock_irqsave(&prv->lock, flags);
> +Â Â Â Â list_for_each( iter, &sdom->vcpu )
> +Â Â Â Â {
> +Â Â Â Â Â Â struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +
> +Â Â Â Â Â Â local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
> +Â Â Â Â Â Â local_sched[vcpu_index].period = svc->period / MICROSECS(1);
> +Â Â Â Â Â Â local_sched[vcpu_index].index = vcpu_index;
> +Â Â Â Â Â Â vcpu_index++;
And that's why I was asking about index. As Jan is pointing out already,
used like this, this index/vcpu_index is rather useless.

I mean, you're passing up nr_vcpus structs in an array in which
the .index field of the i-eth element is equal to i. How is this
important? The caller could well iterate, count, and retrieve the
position of each elements by itself!

What you probably are after, is the vcpu id, isn't it?

Yes, it is. Now I use vcpuid instead of vcpu_index. Â

> +Â Â Â Â }
> +Â Â Â Â spin_unlock_irqrestore(&prv->lock, flags);
> +Â Â Â Â copy_to_guest(op->u.rtds.vcpus, local_sched, vcpu_index);
I'm sure we want some checks about whether we are overflowing the
userspace provided buffer (and something similar below, for put). I
appreciate that you, in this patch series, are only calling this from
libxl, which properly dimension things, etc., but that can not always be
the case.

There are several examples in the code base on the route to take for
similar operations. For example, you can try to do some checks and only
fill as much elements as the buffer allows, or you can give a special
semantic to calling the hypercall with NULL/0 as parameters, i.e., use
that for asking Xen about the proper sizes, etc.

Have a look at how XEN_SYSCTL_numainfo and XEN_SYSCTL_cputopoinfo are
implemented (in Xen, but also in libxc and libxl, to properly understand

> +Â Â case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
> +Â Â Â Â local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> +Â Â Â Â Â Â Â Â op->u.rtds.nr_vcpus);
> +Â Â Â Â if( local_sched == NULL )
> +Â Â Â Â {
> +Â Â Â Â Â Â return -ENOMEM;
> +Â Â Â Â }
> +Â Â Â Â copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
> +
> +Â Â Â Â for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
> +Â Â Â Â {
> +Â Â Â Â Â Â vcpu_index = 0;
> +Â Â Â Â Â Â spin_lock_irqsave(&prv->lock, flags);
> +Â Â Â Â Â Â list_for_each( iter, &sdom->vcpu )
> +Â Â Â Â Â Â {
But why the nested loop? I think this is still that 'index' thing
causing problems. If you use vcpu numbers/ids, you can just use the
d->vcpu[] array, and get rid of the one of the for-s!

Look at, for instance, XEN_DOMCTL_{set,get}vcpuaffinity. You're after
something that is pretty similar (i.e., altering a per-vcpu property),
you just want to do it on more than one vcpu at a time.

> +Â Â Â Â Â Â Â Â struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +Â Â Â Â Â Â Â Â if ( local_sched[i].index == vcpu_index )
> +Â Â Â Â Â Â Â Â {
> +Â Â Â Â Â Â Â Â Â Â if ( local_sched[i].period <= 0 || local_sched[i].budget <= 0 )
> +Â Â Â Â Â Â Â Â Â Â Â Â Âreturn -EINVAL;
> +
> +Â Â Â Â Â Â Â Â Â Â svc->period = MICROSECS(local_sched[i].period);
> +Â Â Â Â Â Â Â Â Â Â svc->budget = MICROSECS(local_sched[i].budget);
> +Â Â Â Â Â Â Â Â Â Â break;
> +Â Â Â Â Â Â Â Â }
> +Â Â Â Â Â Â Â Â vcpu_index++;
> +Â Â Â Â Â Â }
> +Â Â Â Â Â Â spin_unlock_irqrestore(&prv->lock, flags);
Lock dropping and reacquiring again... This is less scary than above,
but doing it is pretty much always asking for troubles. Of course there
are legitimate use case of such a pattern, but I'd always ask myself 6
or 7 times whether I'm dealing with one of those before actually using
it. 99% of them, you aren't. :-P

Anyway, if you go with vcpu ids instead than with index, that is not a
problem, as there will be only one loop.

> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 10b51ef..490a6b6 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -342,6 +342,16 @@ struct xen_domctl_max_vcpus {
>Â typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>Â DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> +struct xen_domctl_sched_rtds_params {
> +Â Â /* vcpus' info */
> +Â Â uint64_t period;
> +Â Â uint64_t budget;
We settled for uint32_t while reviewing RTDS patches, and nothing
changed since then, as far as I can tell, so you should just use that as

> +Â Â uint16_t index;
> +Â Â uint16_t padding[3];
> +};
> +typedef struct xen_domctl_sched_rtds_params xen_domctl_sched_rtds_params_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_sched_rtds_params_t);
> +
>Â /* XEN_DOMCTL_scheduler_op */
>Â /* Scheduler types. */
> @@ -351,9 +361,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> -/* Set or get info? */
> +/* Set or get info */
>Â #define XEN_DOMCTL_SCHEDOP_putinfo 0
>Â #define XEN_DOMCTL_SCHEDOP_getinfo 1
> +#define XEN_DOMCTL_SCHEDOP_getvcpuinfo 2
> +#define XEN_DOMCTL_SCHEDOP_putvcpuinfo 3
> +
>Â struct xen_domctl_scheduler_op {
>Â Â Â uint32_t sched_id;Â /* XEN_SCHEDULER_* */
>Â Â Â uint32_t cmd;Â Â Â Â/* XEN_DOMCTL_SCHEDOP_* */
> @@ -373,8 +386,11 @@ struct xen_domctl_scheduler_op {
>Â Â Â Â Â Â Â uint16_t weight;
>Â Â Â Â Â } credit2;
>Â Â Â Â Â struct xen_domctl_sched_rtds {
> -Â Â Â Â Â Â uint32_t period;
> -Â Â Â Â Â Â uint32_t budget;
> +Â Â Â Â Â Â uint64_t period;
> +Â Â Â Â Â Â uint64_t budget;

> +Â Â Â Â Â Â XEN_GUEST_HANDLE_64(xen_domctl_sched_rtds_params_t) vcpus;
> +Â Â Â Â Â Â uint16_t nr_vcpus;
> +Â Â Â Â Â Â uint16_t padding[3];
>Â Â Â Â Â } rtds;
I don't like this. I find it both confusing and redundant as an

In fact, this way, we have both budget, period and an array of budget
and period. What is the meaning of this? What if one fills both the
budget and period members and a couple of array elements?

You should either adapt xen_domctl_sched_rtds to be able to deal with
both per-domain ad per-vcpu parameter getting/setting, but with less
redundancy and a more clear and better defined semantic, or introduce a
new member of the union u to with the array (and the number of elements)
in it (e.g., xen_domctl_sched_rtds_vcpus), or (and that's what I like
most) just go with something completely new, e.g.,

#define XEN_DOMCTL_SCHEDOP_getvcpuinfo 2
#define XEN_DOMCTL_SCHEDOP_putvcpuinfo 3
struct xen_domctl_scheduler_vcpu_op {
  uint32_t cmd;   Â/* XEN_DOMCTL_SCHEDOP_{get,put}vcpuinfo */
  union {
    struct xen_domctl_sched_rtds_vcpus {
      XEN_GUEST_HANDLE_64(xen_domctl_sched_rtds_params_t) vcpus
      uint16_t nr_vcpus;
      uint16_t padding[3];
    } rtds;
  } u;

Yes, I think so. These changes will appear in the next version of the patch.Â


Chong Li
Department of Computer Science and Engineering
Washington University in St.louis
Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.