Re: [Xen-devel] [PATCH] xen/sched: rework credit2 run-queue allocation
On Thu, 2020-01-23 at 09:55 +0100, Juergen Gross wrote:
> Currently the memory for each run-queue of the credit2 scheduler is
> allocated at the scheduler's init function: for each cpu in the system
> a struct csched2_runqueue_data is being allocated, even if the current
> scheduler only handles one physical cpu or is configured to work with a
> single run-queue. As each struct contains 4 cpumasks this sums up to
> rather large memory sizes pretty fast.
>
Ok, I finally found the time to look at this... And I like it. :-)
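Just to put rough numbers on the "rather large memory sizes" point, with
my own back-of-the-envelope arithmetic (assuming a build with
NR_CPUS = 4096, i.e. sizeof(cpumask_t) = 512 bytes):

    4 cpumasks * 512 bytes    =   2 KiB per struct csched2_runqueue_data
    e.g. 256 cpus * 2 KiB     = 512 KiB for the masks alone

and all of that gets allocated even when everything ends up in a single
run-queue.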
> In fact this fixes a bug in credit2 related to run-queue handling:
> cpu_to_runqueue() will return the first free or matching run-queue,
> whichever is found first. So in case a cpu is removed from credit2
> this could result in e.g. run-queue 0 becoming free, so when another
> cpu is added it will in any case be assigned to that free run-queue,
> even if a matching run-queue would have been found later.
>
That's a good catch... Thanks!
So, I only have a request, and a question:
> --- a/xen/common/sched/credit2.c
> +++ b/xen/common/sched/credit2.c
> @@ -849,51 +822,71 @@ static inline bool same_core(unsigned int cpua, unsigned int cpub)
> cpu_to_core(cpua) == cpu_to_core(cpub);
> }
>
> -static unsigned int
> -cpu_to_runqueue(const struct csched2_private *prv, unsigned int cpu)
> +static struct csched2_runqueue_data *
> +cpu_add_to_runqueue(struct csched2_private *prv, unsigned int cpu)
> {
> - const struct csched2_runqueue_data *rqd;
> - unsigned int rqi;
> + struct csched2_runqueue_data *rqd, *rqd_new;
> + struct list_head *rqd_ins;
> + unsigned long flags;
> + int rqi = 0;
> + bool rqi_unused = false, rqd_valid = false;
> +
> + rqd_new = xzalloc(struct csched2_runqueue_data);
>
>
So, I'm not sure I see why it's better to allocate this here, and then
free it if we didn't need it, instead of allocating it later, only if
we actually need it... What am I missing? :-)
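To make the question more concrete, this is the pattern I would have
naively expected (just a sketch of the alternative I have in mind, not
code from the patch; error handling mostly omitted). I do see that it
would mean calling xzalloc() with prv->lock held and interrupts off, so
maybe that is exactly the reason:

    write_lock_irqsave(&prv->lock, flags);

    /* ... scan prv->rql for a matching runqueue, as the patch does ... */

    if ( !rqd_valid )
    {
        /* Allocate only now that we know we need a new runqueue... */
        rqd_new = xzalloc(struct csched2_runqueue_data);
        if ( !rqd_new )
        {
            write_unlock_irqrestore(&prv->lock, flags);
            return NULL;
        }
        /* ...but note this now happens under the lock, with irqs off. */
    }

    write_unlock_irqrestore(&prv->lock, flags);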
> - for ( rqi = 0; rqi < nr_cpu_ids; rqi++ )
> + write_lock_irqsave(&prv->lock, flags);
> +
> + rqd_ins = &prv->rql;
> + list_for_each_entry ( rqd, &prv->rql, rql )
> {
> unsigned int peer_cpu;
>
> - /*
> - * As soon as we come across an uninitialized runqueue, use it.
> - * In fact, either:
> - * - we are initializing the first cpu, and we assign it to
> - * runqueue 0. This is handy, especially if we are dealing
> - * with the boot cpu (if credit2 is the default scheduler),
> - * as we would not be able to use cpu_to_socket() and similar
> - * helpers anyway (they're result of which is not reliable yet);
> - * - we have gone through all the active runqueues, and have not
> - * found anyone whose cpus' topology matches the one we are
> - * dealing with, so activating a new runqueue is what we want.
> - */
> - if ( prv->rqd[rqi].id == -1 )
> - break;
> + /* Remember first unused queue index. */
> + if ( !rqi_unused && rqd->id > rqi )
> + rqi_unused = true;
>
> - rqd = prv->rqd + rqi;
> - BUG_ON(cpumask_empty(&rqd->active));
> -
> - peer_cpu = cpumask_first(&rqd->active);
> + peer_cpu = rqd->pick_bias;
> BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
> cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
>
> - if (opt_runqueue == OPT_RUNQUEUE_CPU)
> - continue;
> if ( opt_runqueue == OPT_RUNQUEUE_ALL ||
> (opt_runqueue == OPT_RUNQUEUE_CORE &&
> same_core(peer_cpu, cpu)) ||
> (opt_runqueue == OPT_RUNQUEUE_SOCKET &&
> same_socket(peer_cpu, cpu)) ||
> (opt_runqueue == OPT_RUNQUEUE_NODE &&
> same_node(peer_cpu, cpu)) )
> + {
> + rqd_valid = true;
> break;
> + }
>
So, OPT_RUNQUEUE_CPU is just disappearing. If I understood the
algorithm correctly, that is because in such a case we just scan
through the whole list without finding any match, and then we'll
allocate a new runqueue (while, for any of the other options, we stop
as soon as we find a runqueue with a CPU inside it which matches the
criteria).
Can we add a comment about this? No need to describe the algorithm in
detail, I don't think... just a few words, especially about the fact
that the enum has a _CPU item that, at a first and quick look, we seem
to be ignoring here?
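Something as simple as the following, right above that if, would be
enough for me (wording just a suggestion, of course):

    /*
     * Note that there is no OPT_RUNQUEUE_CPU case here on purpose: with
     * per-cpu runqueues we never match any existing runqueue, so we fall
     * off the end of the list and a new runqueue is allocated for the cpu.
     */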
Thanks and Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)