Re: [Xen-devel] [PATCH v2 21/48] xen/sched: use sched_resource cpu instead smp_processor_id in schedulers

On 09.09.19 16:17, Jan Beulich wrote:
On 09.08.2019 16:58, Juergen Gross wrote:
Especially in the do_schedule() functions of the different schedulers
using smp_processor_id() for the local cpu number is correct only if
the sched_unit is a single vcpu. As soon as larger sched_units are
used most uses should be replaced by the cpu number of the local
sched_resource instead.

I have to admit that I don't follow this argument, not the least because
(as I think I had indicated before) it is unclear to me what _the_ (i.e.
single) CPU for a sched unit is. I've gone back to patches 4 and 7
without finding what the conceptual model behind this is intended to be.
Besides an explanation I think one or both of those two patches also
want to be revisited wrt the use of the name "processor" for the
respective field.

Fair point. Naming it "master_cpu" in struct sched_resource and when
referencing it seems to be a good idea.
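
To make the intended model a bit more concrete, here is a minimal sketch of a resource that is represented by one "master" cpu. It is not the code from the series: the fixed grouping of cpus, the resources[] array and init_resources() are assumptions made up for illustration, and only the names sched_resource, master_cpu and sched_get_resource_cpu() come from this thread.

```c
#include <assert.h>

#define NR_CPUS 8            /* hypothetical: small fixed cpu count */
#define CPUS_PER_RESOURCE 2  /* hypothetical: e.g. two threads per core */

/* Simplified stand-in for struct sched_resource; not the Xen definition. */
struct sched_resource {
    unsigned int master_cpu; /* the cpu representing the whole resource */
};

static struct sched_resource resources[NR_CPUS / CPUS_PER_RESOURCE];

/* Hypothetical setup: the lowest-numbered cpu of each group is the master. */
static void init_resources(void)
{
    for ( unsigned int i = 0; i < NR_CPUS / CPUS_PER_RESOURCE; i++ )
        resources[i].master_cpu = i * CPUS_PER_RESOURCE;
}

/* Map any cpu to the master cpu of the resource it belongs to. */
static unsigned int sched_get_resource_cpu(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return resources[cpu / CPUS_PER_RESOURCE].master_cpu;
}
```

With one cpu per resource the mapping is the identity, which matches the single-vcpu case where smp_processor_id() is already correct.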

--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1684,7 +1684,7 @@ csched_load_balance(struct csched_private *prv, int cpu,
      int peer_cpu, first_cpu, peer_node, bstep;
      int node = cpu_to_node(cpu);
-    BUG_ON( cpu != sched_unit_cpu(snext->unit) );
+    BUG_ON( sched_get_resource_cpu(cpu) != sched_unit_cpu(snext->unit) );

In cases like this one, would you mind dropping the stray blanks
immediately inside the parentheses?

Will do.

@@ -1825,8 +1825,9 @@ static struct task_slice
      const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
-    const int cpu = smp_processor_id();
-    struct list_head * const runq = RUNQ(cpu);
+    const unsigned int cpu = smp_processor_id();
+    const unsigned int sched_cpu = sched_get_resource_cpu(cpu);
+    struct list_head * const runq = RUNQ(sched_cpu);

By retaining a local variable named "cpu" you make it close to
impossible to notice, during a re-base, an addition to the
function still referencing a variable of this name. Similarly
review is being made harder because one needs to go hunt all
the remaining uses of "cpu". For example there is a trace entry
being generated, and it's not obvious to me whether this wouldn't
better also use sched_cpu.

Okay, I'll rename "cpu" to "my_cpu".

I used cpu in the trace entry on purpose, as it might be interesting to
see on which cpu the entry has been produced.
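
A rough sketch of that split, with the run queue selected by the resource's cpu and the trace entry recording the physical cpu. The struct names, the fixed grouping of cpus and do_schedule_sketch() are purely hypothetical and only illustrate the idea:

```c
#include <assert.h>

#define NR_CPUS 8            /* hypothetical: small fixed cpu count */
#define CPUS_PER_RESOURCE 2  /* hypothetical: e.g. two threads per core */

/* Hypothetical trace record: remembers where the entry was produced. */
struct trace_rec {
    unsigned int cpu;
};

/* Hypothetical result: which run queue was used, plus the trace entry. */
struct sched_decision {
    unsigned int runq_idx;
    struct trace_rec trace;
};

static struct sched_decision do_schedule_sketch(unsigned int my_cpu)
{
    assert(my_cpu < NR_CPUS);

    /* Assumption: the lowest-numbered cpu of a group is its master. */
    const unsigned int sched_cpu = my_cpu - my_cpu % CPUS_PER_RESOURCE;

    struct sched_decision d = {
        .runq_idx = sched_cpu,      /* scheduling state is per resource */
        .trace = { .cpu = my_cpu }, /* tracing keeps the local cpu */
    };

    return d;
}
```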

@@ -1967,7 +1968,7 @@ csched_schedule(
      if ( snext->pri > CSCHED_PRI_TS_OVER )
-        snext = csched_load_balance(prv, cpu, snext, &ret.migrated);
+        snext = csched_load_balance(prv, sched_cpu, snext, &ret.migrated);

And in a case like this one I wonder whether passing a "sort of
CPU" isn't sufficiently confusing, compared to e.g. simply
passing the corresponding unit.

I guess you mean sched_resource.

I don't think changing the parameter type is a good idea. We need both
(resource and cpu number) on caller and callee side, but the main
object csched_load_balance() is working on is the cpu number.

@@ -1975,12 +1976,12 @@ csched_schedule(
      if ( !tasklet_work_scheduled && snext->pri == CSCHED_PRI_IDLE )
-        if ( !cpumask_test_cpu(cpu, prv->idlers) )
-            cpumask_set_cpu(cpu, prv->idlers);
+        if ( !cpumask_test_cpu(sched_cpu, prv->idlers) )
+            cpumask_set_cpu(sched_cpu, prv->idlers);
-    else if ( cpumask_test_cpu(cpu, prv->idlers) )
+    else if ( cpumask_test_cpu(sched_cpu, prv->idlers) )
-        cpumask_clear_cpu(cpu, prv->idlers);
+        cpumask_clear_cpu(sched_cpu, prv->idlers);

And this looks to be a pretty gross abuse of CPU masks then.
(Nevertheless I can see that using a CPU as a vehicle here is
helpful to limit the scope of the already long series, but I
think it needs to be made much more apparent what is meant.)

I don't think it is an abuse. Think of it as a cpumask where only
the bits related to the resource's master_cpus can be set.
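
As an illustration of that reading, here is a small sketch in which the idlers mask is only ever touched via the master cpu of a resource. A plain unsigned long stands in for cpumask_t, and the helper names and the fixed grouping of cpus are assumptions for this example, not code from the patch:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8            /* hypothetical: small fixed cpu count */
#define CPUS_PER_RESOURCE 2  /* hypothetical: e.g. two threads per core */

typedef unsigned long idlers_mask_t; /* stand-in for cpumask_t */

/* Assumption: the lowest-numbered cpu of a group is its master. */
static unsigned int master_cpu_of(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu - cpu % CPUS_PER_RESOURCE;
}

/* Set the bit of the resource's master cpu, whichever cpu calls this. */
static void mark_idle(idlers_mask_t *idlers, unsigned int cpu)
{
    *idlers |= 1UL << master_cpu_of(cpu);
}

/* Clear the bit of the resource's master cpu. */
static void mark_busy(idlers_mask_t *idlers, unsigned int cpu)
{
    *idlers &= ~(1UL << master_cpu_of(cpu));
}

/* A resource is idle if the bit of its master cpu is set. */
static bool is_idle(const idlers_mask_t *idlers, unsigned int cpu)
{
    return (*idlers & (1UL << master_cpu_of(cpu))) != 0;
}
```

Bits of non-master cpus are never set here, so the mask effectively describes idle resources while keeping the existing cpumask type.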

--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -112,7 +112,7 @@ static struct task_slice sched_idle_schedule(
      const unsigned int cpu = smp_processor_id();
      struct task_slice ret = { .time = -1 };
-    ret.task = sched_idle_unit(cpu);
+    ret.task = sched_idle_unit(sched_get_resource_cpu(cpu));

Shouldn't sched_idle_unit(cpu) == sched_idle_unit(sched_get_resource_cpu(cpu))



