
Re: [PATCH 3/3] xen/sched: fix cpu hotplug



On 03.08.22 11:53, Jan Beulich wrote:
> On 02.08.2022 15:36, Juergen Gross wrote:
>> --- a/xen/common/sched/cpupool.c
>> +++ b/xen/common/sched/cpupool.c
>> @@ -419,6 +419,8 @@ static int cpupool_alloc_affin_masks(struct affinity_masks *masks)
>>           return 0;
>>
>>      free_cpumask_var(masks->hard);
>> +    memset(masks, 0, sizeof(*masks));
>
> FREE_CPUMASK_VAR()?

Oh, yes.


>> @@ -1031,10 +1041,23 @@ static int cf_check cpu_callback(
>>   {
>>       unsigned int cpu = (unsigned long)hcpu;
>>       int rc = 0;
>> +    static struct cpu_rm_data *mem;
>
> When you mentioned your plan, I was actually envisioning a slightly
> different model: instead of doing the allocation at CPU_DOWN_PREPARE,
> allocate a single instance during boot, which would never be freed.
> Did you consider that, and did it turn out worse? I guess the main
> obstacle would be figuring out an upper bound for sr->granularity, but
> of course schedule_cpu_rm_alloc(), besides the allocations, also
> does quite a bit of filling in of values, which can't be done up front.

With sched-gran=socket, sr->granularity can grow to above 100, so I'm
not sure we'd want to do that.


>>       switch ( action )
>>       {
>>       case CPU_DOWN_FAILED:
>> +        if ( system_state <= SYS_STATE_active )
>> +        {
>> +            if ( mem )
>> +            {
>> +                if ( memchr_inv(&mem->affinity, 0, sizeof(mem->affinity)) )
>> +                    cpupool_free_affin_masks(&mem->affinity);
>
> I don't think the conditional is really needed - it merely avoids two
> xfree(NULL) invocations at the expense of readability here. Plus -

Okay.

> wouldn't this better be part of ...
>
>> +                schedule_cpu_rm_free(mem, cpu);
>
> ... this anyway?

This would add a layering violation IMHO.


>> @@ -1042,12 +1065,32 @@ static int cf_check cpu_callback(
>>       case CPU_DOWN_PREPARE:
>>           /* Suspend/Resume don't change assignments of cpus to cpupools. */
>>           if ( system_state <= SYS_STATE_active )
>> +        {
>>               rc = cpupool_cpu_remove_prologue(cpu);
>> +            if ( !rc )
>> +            {
>> +                ASSERT(!mem);
>> +                mem = schedule_cpu_rm_alloc(cpu);
>> +                rc = mem ? cpupool_alloc_affin_masks(&mem->affinity) : -ENOMEM;
>
> Ah - here you actually want a non-boolean return value. No need to
> change that then in the earlier patch (albeit of course a change
> there could be easily accommodated here).
>
> Along the lines of the earlier comment, this 2nd allocation may also
> want to move into schedule_cpu_rm_alloc(). If other users of the
> function don't need the extra allocations, perhaps by adding a bool
> parameter.

I could do that, but I still think this would pull cpupool-specific needs
into sched/core.c.


Juergen
