[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Xen: credit2: avoid picking a spurious idle unit when caps are used


  • To: Dario Faggioli <dfaggioli@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 4 Aug 2021 09:37:29 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Eg2yYcuN9CG5ndvKCgGUYbie0vW/nD19AKOKL5s1sqk=; b=eFq0nuf1AsmsJ96Lk5daf6SRZIl44PiYHb1qX7xv/875hyvkHU5tArnPTYvFOtaEuRRrge7XDH+lKwISvsaVs89N57vRlCpIjMJvl27WT67ELL/39J91LgwHOcbNAm1FJqF98EP/JtEeTPYkQJCU/UofNX1suL8TvnQFrY8g+7MkqVp2fim4tj865q+zRqVeoxZJaVHO8mdOn2N1Pbgp6WyI15199RjJ0ytVGfB9OyxSsahlhLvJaIuqVHAstHOGDCksBLdo9kokyqxq38T9gEWl7gbaLoANJBqjR4HdpHpVsoGFKcZNAsvgIos07NQ4wrLewVVPvZXkEBY1pCkM0Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Ym5sJLLjEFPUkzTWRWtDsRx95xEJcXxPi80q2xaMd3X5kJNs/a6ta5u38dABVezXbU2YKBb6ek4oemPK9rkL8+j7QQtnu5FNNgYPilaDnBehesgxkfUba53cKhvPg+1hLl4lh5LDocU0ZOrAv4NkeFBNzAbUqZb8P7OX+WRXJgD48/I2z4KEKG8JQgsURbs5A2tRuMXE/NiGK0kLHc75dwexcQzgDvtJbIdPUtio32Qen1em2lIgJ7w3lK/AeDRekG1JpxxMRHNLqsgQMc3f3aPs2eWL3l3Kh9f854Rmmod9gNMyfJPAAT3UDwI3IYrcjr9SI2J7mjbAmT/3qeFQ+g==
  • Authentication-results: lists.xenproject.org; dkim=none (message not signed) header.d=none;lists.xenproject.org; dmarc=none action=none header.from=suse.com;
  • Cc: George Dunlap <george.dunlap@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 04 Aug 2021 07:37:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03.08.2021 19:36, Dario Faggioli wrote:
> Commit 07b0eb5d0ef0 ("credit2: make sure we pick a runnable unit from the
> runq if there is one") did not fix completely the problem of potentially
> selecting a scheduling unit that will then not be able to run.
> 
> In fact, in case caps are used and the unit we are currently looking
> at, during the runqueue scan, does not have budget to be executed, we
> should continue looking instead than giving up and picking the idle
> unit.
> 
> Signed-off-by: Dario Faggioli <dfaggioli@xxxxxxxx>
> Suggested-by: George Dunlap <george.dunlap@xxxxxxxxxx>

Minor remark: Generally I think the order of tags should follow the
timeline: Suggestions (or bug reports) come before patch creation,
which in turns comes before reviewing / acking of a patch.

> Cc: George Dunlap <george.dunlap@xxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>

Since George is on leave and since I was Cc-ed, I thought I'd make an
attempt at reviewing this. The more that ...

> ---
> This is necessary to completely fix the bug that was described in and
> addressed by 07b0eb5d0ef0 ("credit2: make sure we pick a runnable unit
> from the runq if there is one").
> 
> It should, therefore, be backported and applied to all the branches to
> which that commit has been. About backports, it should be
> straigthforward to do that until 4.13.

... for 4.13.4 it would of course be nice to have it in. Things look
plausible overall, but I've got one question which - despite concerning
code you only move - may play into the underlying issue.

> For 4.12 and earlier, it's trickier, but the fix is still necessary.
> Actually, both 07b0eb5d0ef0 and this patch should be backported to that
> branch!

Depends on what you target with this remark: For downstreams - yes. The
stable upstream branch, otoh, is out of general support, and since this
is not a security fix it is not going to be applied to that tree.

> --- a/xen/common/sched/credit2.c
> +++ b/xen/common/sched/credit2.c
> @@ -3463,48 +3463,61 @@ runq_candidate(struct csched2_runqueue_data *rqd,
>                          (unsigned char *)&d);
>          }
>  
> -        /* Skip non runnable units that we (temporarily) have in the runq */
> -        if ( unlikely(!unit_runnable_state(svc->unit)) )
> -            continue;
> -
> -        /* Only consider vcpus that are allowed to run on this processor. */
> -        if ( !cpumask_test_cpu(cpu, svc->unit->cpu_hard_affinity) )
> -            continue;
> -
>          /*
> -         * If an unit is meant to be picked up by another processor, and such
> -         * processor has not scheduled yet, leave it in the runqueue for him.
> +         * If the unit in the runqueue has more credit than current (or than
> +         * idle, if current is not runnable) or if current is yielding, we 
> may
> +         * want to pick it up.
>           */
> -        if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu &&
> -             cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) )
> +        if ( (yield || svc->credit > snext->credit) )

The "credit" field is plain "int", i.e. signed. Idle domain's vCPU-s
don't get INT_MIN credit afaict (there's only one use of INT_MIN
throughout the entire file). Hence I can't see why in principle a
vCPU of an ordinary domain couldn't have equal or less credit than
the CPU's idle vCPU. I therefore wonder whether "yield" shouldn't be
set to true whenever snext != scurr (or - see the top of the
function - is_idle_unit() returns true), to make sure this if()'s
body gets entered in such a case.

As a nit, there's no need for the inner parentheses (anymore).

>          {
> -            SCHED_STAT_CRANK(deferred_to_tickled_cpu);
> -            continue;
> -        }
> +            /* Skip non runnable units that we (temporarily) have in the 
> runq */
> +            if ( unlikely(!unit_runnable_state(svc->unit)) )
> +                continue;
>  
> -        /*
> -         * If this is on a different processor, don't pull it unless
> -         * its credit is at least CSCHED2_MIGRATE_RESIST higher.
> -         */
> -        if ( sched_unit_master(svc->unit) != cpu
> -             && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )
> -        {
> -            SCHED_STAT_CRANK(migrate_resisted);
> -            continue;
> -        }
> +            /* Only consider vcpus that are allowed to run on this 
> processor. */
> +            if ( !cpumask_test_cpu(cpu, svc->unit->cpu_hard_affinity) )
> +                continue;
>  
> -        /*
> -         * If the one in the runqueue has more credit than current (or idle,
> -         * if current is not runnable), or if current is yielding, and also
> -         * if the one in runqueue either is not capped, or is capped but has
> -         * some budget, then choose it.
> -         */
> -        if ( (yield || svc->credit > snext->credit) &&
> -             (!has_cap(svc) || unit_grab_budget(svc)) )
> -            snext = svc;
> +            /*
> +             * If an unit is meant to be picked up by another processor, and 
> such

Nit: As you move/re-indent (and hence touch) this, would you mind also
replacing "an" by "a"? I'm less certain about "such" and ...

> +             * processor has not scheduled yet, leave it in the runqueue for 
> him.

... "him", but to me "that" and "it" respectively would seem more
suitable.

> +             */
> +            if ( svc->tickled_cpu != -1 && svc->tickled_cpu != cpu &&
> +                 cpumask_test_cpu(svc->tickled_cpu, &rqd->tickled) )
> +            {
> +                SCHED_STAT_CRANK(deferred_to_tickled_cpu);
> +                continue;
> +            }
>  
> -        /* In any case, if we got this far, break. */
> -        break;
> +            /*
> +             * If this is on a different processor, don't pull it unless
> +             * its credit is at least CSCHED2_MIGRATE_RESIST higher.
> +             */
> +            if ( sched_unit_master(svc->unit) != cpu
> +                 && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )

Again, despite the code just getting moved/re-indented, please correct
style here ("&&" to be moved to the end of the earlier line) as you
touch the code.

Otoh I'm having trouble seeing why all of this code movement / re-
indentation is necessary in the first place: If the initial if() was
inverted to

        if ( !yield && svc->credit <= snext->credit )
            continue;

less code churn would result afaict, and then the backports likely also
would become less involved (plus my stylistic remarks might evaporate,
as the affected code may then remain untouched altogether).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.