
Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus


  • To: Dario Faggioli <dfaggioli@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Wed, 17 Jul 2019 17:04:22 +0100
  • Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 17 Jul 2019 16:04:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 8/25/18 1:21 AM, Dario Faggioli wrote:
> If a vCPU has been/is being offlined, we want it to be neither
> assigned to any pCPU nor in the wait list.
> 
> Therefore, when we detect that a vcpu is going offline, remove it from
> both places.

Hmm, this commit message wasn't very informative.

It looks like what you really mean to do is:

> Signed-off-by: Dario Faggioli <dfaggioli@xxxxxxxx>
> ---
> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> ---
>  xen/common/sched_null.c |   43 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
> index 1426124525..6259f4643e 100644
> --- a/xen/common/sched_null.c
> +++ b/xen/common/sched_null.c
> @@ -361,7 +361,8 @@ static void vcpu_assign(struct null_private *prv, struct vcpu *v,
>      }
>  }
>  
> -static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
> +/* Returns true if a cpu was tickled */
> +static bool vcpu_deassign(struct null_private *prv, struct vcpu *v)
>  {
>      unsigned int bs;
>      unsigned int cpu = v->processor;
> @@ -406,11 +407,13 @@ static void vcpu_deassign(struct null_private *prv, struct vcpu *v)
>                  vcpu_assign(prv, wvc->vcpu, cpu);
>                  cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>                  spin_unlock(&prv->waitq_lock);
> -                return;
> +                return true;
>              }
>          }
>      }
>      spin_unlock(&prv->waitq_lock);
> +
> +    return false;
>  }
>  
>  /* Change the scheduler of cpu to us (null). */
> @@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
>  
>      lock = vcpu_schedule_lock_irq(v);
>  
> +    /* If offline, the vcpu shouldn't be assigned, nor in the waitqueue */
> +    if ( unlikely(!is_vcpu_online(v)) )
> +    {
> +        ASSERT(per_cpu(npc, v->processor).vcpu != v);
> +        ASSERT(list_empty(&nvc->waitq_elem));
> +        goto out;
> +    }

* Handle the case of an offline vcpu being removed (ASSERTing that it's
neither on a processor nor on the waitqueue)

But wait, isn't this fixing an important regression introduced in patch
2?  If after patch 2 but before patch 3, a VM is created with offline
vcpus, and then destroyed, won't the offline vcpus reach here neither on
the waitqueue nor assigned to a pcpu?

Offlining/onlining vcpus is one thing; but creating and destroying
guests is something different.
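
To make the scenario concrete, here is a tiny standalone model of the
three states a vcpu can be in under this scheme (toy code, not Xen; all
the names below are invented for illustration).  The point is just that
a vcpu created offline and then destroyed ends up in the "neither"
state, which is exactly what your new ASSERTs codify:

    #include <stdio.h>

    /* The three states a null-scheduler vcpu can be in. */
    enum toy_state {
        TOY_ASSIGNED,   /* per_cpu(npc, cpu).vcpu == v in the real code   */
        TOY_WAITQUEUE,  /* !list_empty(&nvc->waitq_elem) in the real code */
        TOY_NEITHER,    /* offline: on neither structure                  */
    };

    /* What the remove hook has to cope with in each state. */
    static void toy_remove(enum toy_state st)
    {
        switch ( st )
        {
        case TOY_WAITQUEUE:
            printf("remove: just drop it from the waitqueue\n");
            break;
        case TOY_ASSIGNED:
            printf("remove: deassign it from its pCPU\n");
            break;
        case TOY_NEITHER:
            printf("remove: nothing to undo\n");
            break;
        }
    }

    int main(void)
    {
        /* A vcpu created offline and then destroyed never leaves this state. */
        toy_remove(TOY_NEITHER);
        return 0;
    }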

>      /* If v is in waitqueue, just get it out of there and bail */
>      if ( unlikely(!list_empty(&nvc->waitq_elem)) )
>      {
> @@ -567,11 +578,31 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
>  
>  static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
>  {
> +    struct null_private *prv = null_priv(ops);
> +    unsigned int cpu = v->processor;
> +    bool tickled = false;
> +
>      ASSERT(!is_idle_vcpu(v));
>  
> +    /* We need to special case the handling of the vcpu being offlined */
> +    if ( unlikely(!is_vcpu_online(v)) )
> +    {
> +        struct null_vcpu *nvc = null_vcpu(v);
> +
> +        printk("YYY d%dv%d going down?\n", v->domain->domain_id, v->vcpu_id);
> +        if ( unlikely(!list_empty(&nvc->waitq_elem)) )
> +        {
> +            spin_lock(&prv->waitq_lock);
> +            list_del_init(&nvc->waitq_elem);
> +            spin_unlock(&prv->waitq_lock);
> +        }
> +        else if ( per_cpu(npc, cpu).vcpu == v )
> +            tickled = vcpu_deassign(prv, v);
> +    }

* Handle the unexpected(?) case of a vcpu being put to sleep as(?) it's
offlined

If it's not unexpected, then why the printk?

And if it is unexpected, what is the expected path for a vcpu going
offline to be de-assigned from its pcpu (which is what the title seems
to imply this patch is about)?
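
FWIW, my reading is that this sleep hook is meant to be exactly that
path.  Restated as standalone toy code (not Xen code; all names below
are invented), the logic in the hunk above amounts to:

    #include <stdbool.h>
    #include <stdio.h>

    struct toy_vcpu {
        bool online;        /* is_vcpu_online(v)             */
        bool on_waitqueue;  /* !list_empty(&nvc->waitq_elem) */
        bool assigned;      /* per_cpu(npc, cpu).vcpu == v   */
        bool running;       /* curr_on_cpu(cpu) == v         */
    };

    static bool toy_deassign(struct toy_vcpu *v)
    {
        v->assigned = false;
        /* In the real code, a waiter may get assigned here, in which
         * case the pCPU is tickled and true is returned. */
        return false;
    }

    static void toy_sleep(struct toy_vcpu *v)
    {
        bool tickled = false;

        if ( !v->online )
        {
            if ( v->on_waitqueue )
                v->on_waitqueue = false;           /* drop it from the waitqueue */
            else if ( v->assigned )
                tickled = toy_deassign(v);
        }

        /* Don't poke the pCPU again if deassigning already tickled one. */
        if ( !tickled && v->running )
            printf("raise SCHEDULE_SOFTIRQ\n");
    }

    int main(void)
    {
        struct toy_vcpu v = { .online = false, .assigned = true, .running = true };
        toy_sleep(&v);
        return 0;
    }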

> +
>      /* If v is not assigned to a pCPU, or is not running, no need to bother */
> -    if ( curr_on_cpu(v->processor) == v )
> -        cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
> +    if ( likely(!tickled && curr_on_cpu(cpu) == v) )
> +        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
>  
>      SCHED_STAT_CRANK(vcpu_sleep);
>  }
> @@ -615,12 +646,12 @@ static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
>       *
>       * In the latter, there is just nothing to do.
>       */
> -    if ( likely(list_empty(&nvc->waitq_elem)) )
> +    if ( likely(per_cpu(npc, v->processor).vcpu == v) )
>      {
>          vcpu_deassign(prv, v);
>          SCHED_STAT_CRANK(migrate_running);
>      }
> -    else
> +    else if ( !list_empty(&nvc->waitq_elem) )
>          SCHED_STAT_CRANK(migrate_on_runq);

* Teach null_vcpu_migrate() that !on_waitqueue no longer implies
assigned-to-a-pcpu.

It looks like the comment just above this hunk is now out of date:

"v is either assigned to a pCPU, or in the waitqueue."

 -George


 

