Re: [Xen-devel] [PATCH] xen/sched: fix locking in sched_tick_[suspend|resume]()
- To: Jürgen Groß <jgross@xxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
- From: George Dunlap <george.dunlap@xxxxxxxxxx>
- Date: Fri, 4 Oct 2019 17:09:03 +0100
- Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>
- Delivery-date: Fri, 04 Oct 2019 16:09:10 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 10/4/19 4:40 PM, Jürgen Groß wrote:
> On 04.10.19 17:37, George Dunlap wrote:
>> On 10/4/19 4:03 PM, Jürgen Groß wrote:
>>> On 04.10.19 16:56, George Dunlap wrote:
>>>> On 10/4/19 3:43 PM, Jürgen Groß wrote:
>>>>> On 04.10.19 16:34, George Dunlap wrote:
>>>>>> On 10/4/19 3:24 PM, Jürgen Groß wrote:
>>>>>>> On 04.10.19 16:08, George Dunlap wrote:
>>>>>>>> On 10/4/19 7:40 AM, Juergen Gross wrote:
>>>>>>>>> sched_tick_suspend() and sched_tick_resume() should not call the
>>>>>>>>> scheduler specific timer handlers in case the cpu they are running on
>>>>>>>>> is just being moved to or from a cpupool.
>>>>>>>>>
>>>>>>>>> Use a new percpu lock for that purpose.
>>>>>>>>
>>>>>>>> Is there a reason we can't use the pcpu_schedule_lock() instead of
>>>>>>>> introducing a new one? Sorry if this is obvious, but it's been a while
>>>>>>>> since I poked around this code.
>>>>>>>
>>>>>>> Lock contention would be higher especially with credit2 which is using a
>>>>>>> per-core or even per-socket lock. We don't care about other scheduling
>>>>>>> activity here, all we need is a guard against our per-cpu scheduler
>>>>>>> data being changed beneath our feet.
>>>>>>
>>>>>> Is this code really being called so often that we need to worry about
>>>>>> this level of contention?
>>>>>
>>>>> It's called each time idle is entered and left again.
>>>>>
>>>>> Especially with core scheduling there is a high probability of multiple
>>>>> cpus leaving idle at the same time and the per-scheduler lock being used
>>>>> in parallel already.
>>>>
>>>> Hrm, that does sound pretty bad.
>>>>
>>>>>> We already have a *lot* of locks; and in this case you're adding a
>>>>>> second lock which interacts with the per-scheduler cpu lock. This just
>>>>>> seems like asking for trouble.
>>>>>
>>>>> In which way does it interact with the per-scheduler cpu lock?
>>>>>
>>>>>> I won't Nack the patch, but I don't think I would ack it without clear
>>>>>> evidence that the extra lock has a performance improvement that's worth
>>>>>> the cost of the extra complexity.
>>>>>
>>>>> I think complexity is lower this way. Especially considering the
>>>>> per-scheduler lock changing with moving a cpu to or from a cpupool.
>>>>
>>>> The key aspect of the per-scheduler lock is that once you hold it, the
>>>> pointer to the lock can't change.
>>>>
>>>> After this patch, the fact remains that sometimes you need to grab one
>>>> lock, sometimes the other, and sometimes both.
>>>>
>>>> And, tick_suspend() lives in the per-scheduler code. Each scheduler has
>>>> to remember that tick_suspend and tick_resume hold a completely
>>>> different lock to the rest of the scheduling functions.
>>>
>>> Is that really so critical? Today only credit1 has tick_suspend and
>>> tick_resume hooks, and both are really very simple. I can add a
>>> comment in sched-if.h if you like.
>>>
>>> And up to now there was no lock at all involved when calling them...
>>>
>>> If you think using the normal scheduler lock is to be preferred I'd
>>> be happy to change the patch. But I should mention I was already
>>> planning to revisit usage of the scheduler lock and replace it by the
>>> new per-cpu lock where appropriate (not sure I'd find any appropriate
>>> path for replacement).
>>
>> Well the really annoying thing here is that all the other schedulers --
>> in particular, credit2, which as you say, is designed to have multiple
>> runqueues share the same lock -- have to grab & release the lock just to
>> find out that there's nothing to do.
>>
>> And even credit1 doesn't do anything particularly clever -- all it does
>> is stop and start a timer based on a scheduler-global configuration. And
>> the scheduling lock is grabbed to switch to idle anyway. It seems like
>> we should be able to do something more sensible.
>
> Yeah, I thought the same.

I can think of a couple of options:

1. Have schedule.c call s->tick_* when switching to / from idle

2. Get rid of s->tick_*, and have sched_credit.c suspend / resume ticks
   when switching to / from idle in csched_schedule()

3. Have schedule.c suspend / resume ticks, and have an interface that
   allows schedulers to enable / disable them.

4. Rework sched_credit to be tickless.

 -George
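To make option 3 a bit more concrete, here is a rough user-space sketch in C. It is not Xen code and not a proposed patch: tick_enable(), tick_disable(), idle_enter(), idle_exit(), tick_is_running() and struct tick_state are all hypothetical names invented for this illustration, and the "running" flag merely stands in for starting and stopping a real timer. The sketch does no locking of its own; it assumes the caller already holds whatever lock protects the per-cpu scheduler data when switching to or from idle.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* arbitrary size for the illustration */

/* Hypothetical per-cpu tick state owned by the generic scheduling layer. */
struct tick_state {
    bool enabled;   /* the scheduler on this cpu asked for periodic ticks */
    bool running;   /* the tick is currently active (not suspended) */
};

static struct tick_state ticks[NR_CPUS];

/* Interface for schedulers: opt in to ticks on a cpu. */
void tick_enable(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    ticks[cpu].enabled = true;
    ticks[cpu].running = true;
}

/* Interface for schedulers: opt out of ticks on a cpu. */
void tick_disable(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    ticks[cpu].enabled = false;
    ticks[cpu].running = false;
}

/* Generic layer: suspend the tick when the cpu switches to idle. */
void idle_enter(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (ticks[cpu].enabled)
        ticks[cpu].running = false;
}

/* Generic layer: resume the tick when the cpu leaves idle. */
void idle_exit(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (ticks[cpu].enabled)
        ticks[cpu].running = true;
}

/* Helper for inspecting the state in this illustration. */
bool tick_is_running(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return ticks[cpu].running;
}
```

In this shape a scheduler that never calls tick_enable() costs only a flag check on the idle path, and no per-scheduler hook has to be called at all.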
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel