Re: [PATCH] x86: make "dom0_nodes=" work with credit2
On 29.04.2022 16:09, Dario Faggioli wrote:
> On Fri, 2022-04-29 at 14:16 +0200, Jan Beulich wrote:
>> On 29.04.2022 12:52, Dario Faggioli wrote:
>>>> Is that mainly to have a way to record preferences even when all
>>>> preferred CPUs are offline, to be able to go back to the
>>>> preferences once CPUs come back online?
>>>>
>>> That's another example/use case, yes. We want to record the user's
>>> preference, whatever the status of the system (and of other aspects
>>> of the configuration) is.
>>>
>>> But I'm not really sure I've answered... Have I?
>>
>> You did.
>>
> Ok, great! :-)
>
>>>
>>> If dom0_nodes is in "strict" mode, we want to control hard affinity
>>> only. So we set soft to the default, which is "all". During
>>> operations, since hard is a subset of "all", soft-affinity will be
>>> just ignored.
>>
>> Right - until such point that all (original) Dom0 CPUs have gone
>> offline. Hence my 2nd question.
>>
>>> So I'm using "all" because soft-affinity is just "all", unless
>>> someone sets it differently.
>>
>> How would "someone set it differently"? Aiui you can't control both
>> affinities at the same time.
>>
> Yeah, the argument here is basically the one that I put below, and
> that you say you understand. I guess I could have put it a bit more
> upfront, sorry about that. :-)
>
>>>>
>>>> Hmm, you leave this alone. Wouldn't it be better to further
>>>> generalize things, in case domain affinity was set already? I was
>>>> referring to the mask calculated by sched_select_initial_cpu()
>>>> also in this regard. And when I did suggest to re-use the result,
>>>> I did mean this literally.
>>>>
>>> Technically, I think we can do that. Although, it's probably
>>> cumbersome to do, without adding at least one cpumask on the stack,
>>> or reshuffle the locking between sched_select_initial_cpu() and
>>> sched_init_vcpu(), in a way that I (personally) don't find
>>> particularly pretty.
>>
>> Locking? sched_select_initial_cpu() calculates into a per-CPU
>> variable, which I sincerely hope cannot be corrupted by another CPU.
>>
> No, not by another CPU, hopefully.
>
> And this is probably fine, during boot, when there should be no
> (other) scheduling activity. However, during normal operation, a vCPU
> being scheduled on CPU X, or in general having X in v->processor,
> could be using the scratch cpumask of X already. So, if we use it
> without locking, we'd risk using the wrong mask.
>
> Therefore, we require the scheduler lock to be held, for playing with
> the scratch cpumasks:
>
> /*
>  * Scratch space, for avoiding having too many cpumask_t on the stack.
>  * Within each scheduler, when using the scratch mask of one pCPU:
>  * - the pCPU must belong to the scheduler,
>  * - the caller must own the per-pCPU scheduler lock (a.k.a. runqueue
>  *   lock).
>  */
> DECLARE_PER_CPU(cpumask_t, cpumask_scratch);
> #define cpumask_scratch        (&this_cpu(cpumask_scratch))
> #define cpumask_scratch_cpu(c) (&per_cpu(cpumask_scratch, c))
>
> And sched_init_vcpu() (and hence sched_select_initial_cpu()) can be
> called during normal operation.
>
> In fact, sched_select_initial_cpu() does pcpu_schedule_lock_irqsave()
> before starting using it.
>
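To make the locking rule above a bit more concrete, here is a minimal
sketch of the pattern (pick_cpu_sketch() is a made-up name, and I'm
writing the unlock helper from memory, so take the details with a grain
of salt; it also assumes the scheduler-internal headers are in scope):

static unsigned int pick_cpu_sketch(unsigned int cpu, const cpumask_t *hard)
{
    unsigned long flags;
    unsigned int new_cpu;
    spinlock_t *lock = pcpu_schedule_lock_irqsave(cpu, &flags);

    /* Only while holding the lock may we use this pCPU's scratch mask. */
    cpumask_and(cpumask_scratch_cpu(cpu), hard, &cpu_online_map);
    new_cpu = cpumask_first(cpumask_scratch_cpu(cpu));

    pcpu_schedule_unlock_irqrestore(lock, flags, cpu);

    return new_cpu;
}

I.e. holding the per-pCPU scheduler lock is what guarantees that nothing
currently going through the scheduler on that CPU can be scribbling over
the same scratch space underneath us.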
>>
>>> And again, soft and hard affinity should be set to what the user
>>> wants and asks for. And if, for instance, he/she passes
>>> dom0_nodes="1,strict", soft-affinity should just be all. If, e.g.,
>>> we set both hard and soft affinity to the CPUs of node 1, and if
>>> later hard affinity is manually changed to "all", soft affinity
>>> will remain to node 1, even if it was never asked for it to be
>>> that way, and the user will need to change that explicitly as
>>> well. (Of course, it's not particularly clever to boot with
>>> dom0_nodes="1,strict" and then change dom0's vCPUs' hard affinity
>>> to node 0... but the user is free to do that.)
>>
>> I can certainly accept this as justification for using "all" further
>> up.
>>
> Good then.
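FWIW, the reason a soft affinity of "all" is harmless follows from the
intersection the schedulers do between the two masks. Roughly (again a
sketch with a made-up name, effective_affinity_sketch(); iirc the real
code is structured as a soft pass followed by a hard pass rather than
computing a single mask):

/*
 * Prefer CPUs that are in both hard and soft affinity, but fall back
 * to hard affinity alone when soft doesn't actually restrict anything
 * (e.g. when it is "all", or when it doesn't intersect hard at all).
 */
static void effective_affinity_sketch(cpumask_t *out, const cpumask_t *hard,
                                      const cpumask_t *soft)
{
    cpumask_and(out, hard, &cpu_online_map);

    if ( cpumask_intersects(out, soft) && !cpumask_subset(out, soft) )
        cpumask_and(out, out, soft);
}

So with dom0_nodes="1,strict" the hard mask already does the
restricting and soft being "all" never narrows things further, while a
leftover soft mask of node 1 would keep steering dom0's vCPUs there
even after hard affinity is widened, which is the situation described
above.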
I notice this issue is still open. May I ask what the plans are here?

Jan