Re: [PATCH] x86: make "dom0_nodes=" work with credit2
On 29.04.2022 12:52, Dario Faggioli wrote:
> On Wed, 2022-04-13 at 12:00 +0200, Jan Beulich wrote:
>> I also have a more general question here: sched.h says "Bitmask of CPUs
>> on which this VCPU may run" for hard affinity and "Bitmask of CPUs on
>> which this VCPU prefers to run" for soft affinity. Additionally there's
>> soft_aff_effective. Does it make sense in the first place for one to be
>> a proper subset of the other in _both_ directions?
>>
> I'm not sure I'm 100% getting what you're asking. In particular, I'm
> not sure what you mean with "for one to be a proper subset of the
> other in both directions"?
>
> Anyway, soft and hard affinity are under the complete control of the
> user (I guess we can say that they're policy), so we tend to accept
> pretty much everything that comes from the user.
>
> That is, the user can set a hard affinity of 1-6 and a soft affinity
> of (a) 2-3, (b) 0-2, (c) 7-12, etc.
>
> Case (a), i.e., soft is a strict subset of hard, is the one that makes
> the most sense, of course. With this configuration, the vCPU(s) can run
> on CPUs 1, 2, 3, 4, 5 and 6, but the scheduler will prefer to run it
> (them) on 2 and/or 3.
>
> Case (b), i.e., no strict subset, but there's some overlap, also means
> that soft-affinity is going to be considered and have an effect. In
> fact, vCPU(s) will prefer to run on CPUs 1 and/or 2, but of course it
> (they) will never run on CPU 0. Of course, the user can, at a later
> point in time, change the hard affinity so that it includes CPU 0, and
> we'll be back to the strict-subset case. So that's why we want to keep
> 0 in the mask, even if it causes soft to not be a strict subset of
> hard.
>
> In case (c), soft affinity is totally useless. However, again, the user
> can later change hard to include some or all of CPUs 7-12, so we keep
> it. We do, however, print a warning. And we also use the
> soft_aff_effective flag to avoid going through the soft-affinity
> balancing step in the scheduler code. This is, in fact, why we also
> check whether hard is not a strict subset of soft. As, if it is,
> there's no need to do anything about soft, as honoring hard will
> automatically take care of that as well.
>
>> Is that mainly
>> to have a way to record preferences even when all preferred CPUs are
>> offline, to be able to go back to the preferences once CPUs come back
>> online?
>>
> That's another example/use case, yes. We want to record the user's
> preference, whatever the status of the system (and of other aspects of
> the configuration) is.
>
> But I'm not really sure I've answered... Have I?

You did. My question really only was whether there are useful scenarios
for proper-subset cases in both possible directions.

>> Then a follow-on question is: Why do you use cpumask_all for soft
>> affinity in the first of the two calls above? Is this to cover for the
>> case where all CPUs in dom0_cpus would go offline?
>>
> Mmm... what else should I be using?

I was thinking of dom0_cpus.

> If dom0_nodes is in "strict" mode,
> we want to control hard affinity only. So we set soft to the default,
> which is "all". During operations, since hard is a subset of "all",
> soft-affinity will be just ignored.

Right - until such point that all (original) Dom0 CPUs have gone
offline. Hence my 2nd question.

> So I'm using "all" because soft-affinity is just "all", unless someone
> sets it differently.

How would "someone set it differently"? Aiui you can't control both
affinities at the same time.
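As a side note, the three overlap cases described above can be boiled
down to a small standalone sketch. Plain unsigned long bitmasks stand in
for Xen's cpumask_t here, and soft_aff_effective() below is a made-up
helper for illustration, not the hypervisor's actual code:

#include <stdbool.h>
#include <stdio.h>

/*
 * Soft affinity only has an effect if it overlaps hard affinity and
 * does not already cover all of it (i.e. hard is not a subset of soft).
 */
static bool soft_aff_effective(unsigned long hard, unsigned long soft)
{
    return (hard & soft) != 0 && (hard & ~soft) != 0;
}

int main(void)
{
    unsigned long hard = 0x7eUL;     /* hard affinity: CPUs 1-6 */
    unsigned long soft_a = 0x0cUL;   /* (a) CPUs 2-3: strict subset of hard */
    unsigned long soft_b = 0x07UL;   /* (b) CPUs 0-2: partial overlap */
    unsigned long soft_c = 0x1f80UL; /* (c) CPUs 7-12: no overlap at all */

    printf("(a) %d (b) %d (c) %d\n",
           soft_aff_effective(hard, soft_a),  /* 1: preferred within hard */
           soft_aff_effective(hard, soft_b),  /* 1: CPUs 1-2 preferred */
           soft_aff_effective(hard, soft_c)); /* 0: warn, skip balancing */
    return 0;
}

The same check also covers the "hard is a subset of soft" situation: with
soft left at "all", hard & ~soft is empty, so the soft-affinity balancing
step is skipped.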
> But I am again not sure that I fully understood and properly addressed
> your question. :-(
>
>>> +    }
>>>      else
>>>          sched_set_affinity(unit, &cpumask_all, &cpumask_all);
>>
>> Hmm, you leave this alone. Wouldn't it be better to further generalize
>> things, in case domain affinity was set already? I was referring to
>> the mask calculated by sched_select_initial_cpu() also in this regard.
>> And when I did suggest to re-use the result, I did mean this
>> literally.
>>
> Technically, I think we can do that. Although, it's probably cumbersome
> to do without adding at least one cpumask on the stack, or reshuffling
> the locking between sched_select_initial_cpu() and sched_init_vcpu(),
> in a way that I (personally) don't find particularly pretty.

Locking? sched_select_initial_cpu() calculates into a per-CPU variable,
which I sincerely hope cannot be corrupted by another CPU.

> Also, I don't think we gain much from doing that, as we probably still
> need to have some special casing of dom0, for handling dom0_vcpus_pin.

dom0_vcpus_pin is likely always going to require special casing, until
such point where we drop support for it.

> And again, soft and hard affinity should be set to what the user wants
> and asks for. And if, for instance, he/she passes
> dom0_nodes="1,strict", soft-affinity should just be all. If, e.g., we
> set both hard and soft affinity to the CPUs of node 1, and if later
> hard affinity is manually changed to "all", soft affinity will remain
> node 1, even though it was never asked to be that way, and the user
> will need to change that explicitly as well. (Of course, it's not
> particularly clever to boot with dom0_nodes="1,strict" and then change
> dom0's vCPUs' hard affinity to node 0... but the user is free to do
> that.)

I can certainly accept this as justification for using "all" further up.

Jan
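The point argued above, that strict mode should constrain hard affinity
only while soft stays at its "all" default, can likewise be condensed
into a small standalone sketch. This is only an illustration of the
reasoning in the thread, not the actual patch: dom0_cpus is borrowed
from the discussion above, while dom0_initial_affinity() and the
relaxed flag are made-up names:

#include <stdio.h>

#define CPUMASK_ALL (~0UL)  /* stand-in for Xen's cpumask_all */

struct affinity {
    unsigned long hard, soft;
};

/*
 * Only touch the affinity kind that dom0_nodes= is about; the other one
 * keeps its "all" default, so a later manual change of one affinity does
 * not leave a surprising value behind in the other.
 */
static struct affinity dom0_initial_affinity(unsigned long dom0_cpus,
                                             int relaxed)
{
    struct affinity a = { CPUMASK_ALL, CPUMASK_ALL };

    if ( relaxed )
        a.soft = dom0_cpus;  /* preference only: may still run elsewhere */
    else
        a.hard = dom0_cpus;  /* mandatory: may only run on these CPUs */

    return a;
}

int main(void)
{
    /* e.g. dom0_nodes="1,strict", with node 1 covering CPUs 4-7 */
    struct affinity a = dom0_initial_affinity(0xf0UL, 0);

    printf("hard=%#lx soft=%#lx\n", a.hard, a.soft);
    return 0;
}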