Re: [Xen-devel] [PATCH 00/12] cpumask handling scalability improvements
On 20/10/2011 14:36, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:

> This patch set makes some first steps towards eliminating the old cpumask
> accessors, replacing them with ones that don't require the full NR_CPUS
> bits to be allocated (which can obviously be pretty wasteful when NR_CPUS
> is high but the actual number of CPUs is low or moderate).
>
> 01: introduce and use nr_cpu_ids and nr_cpumask_bits
> 02: eliminate cpumask accessors referencing NR_CPUS
> 03: eliminate direct assignments of CPU masks
> 04: x86: allocate IRQ actions' cpu_eoi_map dynamically
> 05: allocate CPU sibling and core maps dynamically

I'm not sure about this. We can save ~500 bytes per cpumask_t when
NR_CPUS=4096 and the actual nr_cpus<64. But how many cpumask_t's do we
typically have dynamically allocated all at once? Let's say we waste 2kB
per VCPU and per IRQ, and we have a massive system with ~1k VCPUs and ~1k
IRQs -- we'd save ~4MB in that extreme case (rough arithmetic is sketched
below). But such a large system will probably have a lot of CPUs anyway,
and also a lot of memory, so 4MB is quite insignificant.

I suppose there is a second argument that it shrinks the containing
structures (struct domain, struct vcpu, struct irq_desc, ...) and maybe
helps reduce our order!=0 allocations?

By the way, I think we could avoid the NR_CPUS copying overhead everywhere
by having the cpumask.h functions respect nr_cpu_ids, but continue to
return NR_CPUS as the sentinel value (e.g. end of loop, or no bit found) --
a minimal sketch of the idea follows below. This would not need to change
tonnes of code. It only gets part of the benefit (reducing CPU time
overhead) but is more palatable?

> 06: allow efficient allocation of multiple CPU masks at once

That is utterly hideous, and for an insignificant saving.

> One reason I put the following ones together was to reduce the
> differences between the disassembly of hypervisors built for 4095
> and 2047 CPUs, which I looked at to determine the places where
> cpumask_t variables get copied without using cpumask_copy() (a
> job where grep is of no help). Hence consider these patches optional,
> but recommended.
>
> 07: cpufreq: allocate CPU masks dynamically
> 08: x86/p2m: allocate CPU masks dynamically
> 09: cpupools: allocate CPU masks dynamically
> 10: credit: allocate CPU masks dynamically
> 11: x86/hpet: allocate CPU masks dynamically
> 12: cpumask <=> xenctl_cpumap: allocate CPU masks and byte maps dynamically

Questionable. Any subsystem that allocates no more than a handful of
cpumask_t's is possibly just as well left alone. I'm not dead set against
these if we decide that 01-05 are actually worth pursuing, however.

 -- Keir

> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
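
The ~4MB estimate above works out roughly as follows. This is only
back-of-envelope arithmetic; the "about four embedded cpumask_t per VCPU
or per IRQ" multiplier is an illustrative assumption, not a figure taken
from the Xen source.

#include <stdio.h>

int main(void)
{
    unsigned int full_mask  = 4096 / 8;   /* NR_CPUS=4096: 512 bytes per mask */
    unsigned int small_mask = 64 / 8;     /* nr_cpus<=64:    8 bytes per mask */
    unsigned int per_mask   = full_mask - small_mask;   /* ~504 bytes saved   */
    unsigned int per_object = 4 * per_mask;             /* ~2kB per VCPU/IRQ (assumed ~4 masks each) */
    unsigned int objects    = 1024 + 1024;               /* ~1k VCPUs + ~1k IRQs */

    printf("saved per mask:   %u bytes\n", per_mask);
    printf("saved per object: %u bytes\n", per_object);
    printf("saved in total:   %u bytes (~4MB)\n", objects * per_object);
    return 0;
}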
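
And a minimal sketch of the nr_cpu_ids/sentinel suggestion: the iterator
scans only the bits that can actually be set on the running system, but
still returns NR_CPUS when nothing further is found, so existing callers
that compare against NR_CPUS need no changes. This is written against a
simplified, self-contained cpumask_t, not Xen's actual cpumask.h; the
function and macro names merely mirror the existing accessors.

#include <stdio.h>

#define NR_CPUS        4096
#define BITS_PER_LONG  (8 * sizeof(unsigned long))

/* Would be set once at boot from the detected CPU count. */
static unsigned int nr_cpu_ids = 64;

typedef struct cpumask { unsigned long bits[NR_CPUS / BITS_PER_LONG]; } cpumask_t;

static inline unsigned int cpumask_next(int n, const cpumask_t *src)
{
    unsigned int cpu;

    /* Scan only the bits that can actually be set on this system... */
    for ( cpu = n + 1; cpu < nr_cpu_ids; cpu++ )
        if ( src->bits[cpu / BITS_PER_LONG] & (1UL << (cpu % BITS_PER_LONG)) )
            return cpu;

    /* ...but keep returning NR_CPUS as the "nothing found" sentinel. */
    return NR_CPUS;
}

#define for_each_cpu(cpu, mask) \
    for ( (cpu) = cpumask_next(-1, (mask)); \
          (cpu) < NR_CPUS; \
          (cpu) = cpumask_next((cpu), (mask)) )

int main(void)
{
    cpumask_t m = { { 0 } };
    unsigned int cpu;

    m.bits[0] = 0x5;                      /* CPUs 0 and 2 set */
    for_each_cpu ( cpu, &m )
        printf("cpu %u\n", cpu);          /* prints 0 then 2; loop ends at NR_CPUS */
    return 0;
}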