Re: [Xen-devel] [PATCH 00/12] cpumask handling scalability improvements
>>> On 20.10.11 at 17:09, Keir Fraser <keir.xen@xxxxxxxxx> wrote:
> On 20/10/2011 14:36, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>
>> This patch set makes some first steps towards eliminating the old cpumask
>> accessors, replacing them by such that don't require the full NR_CPUS
>> bits to be allocated (which obviously can be pretty wasteful when
>> NR_CPUS is high, but the actual number is low or moderate).
>>
>> 01: introduce and use nr_cpu_ids and nr_cpumask_bits
>> 02: eliminate cpumask accessors referencing NR_CPUS
>> 03: eliminate direct assignments of CPU masks
>> 04: x86: allocate IRQ actions' cpu_eoi_map dynamically
>> 05: allocate CPU sibling and core maps dynamically
>
> I'm not sure about this. We can save ~500 bytes per cpumask_t when
> NR_CPUS=4096 and actual nr_cpus<64. But how many cpumask_t's do we typically
> have dynamically allocated all at once? Let's say we waste 2kB per VCPU and
> per IRQ, and we have a massive system with ~1k VCPUs and ~1k IRQs -- we'd
> save ~4MB in that extreme case. But such a large system probably actually
> will have a lot of CPUs. And also a lot of memory, such that 4MB is quite
> insignificant.

It's not only the memory savings, but also the time saved by
manipulating less space.

> I suppose there is a second argument that it shrinks the containing
> structures (struct domain, struct vcpu, struct irq_desc, ...) and maybe
> helps reduce our order!=0 allocations?

Yes - that's what made me start taking over these Linux bits. What I
sent here just continues on that route. I was really hoping that we
wouldn't leave this in a half-baked state.

> By the way, I think we could avoid the NR_CPUS copying overhead everywhere
> by having the cpumask.h functions respect nr_cpu_ids, but continuing to
> return NR_CPUS for sentinel value (e.g., end of loop; or no bit found)? This
> would not need to change tonnes of code. It only gets part of the benefit
> (reducing cpu time overhead) but is more palatable?

That would be possible, but would again leave us in a somewhat
incomplete state. (Note that I did leave NR_CPUS in the stop-machine
logic).

>> 06: allow efficient allocation of multiple CPU masks at once
>
> That is utterly hideous and for insignificant saving.

I was afraid you would say that, and I'm not fully convinced
either. But I wanted to give it a try to see how bad it is. The more
significant saving here really comes from not allocating the CPU masks
at all for unused irq_desc-s.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
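
To make the numbers in the exchange above concrete, here is a minimal C sketch. It is not the code from the patch series: the sketch_* names are hypothetical, and nr_cpu_ids is set to 64 purely for illustration. It shows (a) how many bytes a CPU bitmap needs when sized for NR_CPUS=4096 versus a small runtime CPU count, and (b) the compromise Keir describes, where a scan stops at nr_cpu_ids but still returns NR_CPUS as the "no bit found" sentinel so existing loop conditions keep working.

```c
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 4096 /* build-time maximum, as in the example in the thread */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Assumption: stand-in for the number of CPU IDs found at boot. */
static unsigned int nr_cpu_ids = 64;

/* Hypothetical full-size mask type: always NR_CPUS bits wide. */
typedef struct {
    unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
} sketch_cpumask_t;

/* Bytes needed to store a bitmap covering nbits CPUs. */
static size_t sketch_mask_bytes(unsigned int nbits)
{
    return BITS_TO_LONGS(nbits) * sizeof(unsigned long);
}

/*
 * Find the next set bit after 'prev' (pass -1 to start from CPU 0).
 * Only bits below nr_cpu_ids are examined, which bounds the CPU time
 * spent, but NR_CPUS is still returned when nothing is found, so
 * callers testing "cpu < NR_CPUS" need no change.
 */
static unsigned int sketch_next_cpu(int prev, const sketch_cpumask_t *m)
{
    unsigned int cpu;

    for (cpu = (unsigned int)(prev + 1); cpu < nr_cpu_ids; cpu++)
        if (m->bits[cpu / BITS_PER_LONG] & (1UL << (cpu % BITS_PER_LONG)))
            return cpu;

    return NR_CPUS;
}
```

Under these assumptions, sketch_mask_bytes(NR_CPUS) is 512 and sketch_mask_bytes(64) is 8, which is where the "~500 bytes per cpumask_t" figure comes from. The sentinel variant only reduces scanning time. The storage is still NR_CPUS bits wide, which is the "incomplete state" objection raised in the reply.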