
Re: [Xen-devel] Cpu pools discussion


  • To: George Dunlap <dunlapg@xxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Tue, 28 Jul 2009 07:40:54 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 27 Jul 2009 22:41:23 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

George Dunlap wrote:
> Keir (and community),
> 
> Any thoughts on Juergen Gross' patch on cpu pools?
> 
> As a reminder, the idea is to allow "pools" of cpus that would have
> separate schedulers.  Physical cpus and domains can be moved from one
> pool to another only by an explicit command.  The main purpose Fujitsu
> seems to have is to allow a simple machine "partitioning" that is more
> robust than using simple affinity masks.  Another potential advantage
> would be the ability to use different schedulers for different
> purposes.
> 
> For my part, it seems like they should be OK.  The main thing I don't
> like is the ugliness related to continue_hypercall_on_cpu(), described
> below.
> 
> Juergen, could you remind us what the advantages of pools in the
> hypervisor were, versus just having
> affinity masks (with maybe sugar in the toolstack)?

Sure.

Our main reason for introducing pools was the inability of the current
scheduler(s) to honour domain weights once the domains are restricted to a
subset of the physical processors via pinning (e.g. a domain whose weight
would entitle it to two cpus' worth of time, but which is pinned to a single
cpu of a four-cpu box, can never receive its share).
I think it is virtually impossible to find a general solution to this
problem without some sort of pooling (if somebody proves me wrong here, I'm
perfectly happy to take that "perfect" scheduler instead of pools :-) ).

So while the original reason for the pools was this missing functionality,
there are some more benefits (a rough sketch of the idea follows the list):
+ the possibility to use different schedulers for different domains on the
  same machine (do you remember the discussion about bcredit?). Zhigang has
  already posted a request for this feature.
+ fewer lock conflicts on huge machines with many processors
+ pools could be a good basis for NUMA-aware scheduling policies
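
Just to illustrate the idea, here is a minimal C sketch of what a pool could
look like; the names and fields are made up for illustration and are not the
structures of the actual patch:

/* Illustrative sketch only -- not the interfaces of the actual patch.
 * Assumes Xen's cpumask_t, spinlock_t and struct scheduler types. */
struct cpupool {
    int               pool_id;    /* id the tools use to address the pool   */
    cpumask_t         cpu_valid;  /* physical cpus currently in this pool   */
    struct scheduler *sched;      /* scheduler instance private to the pool */
    spinlock_t        lock;       /* per-pool lock; cpus of different pools
                                   * never contend on it                    */
    struct cpupool   *next;       /* all pools kept in a simple list        */
};

/* struct domain would additionally carry a "struct cpupool *pool" member;
 * a domain's vcpus are scheduled only on cpus in cpu_valid of its own pool,
 * and changing that member is an explicit move operation, never something
 * the scheduler does on its own. */

Because each pool has its own scheduler instance and its own lock, cpus of
different pools never touch each other's run queues, which is where the
reduced lock contention and the per-pool scheduler choice come from.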

> 
> Re the ugly part of the patch, relating to continue_hypercall_on_cpu():
> 
> Domains are assigned to a pool, so
> if continue_hypercall_on_cpu() is called for a cpu not in the domain's
> pool, you can't just run it normally.  Juergen's solution (IIRC) was to
> pause all domains in the other pool, temporarily move the cpu in
> question to the calling domain's pool, finish the hypercall, then move
> the cpu in question back to the other pool.
> 
> Since there are a lot of antecedents in that, let's take an example:
> 
> Two pools; Pool A has cpus 0 and 1, pool B has cpus 2 and 3.
> 
> Domain 0 is running in pool A, domain 1 is running in pool B.
> 
> Domain 0 calls "continue_hypercall_on_cpu()" for cpu 2.
> 
> Cpu 2 is in pool B, so Juergen's patch:
>  * Pauses domain 1
>  * Moves cpu 2 to pool A
>  * Finishes the hypercall
>  * Moves cpu 2 back to pool B
>  * Unpauses domain 1
> 
> That seemed a bit ugly to me, but I'm not familiar enough with the use
> cases or the code to know if there's a cleaner solution.
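
To make the quoted sequence concrete, here is a rough C sketch; all helper
names (pool_pause_all_domains(), pool_move_cpu(), ...) are invented for
illustration and are not the interfaces of the actual patch:

/* Illustration only: helper names are made up, error handling is omitted,
 * and the mechanics of actually getting the calling vcpu to run on 'cpu'
 * (pinning plus hypercall continuation) are elided. */
static long continue_on_foreign_cpu(struct domain *caller, unsigned int cpu,
                                    long (*func)(void *data), void *data)
{
    struct cpupool *home    = caller->pool;       /* pool A in the example */
    struct cpupool *foreign = pool_of_cpu(cpu);   /* pool B                */
    long ret;

    pool_pause_all_domains(foreign);    /* pause domain 1                  */
    pool_move_cpu(cpu, home);           /* cpu 2 temporarily joins pool A  */

    ret = func(data);                   /* finish the hypercall on cpu 2   */

    pool_move_cpu(cpu, foreign);        /* cpu 2 goes back to pool B       */
    pool_unpause_all_domains(foreign);  /* unpause domain 1                */

    return ret;
}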

Some thoughts on this topic:

The continue_hypercall_on_cpu() function is needed on x86 for loading new
microcode into the processor. The source buffer with the new microcode lives
in dom0 memory, so dom0 has to be running on the physical processor the new
code is loaded into (otherwise the buffer wouldn't be accessible there).
We could avoid the whole continue_hypercall_on_cpu() machinery if the
microcode were first copied into a hypervisor buffer and on_selected_cpus()
were used instead. The other users (cpu hotplug and acpi_enter_sleep) would
have to switch to different solutions as well.
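
Roughly, that alternative could look like the following sketch; apply_ucode()
is a made-up placeholder for the real per-cpu update routine, and the
on_selected_cpus() call assumes the (mask, func, data, wait) calling
convention:

/* Sketch only: load microcode without continue_hypercall_on_cpu().
 * The update image is copied into a hypervisor buffer first, so the
 * per-cpu worker no longer depends on dom0 running on the target cpu. */
struct ucode_buf {
    void          *data;
    unsigned long  len;
};

static void apply_ucode(void *info)
{
    struct ucode_buf *buf = info;
    /* ... feed buf->data / buf->len to the local cpu's update MSRs ... */
}

static int microcode_update_via_ipi(XEN_GUEST_HANDLE(const_void) guest_buf,
                                    unsigned long len, unsigned int cpu)
{
    struct ucode_buf buf;
    cpumask_t mask;
    int ret = 0;

    buf.len  = len;
    buf.data = xmalloc_bytes(len);              /* hypervisor-owned copy   */
    if ( buf.data == NULL )
        return -ENOMEM;

    /* pull the update image out of the dom0 buffer */
    if ( copy_from_guest(buf.data, guest_buf, len) )
        ret = -EFAULT;
    else
    {
        cpus_clear(mask);
        cpu_set(cpu, mask);
        on_selected_cpus(&mask, apply_ucode, &buf, 1);  /* wait = 1 */
    }

    xfree(buf.data);
    return ret;
}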

BTW: continue_hypercall_on_cpu() exists on x86 only, and what it does isn't
really much better than my usage of it:
- remember the old pinning state of the current vcpu
- pin it temporarily to the cpu it should continue on
- continue the hypercall
- remove the temporary pinning
- re-establish the old pinning (if any)
Pretty much the same as my solution above ;-)
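
In code, that existing x86 approach boils down to roughly the following
(simplified sketch: the real code defers func() via a hypercall
continuation, which is elided here, and error handling is omitted):

static long pin_and_continue(unsigned int cpu,
                             long (*func)(void *data), void *data)
{
    struct vcpu *v = current;
    cpumask_t saved_affinity = v->cpu_affinity;   /* remember old pinning  */
    cpumask_t new_affinity;
    long ret;

    cpus_clear(new_affinity);
    cpu_set(cpu, new_affinity);
    vcpu_set_affinity(v, &new_affinity);    /* pin temporarily to 'cpu'    */

    ret = func(data);                       /* continue the hypercall      */

    vcpu_set_affinity(v, &saved_affinity);  /* re-establish old pinning    */
    return ret;
}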

So I would suggest eliminating continue_hypercall_on_cpu() completely if you
feel uneasy about my solution.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

