Re: [Xen-devel] Cpu pools discussion
Keir Fraser wrote:
>
>
> On 29/07/2009 09:52, "Juergen Gross" <juergen.gross@xxxxxxxxxxxxxx> wrote:
>
>>>> I can add an explicit check not to unassign borrowed cpus, if you like.
>>> Your new interface ought to be responsible for its own synchronisation
>>> needs. And if it's not you should implement the appropriate assertions
>>> regarding e.g., spin_is_locked(), plus a code comment. It's simple
>>> negligence to do neither.
>> You are right.
>> I will add a check to ensure borrowed cpus are not allowed to change the
>> pool.
>
> A couple more comments.
>
> It is not safe to domain_pause() while you hold locks. It can deadlock, as
> domain_pause() waits for the domain to be descheduled, but it could be
> spinning on a lock you hold. Also it looks like a domain can be moved away
> from a pool while the pool is paused, and then you would leak a pause
> refcount.
>
> Secondly, I think that the cpupool_borrow/return calls should be embedded
> within vcpu_{lock,unlock,locked_change}_affinity(); also I see no need to
> have cpupool_return_cpu() return anything as you should be able to make a
> decision to move onto another CPU on the next scheduling round anyway (which
> can always be forced by setting SCHEDULE_SOFTIRQ).
>
> Really I dislike this patch greatly, as you can tell. ;-) The patchset as a
> whole is *ginormous*, the Xen patch by itself is pretty big and complicated
> and I believe full of races and deadlocks. I've just picked up on a few
> obvious ones from a very brief read.

The main problems you mention here are related to the cpupool_borrow stuff,
which is George's main objection as well (it's not my favourite part of the
patch either).
Would you feel better if I tried to eliminate the reason for cpupool_borrow?
It is needed only for continue_hypercall_on_cpu() targeting a cpu outside of
the current pool. I think it should be possible to replace those calls with
on_selected_cpus() with less impact on the whole system.
In the first run I tried not to change any interfaces which are not directly
related to the pools. If that approach forces you to reject the patch, I
would be happy to change it. I agree with you that it would be better not to
need the borrow stuff, but I don't know whether you would like the
elimination of continue_hypercall_on_cpu() more (or which solution would
cause less pain).

The next step after that would be to split the Xen patch into logical pieces.
I would suggest changing the scheduler internals in a separate patch (mainly
the elimination of the local variables) to make the functional changes
required for the pools more obvious. This should cut the purely pool-related
patch roughly in half.

Regarding races: I tested the "normal" pool interfaces (cpu add/remove,
domain create/destroy/move) rather intensively (multiple concurrent scripts
ran for several hours). The cpu borrow stuff was NOT tested very much. There
are 3 use cases for this interface:
- cpu microcode loading, which runs at system boot (this was my favourite
  test case)
- entering deep sleep, which only continues on cpu 0; I removed cpu 0 from
  pool 0 only occasionally
- cpu hotplug, which I don't think I could test...
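To make the on_selected_cpus() idea a bit more concrete, here is a minimal
sketch of how a continue_hypercall_on_cpu() user such as the microcode loader
could hand its work to the target cpu directly. This is not code from the
patch: the names ucode_work, apply_ucode_on_cpu and ucode_update_on_cpu are
made up for illustration, and the (mask, func, info, wait) signature of
on_selected_cpus() plus cpumask_of() are assumed from current xen-unstable.

/*
 * Illustrative sketch only -- ucode_work, apply_ucode_on_cpu and
 * ucode_update_on_cpu are hypothetical names, not part of the patch.
 */
#include <xen/errno.h>
#include <xen/cpumask.h>
#include <xen/smp.h>

struct ucode_work {
    const void *buf;      /* microcode image handed in by the tools */
    unsigned long len;
    int rc;               /* result reported back to the caller */
};

/* Runs on the target cpu in IPI context. */
static void apply_ucode_on_cpu(void *info)
{
    struct ucode_work *w = info;

    /* ... apply the update from w->buf on this cpu ... */
    w->rc = 0;
}

/*
 * Instead of continue_hypercall_on_cpu() -- which needs cpupool_borrow
 * when the target cpu lives outside the caller's pool -- run the work
 * on the target cpu via IPI and wait for completion.
 */
static int ucode_update_on_cpu(unsigned int cpu, const void *buf,
                               unsigned long len)
{
    struct ucode_work w = { .buf = buf, .len = len, .rc = -EINVAL };

    on_selected_cpus(cpumask_of(cpu), apply_ucode_on_cpu, &w, 1);

    return w.rc;
}

The point is that the calling vcpu never has to be moved into the target
cpu's pool, so cpupool_borrow/return would disappear; whether every current
continue_hypercall_on_cpu() user can tolerate running its payload in IPI
context would still need to be checked.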
Juergen

--
Juergen Gross                   Principal Developer Operating Systems
TSP ES&S SWE OS6                Telephone: +49 (0) 89 636 47950
Fujitsu Technology Solutions    e-mail: juergen.gross@xxxxxxxxxxxxxx
Otto-Hahn-Ring 6                Internet: ts.fujitsu.com
D-81739 Muenchen                Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel