[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] deadlock in the credit2
On 14/10/2011 12:47, "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote: > 2011/10/14 Eunbyung Park <silverbottlep@xxxxxxxxx>: >> IMHO, it seems to be deadlock when changing dom0's weight in credit2 >> scheduler. >> >> when the sched_adjust() in schedule.c is called, it grabs the >> schedule_lock after pausing all of the vcpus >> >> and then, csched_dom_cntl in sched_credit2.c, it also grab the >> schedule_lock by using vcpu_schedule_lock_irq(). >> >> In the credit2, all of the percpu schedule_lock points out same runqueue >> lock if they belong to same runqueue. >> >> Eventually, all of vcpu are paused except for itself running the code, >> and it try to grab schedule_lock that was grabbed by itself. >> >> Am I right? If I was wrong, please tell me my misunderstanding. > > Hmm, I think you may have discovered the source of a bug that people > have been reporting but I haven't had time to look into yet. > > Keir, I think that lock in schedule.c around SCHED(adjust) must be > wrong. By the time we grab that lock, grabbing it will be completely > pointless. What are we going to be racing against? In any case, the > actual scheduler should be responsible for grabbing locks; there's no > reason that the scheduler can't grab whatever lock it needs inside > that function. I haven't done a deep analysis, but my initial > instinct is to just get rid of it. What do you think? Fine by me. The synchronisation in that function looks pretty fragile. It's probably outdated too. -- Keir >> if ( d == current->domain ) >> vcpu_schedule_lock_irq(current); >> >> It was very hard to understan for me..:) What does it exactly mean? > > You're asking what "current" means? "current" is a macro that always > resolves to the vcpu which is running on the current processor. > > sched_adjust() seems to be trying to avoid scheduling races in general > by pausing all vcpus before calling the per-scheduler function. But > if a VM is calling the op on itself, the vcpu making the hypercall > can't pause itself. So in that case (current->domain == d) will be > true, so sched_adjust() grab the schedule lock of that vm instead. > > But really all that locking should be handled in the scheduler > function, not by the generic code. It knows best what needs to be > locked when. > > -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |