Re: [Xen-devel] Introduce rtds real-time scheduler for Xen
Hi Dario,

I think I fixed the bug; could you please test it again? :-) I comment on the error below, with a detailed explanation.

2014-09-17 10:15 GMT-04:00 Dario Faggioli <dario.faggioli@xxxxxxxxxx>:
> On dom, 2014-09-14 at 17:37 -0400, Meng Xu wrote:
>> This series of patches adds the rtds real-time scheduler to Xen.
>>
> I gave this series some testing, and the behavior of the scheduler is as
> expected, so again, Meng and Sisu, good work.
>
> While doing it, I've also put the series in this git repo/branch:
>
>   git://xenbits.xen.org/people/dariof/xen.git sched/rt/rtds-v3
>   http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/sched/rt/rtds-v3
>
> There are a couple of issues, though, one minor and one serious, which
> I'd like you to fix, if possible, before the freeze date. More info below.
>
>> //list VCPUs' parameters of each domain in cpupools using the rtds scheduler
>> # xl sched-rtds
>> Cpupool Pool-0: sched=EDF
>> Name          ID    Period    Budget
>> Domain-0       0     10000      4000
>> vm1            1     10000      4000
>>
> So, when I boot Xen with sched=rtds, issuing this command (`xl
> sched-rtds') produces a lot of printk on the serial console, basically
> outputting the dump of the scheduler information.
>
> I guess there is one call to rt_sched_dump() (or whatever that was) left
> somewhere. Could you please check?
>
> This is not a serious issue, but since you're resending anyway...
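I will check and remove it. For reference, a leftover dump on the "getinfo" path would produce exactly this spew, since it would then run on every `xl sched-rtds' query. A purely hypothetical sketch of such a leftover (the hook shape follows sched_rt.c, but the call site and names here are illustrative, not the actual v3 code):

    static int
    rt_dom_cntl(const struct scheduler *ops, struct domain *d,
                struct xen_domctl_scheduler_op *op)
    {
        struct rt_vcpu *svc = rt_vcpu(d->vcpu[0]);

        switch ( op->cmd )
        {
        case XEN_DOMCTL_SCHEDOP_getinfo:
            rt_dump(ops);   /* <-- debug leftover: dumps the whole scheduler
                             *     state to the console on every query; it
                             *     should be deleted (or hooked to the 'r'
                             *     debug key instead of the domctl path). */
            op->u.rtds.period = svc->period / MICROSECS(1);
            op->u.rtds.budget = svc->budget / MICROSECS(1);
            break;
        }

        return 0;
    }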
>> //create a cpupool test
>> # xl cpupool-cpu-remove Pool-0 3
>> # xl cpupool-cpu-remove Pool-0 2
>> # xl cpupool-create name="test" sched="rtds"
>> # xl cpupool-cpu-add test 3
>> # xl cpupool-cpu-add test 2
>> # xl cpupool-list
>> Name          CPUs   Sched   Active   Domain count
>> Pool-0           2    rtds        y              2
>> test             2    rtds        y              0
>>
> This works for me too.
>
> Booting with sched=credit, creating an rtds cpupool and migrating
> domains there also works here too.
>
> However, booting with sched=rtds, and issuing the following commands:
>
>   # xl cpupool-cpu-remove Pool-0 20
>   # xl cpupool-cpu-remove Pool-0 21
>   # xl cpupool-create /etc/xen/be-cpupool
>
> where /etc/xen/be-cpupool looks like this:
>
>   name = "be"
>   sched = "credit"
>   cpus = ["20", "21"]
>   sched="credit"
>
> makes Xen *CRASH* with the following trace (the two panics, from CPU 20
> and CPU 21, are interleaved on the serial console):
>
> (XEN) ----[ Xen-4.5-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) ----[ Xen-4.5-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    20
> (XEN) CPU:    21
> (XEN) RIP:    e008:[<ffff82d08012bb1e>]RIP:    e008:[<ffff82d08012bb1e>] check_lock+0x1e/0x3b check_lock+0x1e/0x3b
> (XEN) RFLAGS: 0000000000010002
> (XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor
> (XEN) CONTEXT: hypervisor
> (XEN) rax: 0000000000000001   rbx: 0000000000000000   rcx: 0000000000000001
> (XEN) rax: 0000000000000001   rbx: 0000000000000000   rcx: 0000000000000001
> (XEN) rdx: 0000000000000001   rsi: ffff830917fc4c80   rdi: 0000000000000004
> (XEN) rdx: 0000000000000001   rsi: ffff830917fc2c80   rdi: 0000000000000004
> (XEN) rbp: ffff830917fb7e08   rsp: ffff830917fb7e08   r8:  0000001d52034e80
> (XEN) rbp: ffff830917fafe08   rsp: ffff830917fafe08   r8:  0000000000000000
> (XEN) r9:  ffff830828a47978   r10: 00000000deadbeef   r11: 0000000000000246
> (XEN) r9:  ffff830917fe3ea8   r10: 00000000deadbeef   r11: 0000000000000246
> (XEN) r12: 0000001d51efbd89   r13: ffff82d080320de0   r14: 0000000000000000
> (XEN) r12: 0000001d51efbb99   r13: ffff82d080320de0   r14: 0000000000000000
> (XEN) r15: 0000000000000014   cr0: 000000008005003b   cr4: 00000000000026f0
> (XEN) r15: 0000000000000015   cr0: 000000008005003b   cr4: 00000000000026f0
> (XEN) cr3: 00000000cf08f000   cr2: 0000000000000004
> (XEN) cr3: 00000000cf08f000   cr2: 0000000000000004
> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff830917fb7e08:
> (XEN) Xen stack trace from rsp=ffff830917fafe08:
> (XEN)    ffff830917fb7e20 ffff830917fafe20 ffff82d08012bbc4 ffff82d08012bbc4 ffff8300cf12d000 ffff8300cf12c000 ffff830917fb7eb0 ffff830917fafeb0
> (XEN)
> (XEN)    ffff82d080128175 ffff82d080128175 ffff830917fb7e40 ffff830917fafe40 ffff82d0801879ef ffff82d0801879ef 0000001400fb7e60 0000001500fafe60
> (XEN)
> (XEN)    ffff830917fc4060 ffff830917fc2060 ffff830917fb7e60 ffff830917fafe60 ffff830917fc4200 ffff830917fc2200 ffff830917fb7eb0 ffff830917fafeb0
> (XEN)
> (XEN)    ffff82d08012e983 ffff82d08012e983 ffff830917fb7ef0 ffff830917fafef0 ffff82d0801aa941 ffff82d0801aa941 ffff830917fb7e90 ffff830917fafe90
> (XEN)
> (XEN)    ffff82d0802f8980 ffff82d0802f8a00 ffff82d0802f7f80 ffff82d0802f7f80 ffffffffffffffff ffffffffffffffff ffff830917fb0000 ffff830917fa8000
> (XEN)
> (XEN)    00000000000f4240 00000000000f4240 ffff830917fb7ee0 ffff830917fafee0 ffff82d08012b539 ffff82d08012b539 ffff830917fb0000 ffff830917fa8000
> (XEN)
> (XEN)    0000001d51dfd294 0000001d51dfcb8f ffff8300cf12d000 ffff8300cf12c000 ffff83092d6a0990 ffff83092d6a0990 ffff830917fb7ef0 ffff830917fafef0
> (XEN)
> (XEN)    ffff82d08012b591 ffff82d08012b591 ffff830917fb7f10 ffff830917faff10 ffff82d080160425 ffff82d080160425 ffff82d08012b591 ffff82d08012b591
> (XEN)
> (XEN)    ffff8300cf12d000 ffff8300cf12c000 ffff830917fb7e10 ffff830917fafe10 0000000000000000 0000000000000000 ffff88003a0cbfd8 ffff88003a0edfd8
> (XEN)
> (XEN)    ffff88003a0cbfd8 ffff88003a0edfd8 0000000000000007 0000000000000014 ffff88003a0cbec0 ffff88003a0edec0 0000000000000000 0000000000000000
> (XEN)
> (XEN)    0000000000000246 0000000000000246 0000001c6462ff88 0000001c986bebc8 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN)
> (XEN)    0000000000000000 0000000000000000 ffffffff810013aa ffffffff810013aa ffffffff81c31160 ffffffff81c31160 00000000deadbeef 00000000deadbeef
> (XEN)
> (XEN)    00000000deadbeef 00000000deadbeef 0000010000000000 0000010000000000 ffffffff810013aa ffffffff810013aa 000000000000e033 000000000000e033
> (XEN)
> (XEN)    0000000000000246 0000000000000246 ffff88003a0cbea8 ffff88003a0edea8 000000000000e02b 000000000000e02b c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2
> (XEN)
> (XEN)    c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2 c2c2c2c200000014 c2c2c2c200000015
> (XEN)
> (XEN)    ffff8300cf12d000 ffff8300cf12c000 0000003897ca3280 0000003897ca1280 c2c2c2c2c2c2c2c2 c2c2c2c2c2c2c2c2
> (XEN)
> (XEN) Xen call trace:
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08012bb1e>] check_lock+0x1e/0x3b
> (XEN)    [<ffff82d08012bb1e>] check_lock+0x1e/0x3b
> (XEN)    [<ffff82d08012bbc4>] _spin_lock_irq+0x1b/0x6c
> (XEN)    [<ffff82d08012bbc4>] _spin_lock_irq+0x1b/0x6c
> (XEN)    [<ffff82d080128175>] schedule+0xc0/0x5da
> (XEN)    [<ffff82d080128175>] schedule+0xc0/0x5da
> (XEN)    [<ffff82d08012b539>] __do_softirq+0x81/0x8c
> (XEN)    [<ffff82d08012b539>] __do_softirq+0x81/0x8c
> (XEN)    [<ffff82d08012b591>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08012b591>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d080160425>] idle_loop+0x5e/0x6e
> (XEN)    [<ffff82d080160425>] idle_loop+0x5e/0x6e
> (XEN)
> (XEN)
> (XEN) Pagetable walk from 0000000000000004:
> (XEN) Pagetable walk from 0000000000000004:
> (XEN)  L4[0x000] = 000000092d6a4063 ffffffffffffffff
> (XEN)  L4[0x000] = 000000092d6a4063 ffffffffffffffff
> (XEN)  L3[0x000] = 000000092d6a3063 ffffffffffffffff
> (XEN)  L3[0x000] = 000000092d6a3063 ffffffffffffffff
> (XEN)  L2[0x000] = 000000092d6a2063 ffffffffffffffff
> (XEN)  L2[0x000] = 000000092d6a2063 ffffffffffffffff
> (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 20:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0000]
> (XEN) Faulting linear address: 0000000000000004
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 21:
> (XEN) FATAL PAGE FAULT
> (XEN) [error_code=0000]
> (XEN) Faulting linear address: 0000000000000004
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
>
> Can you please check, try to reproduce, and fix ASAP?

This is a lock/NULL-pointer issue. When we remove a CPU from a cpupool, rt_vcpu_remove() in xen/common/sched_rt.c is called. In the v3 patch, that function simply set the schedule_lock of schedule_data to NULL, which causes a NULL-pointer dereference when schedule() in xen/common/schedule.c later takes the lock via "lock = pcpu_schedule_lock_irq(cpu);". I fixed this by changing rt_vcpu_remove() in sched_rt.c; the updated function is at line 436 of
https://github.com/PennPanda/xenproject/blob/rtxen-v1.90-patch-v4/xen/common/sched_rt.c
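To make the failure mode concrete, here is a minimal sketch of the broken v3 logic versus the v4 fix. This is simplified, with illustrative function names rather than the literal diff (the real code is in the branch above); the fields follow struct schedule_data in xen/include/xen/sched-if.h:

    /* v3 (broken): when a pCPU left the pool, the per-CPU scheduler lock
     * pointer was simply cleared. */
    static void rtds_release_pcpu_v3(int cpu)
    {
        /* BUG: the idle loop on this pCPU keeps calling schedule(), which
         * runs "lock = pcpu_schedule_lock_irq(cpu);" and so dereferences
         * schedule_lock unconditionally. With the pointer NULL, check_lock()
         * reads a field a few bytes into the (NULL) spinlock, which matches
         * the faulting address 0000000000000004 and the
         * check_lock/_spin_lock_irq/schedule call trace above. */
        per_cpu(schedule_data, cpu).schedule_lock = NULL;
    }

    /* v4 (fixed): repoint the pCPU at its own default lock, so schedule()
     * always finds a valid spinlock to take while the CPU sits between
     * pools. */
    static void rtds_release_pcpu_v4(int cpu)
    {
        struct schedule_data *sd = &per_cpu(schedule_data, cpu);

        sd->schedule_lock = &sd->_lock;
    }

The invariant is that schedule_lock must always point at a lockable spinlock for every online pCPU, even one that currently belongs to no scheduler.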
I also tested the scenario you gave above on my 12-core machine; no bugs now. After booting the system with rtds, I did:

  # xl cpupool-cpu-remove Pool-0 11
  # xl cpupool-cpu-remove Pool-0 10
  # xl cpupool-create /etc/xen/be-cpupool

where /etc/xen/be-cpupool looks like this:

  name = "be"
  sched = "credit"
  cpus = ["10", "11"]
  sched="credit"

Could you please pull the code and test it, to confirm that it works on your 24-core machine as well? The latest code is at https://github.com/PennPanda/xenproject , branch rtxen-v1.90-patch-v4. (Of course, I could just send the next version of the patch set with this fix, but I'd like to be sure it works on your machine first, so that people don't get too many patches; then I "hope" the next patch set can get committed. :-)) Once I have confirmation that the bug no longer occurs on your machine, I will send the next version with all the other comments addressed as well.

> This, IMO, does not alter the prognosis, wrt 4.5 inclusion, at least not
> per-se. In fact, it's ok for some bugs to be around at feature freeze
> time, for the features we said we want. What we need to know is that
> we're likely going to be able to fix them without making the release
> slip.
>
> So you should really either fix this, or provide here enough insights,
> to convince people you're on the way to that. :-)

I think I figured out what happened, so the bug should be fixed now. :-)

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel