Xen project Mailing List

[Xen-devel] Introduce rt real-time scheduler for Xen

This series of patches adds the rt real-time scheduler to Xen.

In summary, it supports:
1) A preemptive Global Earliest Deadline First (GEDF) scheduling policy, using a global RunQ for the scheduler;
2) Assigning and displaying the parameters of each VCPU of each domain;
3) CPU pools.

The design of this rt scheduler is as follows:

This rt scheduler follows the preemptive Global Earliest Deadline First (GEDF) theory from the real-time field. Each VCPU can have a dedicated period and budget. While scheduled, a VCPU burns its budget. Each VCPU has its budget replenished at the beginning of each of its periods and discards its unused budget at the end of each of its periods. If a VCPU runs out of budget in a period, it has to wait until the next period. How a VCPU's budget is burned depends on the server mechanism implemented for each VCPU. The priority of VCPUs at each scheduling point is decided by the preemptive Global Earliest Deadline First scheduling scheme.

Server mechanism: a VCPU is implemented as a deferrable server. When a VCPU has a task running on it, its budget is continuously burned; when a VCPU has no task but has budget left, its budget is preserved.

Priority scheme: Global Earliest Deadline First (EDF). At any scheduling point, the VCPU with the earliest deadline has the highest priority.

Queue scheme: a global runqueue for each CPU pool. The runqueue holds all runnable VCPUs. VCPUs in the runqueue are divided into two parts: with and without remaining budget. Within each part, VCPUs are sorted based on the GEDF priority scheme.

Scheduling quantum: 1 ms, but budget accounting is done in microseconds.
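The accounting and queue ordering described above can be sketched roughly as follows. This is only an illustrative model in Python, not the hypervisor code from the patches: the names VCPU, replenish, burn, sort_runq and pick_next are hypothetical, and the sketch assumes time is tracked in microseconds and that a VCPU is charged only for the time it actually ran.

```python
from dataclasses import dataclass


@dataclass
class VCPU:
    # Hypothetical model of one VCPU; field names are not from the patches.
    name: str
    period_us: int
    budget_us: int
    cur_budget_us: int = 0      # budget left in the current period
    cur_deadline_us: int = 0    # end of the current period

    def replenish(self, now_us: int) -> None:
        """At a period boundary, move to the period containing now_us and
        refill the budget. Unused budget from the old period is discarded."""
        if now_us >= self.cur_deadline_us:
            missed = (now_us - self.cur_deadline_us) // self.period_us + 1
            self.cur_deadline_us += missed * self.period_us
            self.cur_budget_us = self.budget_us

    def burn(self, ran_us: int) -> None:
        """Deferrable-server style accounting: charge only time actually run.
        An idle VCPU is never charged, so its budget is preserved."""
        self.cur_budget_us = max(0, self.cur_budget_us - ran_us)


def sort_runq(vcpus):
    """Order the runqueue in two parts: VCPUs with budget first, then
    depleted ones; each part is sorted by earliest deadline."""
    return sorted(vcpus, key=lambda v: (v.cur_budget_us == 0, v.cur_deadline_us))


def pick_next(vcpus):
    """Return the highest-priority VCPU that still has budget, or None."""
    runq = sort_runq(vcpus)
    if runq and runq[0].cur_budget_us > 0:
        return runq[0]
    return None


if __name__ == "__main__":
    a = VCPU("litmus1.v0", period_us=10_000, budget_us=4_000)
    b = VCPU("litmus1.v1", period_us=20_000, budget_us=10_000)
    for v in (a, b):
        v.replenish(0)
    print(pick_next([b, a]).name)   # the VCPU with the earlier deadline
    a.burn(4_000)                   # a exhausts its budget for this period
    print(pick_next([a, b]).name)   # a must now wait for its next period
```

Keeping depleted VCPUs in the same sorted list, rather than in a separate structure, mirrors the "two parts of one runqueue" description; a real implementation would of course also need locking, timers and per-CPU preemption checks, which are left out here.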
-----------------------------------------------------------------------------------------------------------------------------
One scenario to show the functionality of this rt scheduler is as follows:

//list each vcpu's parameters of each domain in cpu pools using rt scheduler
# xl sched-rt
Cpupool Pool-0: sched=EDF
Name        ID  VCPU  Period  Budget
Domain-0     0     0      10      10
Domain-0     0     1      20      20
Domain-0     0     2      30      30
Domain-0     0     3      10      10
litmus1      1     0      10       4
litmus1      1     1      10       4

//set the parameters of the vcpu 1 of domain litmus1:
# xl sched-rt -d litmus1 -v 1 -p 20 -b 10

//domain litmus1's vcpu 1's parameters are changed, display each VCPU's parameters separately:
# xl sched-rt -d litmus1
Name        ID  VCPU  Period  Budget
litmus1      1     0      10       4
litmus1      1     1      20      10

//list cpupool information
# xl cpupool-list
Name      CPUs  Sched   Active  Domain count
Pool-0      12  rt      y       2

//create a cpupool test
# xl cpupool-cpu-remove Pool-0 11
# xl cpupool-cpu-remove Pool-0 10
# xl cpupool-create name="test" sched="credit"
# xl cpupool-cpu-add test 11
# xl cpupool-cpu-add test 10
# xl cpupool-list
Name      CPUs  Sched   Active  Domain count
Pool-0      10  rt      y       2
test         2  credit  y       0

//migrate litmus1 from cpupool Pool-0 to cpupool test.
# xl cpupool-migrate litmus1 test

//now litmus1 is in cpupool test
# xl sched-credit
Cpupool test: tslice=30ms ratelimit=1000us
Name        ID  Weight  Cap
litmus1      1     256    0

-----------------------------------------------------------------------------------------------------------------------------
The differences between this new rt real-time scheduler and the sedf scheduler are as follows:

1) The rt scheduler supports global EDF scheduling, while sedf only supports partitioned scheduling. With the support of the vcpu mask, the rt scheduler can also be used for partitioned scheduling by setting each VCPU's cpumask to a specific cpu.

2) The rt scheduler supports setting and getting the parameters of each VCPU of a domain.
A domain can have multiple VCPUs with different parameters, and the rt scheduler lets the user get/set the parameters of each VCPU of a specific domain. (The sedf scheduler does not support this now.)

3) The rt scheduler supports cpupool.

4) The rt scheduler uses a deferrable server to burn/replenish the budget of a VCPU, while sedf uses a constant bandwidth server (CBS) to burn/replenish the budget of a VCPU. These are just two options for implementing a global EDF real-time scheduler, and the real-time performance of both options has already been established in academic work.

(Briefly speaking, the functionality that the *SEDF* scheduler plans to implement and improve in a future release is already supported in this rt scheduler.)

(Although it is unnecessary to implement two server mechanisms, we can simply modify the two functions that burn and replenish VCPUs' budget to incorporate the CBS server mechanism or other server mechanisms into this rt scheduler.)

-----------------------------------------------------------------------------------------------------------------------------
TODO:

1) Improve the code for getting/setting each VCPU's parameters. [easy]
Right now, it creates an array with LIBXL_XEN_LEGACY_MAX_VCPUS (i.e., 32) elements to bounce the parameters of all VCPUs of a domain between the xen tool and xen. It is unnecessary to have LIBXL_XEN_LEGACY_MAX_VCPUS elements in this array. The current work is to first get the exact number of VCPUs of a domain and then create an array with exactly that number of elements to bounce between the xen tool and xen.

2) Provide microsecond time precision in the xl interface instead of millisecond time precision. [easy]
Right now, the rt scheduler lets the user specify each VCPU's parameters (period, budget) in milliseconds (i.e., ms). In some real-time applications, the user may want to specify VCPUs' parameters in microseconds (i.e., us).
The next work is to let the user specify VCPUs' parameters in microseconds and to count the time in microseconds (or nanoseconds) in the xen rt scheduler as well.

3) Add Xen trace support to the rt scheduler. [easy]
We will add a few xentrace tracepoints to the rt scheduler, like TRC_CSCHED2_RUNQ_POS in the credit2 scheduler, to allow debugging via tracing.

4) Method of improving the performance of the rt scheduler. [future work]
VCPUs of the same domain may preempt each other under the preemptive global EDF scheduling policy. This self-switch issue brings no benefit to the domain but introduces more overhead. When this situation happens, we can simply promote the priority of the currently running lower-priority VCPU and let it borrow budget from higher-priority VCPUs to avoid such self-switching.

Timeline of implementing the TODOs:
We plan to finish TODO 1), 2) and 3) within 3-4 weeks (or earlier). Because TODO 4) will make the scheduling policy not pure GEDF (people who want real GEDF may not be happy with this), we look forward to hearing people's opinions.

-----------------------------------------------------------------------------------------------------------------------------
Special huge thanks to Dario Faggioli for his helpful and detailed comments on the preview version of this rt scheduler. :-)

Any comments, questions, and concerns are more than welcome! :-)

Thank you very much!

Meng

[PATCH RFC v1 1/4] rt: Add rt scheduler to hypervisor
[PATCH RFC v1 2/4] xl for rt scheduler
[PATCH RFC v1 3/4] libxl for rt scheduler
[PATCH RFC v1 4/4] libxc for rt scheduler

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
