Xen project Mailing List

Re: [Xen-devel] [PATCH RFC v1 42/74] sched/null: skip vCPUs on the waitqueue that are blocked

To: Jan Beulich <JBeulich@xxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, <wei.liu2@xxxxxxxxxx>

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Mon, 8 Jan 2018 11:12:07 +0000

Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Dario Faggioli <raistlin@xxxxxxxx>

Delivery-date: Mon, 08 Jan 2018 11:12:17 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/08/2018 10:37 AM, Jan Beulich wrote:
>>>> On 04.01.18 at 14:05, <wei.liu2@xxxxxxxxxx> wrote:
>> From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
>>
>> Avoid scheduling vCPUs that are blocked, there's no point in assigning
>> them to a pCPU because they are not going to run anyway.
>>
>> Since blocked vCPUs are not assigned to pCPUs after this change, force
>> a rescheduling when a vCPU is brought up if it's on the waitqueue.
>> Also when scheduling try to pick a vCPU from the runqueue if the pCPU
>> is running idle.
>
> I don't think the description adequately describes the changes,
> perhaps (in part) because ...
>
>> Changes since v1:
>> - Force a rescheduling when a vCPU is brought up.
>> - Try to pick a vCPU from the runqueue if running the idle vCPU.
>
> ... it wasn't updated after making these adjustments.
>
>> --- a/xen/common/sched_null.c
>> +++ b/xen/common/sched_null.c
>> @@ -574,6 +574,8 @@ static void null_vcpu_wake(const struct scheduler *ops,
>> struct vcpu *v)
>> {
>> /* Not exactly "on runq", but close enough for reusing the counter
>> */
>> SCHED_STAT_CRANK(vcpu_wake_onrunq);
>> + /* Force a rescheduling in case some CPU is idle can pick this vCPU
>> */
>> + cpumask_raise_softirq(&cpu_online_map, SCHEDULE_SOFTIRQ);
>> return;
>> }
>
> I don't understand: Isn't the null scheduler not moving around
> vCPU-s at all? At least that's what the comment at the top of the
> file says, unless I'm mis-interpreting it. If so, how can "some CPU
> (...) pick this vCPU"?

There's currently no way to prevent a user from adding more vcpus to a
pool than there are pcpus (if nothing else, by creating a new VM in a
given pool), or from taking pcpus from a pool in which #vcpus >= #pcpus.

The null scheduler deals with this by having a queue of "unassigned"
vcpus that are waiting for a free pcpu. When a pcpu becomes available,
it will do the assignment. When a pcpu that has a vcpu assigned is
removed from the pool, that vcpu is assigned to a different pcpu if one
is available; if not, it is put on the list.

In the case of shim mode, this also seems to happen whenever curvcpus <
maxvcpus: The L1 hypervisor (shim) only sees curvcpus cpus on which to
schedule L2 vcpus, but the L2 guest has maxvcpus vcpus to schedule, of
which (maxvcpus-curvcpus) are marked 'down'. In this case, it also
seems that the null scheduler sometimes schedules a "down" vcpu when
there are "up" vcpus on the list; meaning that the "up" vcpus are never
scheduled.

(This is just my understanding from conversations with Roger; I haven't
actually looked at the code to verify a number of the statements in the
previous paragraph.)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
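
For readers following the thread, the wait-list behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not the code in xen/common/sched_null.c, and every name in it (toy_vcpu, toy_pool, toy_enqueue, toy_pick, toy_wake, NR_PCPUS, NR_VCPUS) is hypothetical. It assumes a FIFO list of unassigned vcpus, a free pcpu that skips waiters which are not runnable (blocked or "down"), and a wake-up that makes every pcpu look at the list again, loosely standing in for raising SCHEDULE_SOFTIRQ on all online CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS 2   /* hypothetical pool size */
#define NR_VCPUS 4   /* hypothetical upper bound on waiters */

/* Hypothetical, simplified stand-ins; not Xen's real structures. */
struct toy_vcpu {
    int id;
    bool runnable;   /* false models a blocked or "down" vcpu */
    int pcpu;        /* assigned pcpu, or -1 while unassigned */
};

struct toy_pool {
    struct toy_vcpu *owner[NR_PCPUS];  /* vcpu assigned to each pcpu, or NULL */
    struct toy_vcpu *waitq[NR_VCPUS];  /* FIFO list of unassigned vcpus */
    size_t nr_waiting;
};

/* Put an unassigned vcpu at the tail of the wait list. */
static void toy_enqueue(struct toy_pool *p, struct toy_vcpu *v)
{
    assert(p->nr_waiting < NR_VCPUS);
    v->pcpu = -1;
    p->waitq[p->nr_waiting++] = v;
}

/*
 * Decide what a pcpu runs. A pcpu that already owns a vcpu keeps it.
 * A free pcpu takes the first runnable waiter; waiters that are not
 * runnable are skipped and stay on the list. Returns NULL when the
 * pcpu has nothing to run (it would run idle).
 */
static struct toy_vcpu *toy_pick(struct toy_pool *p, int pcpu)
{
    assert(pcpu >= 0 && pcpu < NR_PCPUS);

    if (p->owner[pcpu] != NULL)
        return p->owner[pcpu];

    for (size_t i = 0; i < p->nr_waiting; i++) {
        struct toy_vcpu *v = p->waitq[i];

        if (!v->runnable)
            continue;

        /* Remove entry i while preserving the order of the rest. */
        for (size_t j = i + 1; j < p->nr_waiting; j++)
            p->waitq[j - 1] = p->waitq[j];
        p->nr_waiting--;

        v->pcpu = pcpu;
        p->owner[pcpu] = v;
        return v;
    }

    return NULL;
}

/*
 * Wake a vcpu: mark it runnable and let every pcpu re-run its pick,
 * so an idle pcpu can take it off the wait list.
 */
static void toy_wake(struct toy_pool *p, struct toy_vcpu *v)
{
    v->runnable = true;
    for (int c = 0; c < NR_PCPUS; c++)
        (void)toy_pick(p, c);
}
```

In this model, a "down" vcpu at the head of the list no longer prevents an "up" vcpu behind it from getting a pcpu, which is the failure mode described for shim mode. Without the re-pick in toy_wake, a vcpu that becomes runnable while waiting would stay on the list until some pcpu happened to pick again.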
