
[Xen-devel] [BUG] mistakenly wake in Xen's credit scheduler

To: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
From: Kun Suo <ksuo@xxxxxxxx>
Date: Mon, 26 Oct 2015 22:30:58 +0000
Cc: Yong Zhao <yzhao@xxxxxxxx>, Jia Rao <jrao@xxxxxxxx>
Delivery-date: Tue, 27 Oct 2015 07:13:06 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi all,

The BOOST mechanism in the Xen credit scheduler is designed to prioritize VMs that run I/O-intensive applications, so that they can handle I/O requests in time. However, it does not always work as expected.

(1) Problem description
--------------------------

Suppose two VMs (named VM-I/O and VM-CPU) each have one virtual CPU, and both vCPUs are pinned to the same physical CPU. An I/O-intensive application (e.g. Netperf) runs in VM-I/O and a CPU-intensive application (e.g. a loop) runs in VM-CPU. When a client sends I/O requests to VM-I/O, its vCPU cannot enter the BOOST state and obtains very few CPU cycles (less than 1% in Xen 4.6). Both throughput and latency degrade severely.

(2) Problem analysis
--------------------------

This problem is caused by the wake mechanism in Xen: the vCPU running the CPU-intensive workload is woken and boosted by mistake.

Suppose the vCPU of VM-CPU is running when an I/O request arrives. The current vCPU (the vCPU of VM-CPU) is then marked as _VPF_migrating:

static inline void __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
{
    ...
    if ( new_idlers_empty && new->pri > cur->pri )
    {
        SCHED_STAT_CRANK(tickle_idlers_none);
        SCHED_VCPU_STAT_CRANK(cur, kicked_away);
        SCHED_VCPU_STAT_CRANK(cur, migrate_r);
        SCHED_STAT_CRANK(migrate_kicked_away);
        /* The currently running vCPU is flagged for migration. */
        set_bit(_VPF_migrating, &cur->vcpu->pause_flags);
        __cpumask_set_cpu(cpu, &mask);
    }
    ...
}

The next time the scheduler runs and prev is the vCPU of VM-CPU, context_saved(vcpu) is executed. Because the vCPU has been marked as _VPF_migrating, vcpu_migrate() is called on it and it is then woken up:

void context_saved(struct vcpu *prev)
{
    ...
    if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
        vcpu_migrate(prev);
}

If the vCPU of VM-CPU is in the UNDER state at that point, the wake-up changes it to the BOOST state, which was originally designed for I/O-intensive vCPUs. When this happens, even if the vCPU of VM-I/O becomes BOOST, it cannot get the physical CPU immediately and has to wait until the vCPU of VM-CPU is scheduled out. This harms I/O performance significantly.
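The sequence described above can be replayed with a small toy model. This is only a sketch under stated assumptions, not Xen code: every `toy_` name is hypothetical, the `migrating` field merely stands in for `_VPF_migrating`, and the priority values are chosen only so that BOOST > UNDER > OVER. The tickle and wake functions keep just the conditions quoted in the excerpts plus the UNDER-to-BOOST promotion on wake-up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy priorities (hypothetical); only the ordering BOOST > UNDER > OVER matters. */
enum toy_pri { TOY_PRI_OVER = -2, TOY_PRI_UNDER = -1, TOY_PRI_BOOST = 0 };

struct toy_vcpu {
    enum toy_pri pri;
    bool migrating;   /* stands in for _VPF_migrating in pause_flags */
};

/* Wake path: a vCPU that wakes while UNDER is promoted to BOOST. */
void toy_wake(struct toy_vcpu *v)
{
    assert(v != NULL);
    if (v->pri == TOY_PRI_UNDER)
        v->pri = TOY_PRI_BOOST;
}

/* Tickle path: if no pCPU is idle and the newcomer has strictly higher
 * priority, the running vCPU is kicked and flagged as migrating.
 * Returns true when cur was kicked. */
bool toy_tickle(struct toy_vcpu *cur, const struct toy_vcpu *new_vcpu,
                bool idlers_empty)
{
    bool kicked = idlers_empty && new_vcpu->pri > cur->pri;
    if (kicked)
        cur->migrating = true;
    return kicked;
}

/* Context switch epilogue: a flagged vCPU is "migrated", which clears the
 * flag and ends in a wake-up. */
void toy_context_saved(struct toy_vcpu *prev)
{
    if (prev->migrating) {
        prev->migrating = false;
        toy_wake(prev);
    }
}

/* Replays the scenario; returns true if a later I/O wake-up can no longer
 * preempt the CPU-bound vCPU. */
bool toy_second_io_wake_is_blocked(void)
{
    struct toy_vcpu cpu_bound = { TOY_PRI_UNDER, false };
    struct toy_vcpu io_bound  = { TOY_PRI_UNDER, false };

    /* 1. An I/O request arrives: io_bound is boosted and kicks cpu_bound. */
    toy_wake(&io_bound);
    toy_tickle(&cpu_bound, &io_bound, true);

    /* 2. cpu_bound is switched out; the flag turns into a wake-up, so the
     *    CPU-bound vCPU is promoted from UNDER to BOOST. */
    toy_context_saved(&cpu_bound);

    /* 3. A later I/O wake-up: io_bound is BOOST at best, which is not
     *    strictly higher than cpu_bound's BOOST, so nothing is kicked. */
    toy_wake(&io_bound);
    return !toy_tickle(&cpu_bound, &io_bound, true);
}
```

In this model `toy_second_io_wake_is_blocked()` returns true: once the CPU-bound vCPU has been boosted through the migrate path, the strict `>` comparison in the tickle step no longer favours the I/O vCPU.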

(3) Our test results
--------------------------

Hypervisor:    Xen 4.6
Dom 0 & Dom U: Linux 3.18
Client:        Linux 3.18
Network:       1 Gigabit Ethernet

Throughput:
  Only VM-I/O:                941 Mbps
  Co-run VM-I/O and VM-CPU:   32 Mbps

Latency:
  Only VM-I/O:                78 usec
  Co-run VM-I/O and VM-CPU:   109093 usec
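To put the measurements above in relative terms, the two ratios can be computed as in the following sketch. The helper names are hypothetical and the code only does the arithmetic on the reported figures.

```c
#include <assert.h>

/* Percentage of the baseline that remains (used here for throughput). */
double percent_retained(double baseline, double degraded)
{
    assert(baseline > 0.0);
    return 100.0 * degraded / baseline;
}

/* How many times larger the degraded value is (used here for latency). */
double inflation_factor(double baseline, double degraded)
{
    assert(baseline > 0.0);
    return degraded / baseline;
}
```

With the numbers reported here, `percent_retained(941.0, 32.0)` is about 3.4, so the co-run case keeps roughly 3.4% of the standalone throughput, and `inflation_factor(78.0, 109093.0)` is roughly 1400, so latency grows by about three orders of magnitude.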

This bug has been present from Xen 4.2 through Xen 4.6.

Thanks.

Reported by Tony Suo and Yong Zhao from UCCS
