
Re: [Xen-devel] Fast inter-VM signaling using monitor/mwait



On May 6, 2009, at 2:22 AM, Tian, Kevin wrote:

From: Michael Abd-El-Malek [mailto:mabdelmalek@xxxxxxx]
Sent: May 5, 2009 22:29

On Apr 26, 2009, at 9:04 AM, Tian, Kevin wrote:

From: Michael Abd-El-Malek [mailto:mabdelmalek@xxxxxxx]
Sent: April 24, 2009 5:48

On Apr 21, 2009, at 5:01 AM, Tian, Kevin wrote:

From: Ian Pratt
Sent: April 21, 2009 11:19

The mwait instruction is privileged.  So I added a new hypercall that
wraps access to the mwait instruction.  Thus, my code has a Xen
component (the new hypercall) and a guest kernel component (code for
executing the hypercall and for turning off/on the timer interrupts
around the hypercall).  For this code to be merged into Xen, it would
need to add security checks and check whether the processor supports
such a feature.

I seem to recall that some newer CPUs have an mwait
instruction accessible from ring3, using a different opcode --
you might want to check this out.

How do you deal with atomicity of the monitor and mwait? i.e.
how do you stop the hypervisor pre-empting the VM and using
monitor for its own purposes or letting another guest use it?

That's a real concern. To use monitor/mwait sanely, software must not
introduce a voluntary context switch between the two instructions.
However, if that atomicity is enforced at the hypercall level, I'm not
sure about overall efficiency when multiple VMs are all active...

I'm executing the monitor and mwait instructions together in the
hypercall.  The hypercall also takes an argument specifying the old
value of the memory location.  When the mwait instruction returns, the
hypervisor can check and handle any interrupts.  I currently return a
continuation so that the mwait hypercall is re-executed at the end of
handling interrupts.  I haven't really thought about what happens if
the VM gets scheduled out.  These are the kinds of issues that I'd like
to fix if the community wants to add this hypercall.  For my

Maybe it's the reverse: you need to consider those issues first in
order to persuade the community, or else this has very limited use in
the real world. It holds the CPU exclusively for an unknown length of
time, unless you also ensure that the producer, which writes to the
monitored address, is not scheduled out either, which then further
limits the actual benefit.

Interrupts will cause the mwait instruction to return.  So the same
periodic timer interrupts that are used for VM scheduling will
continue to be useful.  The CPU is not held exclusively for unbounded
time.

In Xen, actual vcpu scheduling happens just before resuming the VM,
not in the timer interrupt ISR. So as long as your monitor/mwait loop
in the hypercall doesn't exit before the update is observed,
scheduling won't happen.

I'm not an expert on Xen scheduling, so please correct my following understanding. For the credit scheduler, csched_tick sets the next timer interrupt. So after the mwait hypercall executes the mwait instruction and is waiting for a memory write, I observe the timer interrupt eventually causing the mwait instruction to return. The mwait hypercall can then run the scheduler.
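
To make the control flow being discussed concrete, here is a rough, user-space model of one pass through such a hypercall. It is only a sketch under stated assumptions: every name in it (model_monitor, model_mwait, model_mwait_hypercall, the result codes) is hypothetical and is not taken from the actual patch or from Xen, and the stubs merely return where real hardware would stall the core. It arms the monitor, checks the value, waits, and checks again, because mwait returns for interrupts as well as for writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; these are not Xen's actual return values. */
enum mwait_result {
    MWAIT_VALUE_CHANGED, /* the monitored word no longer equals old_value */
    MWAIT_RETRY          /* woken for another reason; re-issue the call later */
};

/* Stand-in for the privileged MONITOR instruction (arms the address range). */
static void model_monitor(const volatile uint32_t *addr)
{
    (void)addr;
}

/*
 * Stand-in for the privileged MWAIT instruction. Real hardware would stall
 * until a write to the monitored range or an interrupt arrives. This stub
 * returns immediately, which models a wakeup of unknown cause.
 */
static void model_mwait(void)
{
}

/* One pass of the hypothetical hypercall body. */
enum mwait_result model_mwait_hypercall(const volatile uint32_t *addr,
                                        uint32_t old_value)
{
    assert(addr != NULL);

    model_monitor(addr);

    /* Check after arming the monitor so a write that already happened
       is not slept through. */
    if (*addr != old_value)
        return MWAIT_VALUE_CHANGED;

    model_mwait();

    /* The wakeup may be a write or an interrupt, so look at the value. */
    if (*addr != old_value)
        return MWAIT_VALUE_CHANGED;

    /* Unchanged: treat it as an interrupt wakeup and ask to be re-executed
       once pending work has been handled (the "continuation" case). */
    return MWAIT_RETRY;
}
```

In this model the MWAIT_RETRY path is where pending interrupts would be handled before the call is re-executed. Whether the scheduler also gets to run at that point is exactly the open question in this exchange.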

benchmarking
purposes, I'm not worrying about this :)

Have you thought about HVM guests as well as PV?


For HVM guests, both vmexit and vmentry clear any address range
monitoring in effect, so that won't work.

I imagine this would cause the mwait instruction to return before a
write occurs to the memory address?  If so, the guest OS can check
this (by comparing the memory address's value to the previously saved
value), and re-execute the mwait hypercall.  Users of mwait already
have to check whether their terminating condition has occurred, since
interrupts cause mwait to return.
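
The guest-side check described here might look roughly like the following. This is a minimal sketch, not the code from the patch: wait_fn stands in for the mwait hypercall wrapper, and wait_for_change, fake_producer and fake_wait are hypothetical names introduced only for illustration. The wait primitive is passed in as a function pointer so the loop can be exercised in user space with a test double that returns early a few times before the write lands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wait primitive: may return for any reason, including
   an interrupt or a VM exit that cleared the monitor. */
typedef void (*wait_fn)(volatile uint32_t *addr, uint32_t old_value, void *ctx);

/*
 * Wait until *addr differs from old_value. The terminating condition is
 * re-checked after every return from the wait primitive.
 * Returns how many times the wait primitive was invoked.
 */
unsigned wait_for_change(volatile uint32_t *addr, uint32_t old_value,
                         wait_fn wait, void *ctx)
{
    unsigned waits = 0;

    assert(addr != NULL && wait != NULL);

    while (*addr == old_value) {
        wait(addr, old_value, ctx);
        waits++;
    }
    return waits;
}

/* Test double: returns early a fixed number of times, then performs the
   write that a producer VM would have made. */
struct fake_producer {
    unsigned spurious_left;
    uint32_t new_value;
};

void fake_wait(volatile uint32_t *addr, uint32_t old_value, void *ctx)
{
    struct fake_producer *p = ctx;

    (void)old_value;
    if (p->spurious_left > 0)
        p->spurious_left--;     /* early return, value unchanged */
    else
        *addr = p->new_value;   /* the awaited write */
}
```

With two early returns configured, the loop calls the wait primitive three times before it sees the new value. If the value has already changed when the loop is entered, the wait primitive is never called.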

Yes, but then why do you need monitor/mwait, compared to a simple loop
checking the data directly? :-)

The simple spin-poll loop prevents the core from going into a
low-energy mode.  My motivation in using monitor/mwait is to get the
latency of spin-poll but with the energy efficiency of Xen events
(i.e., the CPU can go to sleep if the VM is waiting for a signal).

That's obviously the wrong model. There could be other runnable
threads within the VM. Here it's not "if the VM is waiting for a
signal"; instead it's just "if one thread in the VM is waiting for a
signal".

Yes, the model is "if the VM's CPU is idle". In other words, if there are runnable threads, I don't need to interrupt the CPU. The reason is that I'm treating the VM as a "server VM" -- so if it's serving other requests, there's no need to interrupt it; it will check for new requests after finishing with the current request. I only want to signal the VM in case it's idle.

Cheers,
Mike


 

