
Re: [Xen-devel] [PROPOSAL] Event channel for SMP-VMs: per-vCPU or per-OS?



On 28/10/2013 15:26, Luwei Cheng wrote:
> The following idea was first discussed with George Dunlap, David Vrabel 
> and Wei Liu at XenDevSummit13. Many thanks for their encouragement to 
> post this idea to the community for a wider discussion.
> 
> [Current Design]
> Each event channel is associated with only “one” notified vCPU: one-to-one.
> 
> [Problem]
> Some events are per-vCPU (such as local timer interrupts) while others 
> are per-OS (such as I/O interrupts: network and disk). 
> For SMP-VMs, it is possible that while one vCPU is waiting in the scheduling 
> queue, another vCPU is running. So, if I/O events can be dynamically 
> routed to a running vCPU, they can be processed promptly, without 
> suffering from VM scheduling delays (tens of milliseconds) and without 
> introducing any extra reschedule operations.
> 
> Though users can set IRQ affinity in the guest OS, the current 
> implementation always binds the IRQ to the first vCPU of the 
> affinity mask [events.c: set_affinity_irq].
> If the hypervisor delivers the event to a different vCPU, the event 
> will be lost, because the guest OS has masked out this event on all 
> non-notified vCPUs [events.c: bind_evtchn_to_cpu].
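
(For reference, a tiny user-space model of the behaviour being described
above -- the toy_* names are made up and this is not the actual events.c
code: the mask collapses to its first vCPU, so an event delivered to any
other vCPU is dropped.)

/* Toy model of the guest-side behaviour described above. */
#include <stdio.h>

#define NR_VCPUS 4

static int evtchn_masked[NR_VCPUS];     /* 1 = this vCPU ignores the event */

/* Models bind_evtchn_to_cpu(): only 'cpu' is left unmasked. */
static void toy_bind_evtchn_to_cpu(int cpu)
{
    for (int i = 0; i < NR_VCPUS; i++)
        evtchn_masked[i] = (i != cpu);
}

/* Models set_affinity_irq(): the whole mask collapses to its lowest set bit. */
static void toy_set_affinity_irq(unsigned int affinity_mask)
{
    toy_bind_evtchn_to_cpu(__builtin_ctz(affinity_mask));
}

int main(void)
{
    toy_set_affinity_irq(0xf);          /* user asks for vCPUs 0-3 */
    int delivered_to = 2;               /* hypervisor happens to pick vCPU 2 */
    printf("event on vCPU %d: %s\n", delivered_to,
           evtchn_masked[delivered_to] ? "lost (masked)" : "handled");
    return 0;
}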
> 
> [New Design]
> For per-OS event channels, add “vCPU affinity” support: one-to-many.
> The “affinity” should be consistent with ‘/proc/irq/#/smp_affinity’ in 
> the guest OS, and users can change the mapping at runtime. But by default, 
> all vCPUs should be enabled to serve I/O.
> 
> When such flexibility is enabled, I/O balancing among vCPUs can be 
> offloaded to the hypervisor. “irqbalance” is designed for physical 
> SMP systems, not virtual SMP systems.
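
To make the proposed one-to-many delivery concrete, here is a rough
sketch of what the hypervisor-side selection could look like (all names
are hypothetical; none of this is existing Xen code): prefer a currently
running vCPU allowed by the affinity mask, and fall back to the bound
vCPU when none of them is running.

/* Sketch of a possible hypervisor-side selection policy (hypothetical). */
#include <stdio.h>

#define NR_VCPUS 4

struct toy_evtchn {
    unsigned int affinity;   /* mirrors /proc/irq/#/smp_affinity; default: all vCPUs */
    int bound_vcpu;          /* today's single notified vCPU */
};

static int vcpu_is_running(int v)
{
    /* Stand-in for the scheduler's view of the vCPU. */
    static const int running[NR_VCPUS] = { 0, 1, 0, 1 };
    return running[v];
}

static int pick_target_vcpu(const struct toy_evtchn *e)
{
    for (int v = 0; v < NR_VCPUS; v++)
        if ((e->affinity & (1u << v)) && vcpu_is_running(v))
            return v;
    return e->bound_vcpu;    /* nobody in the mask is running: keep old behaviour */
}

int main(void)
{
    struct toy_evtchn net_irq = { .affinity = 0xf, .bound_vcpu = 0 };
    printf("deliver to vCPU %d\n", pick_target_vcpu(&net_irq));
    return 0;
}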

It's an interesting idea, but I'm not sure how useful it will be in
practice, as work is often deferred to threads in the guest rather than
done directly in the interrupt handler.

I don't see any way this could be implemented using the 2-level ABI.

With the FIFO ABI, queues cannot move between VCPUs without some
additional locking (dequeuing an event is only safe with a single
consumer), but it may be possible, when an event is set pending, for Xen
to pick a queue from a set of queues instead of always using the same
queue.
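
As a toy illustration of that producer-side change (again, made-up names,
not the real FIFO ABI code): each queue keeps exactly one consumer, so
the dequeue path is untouched, and only set_pending() gains a choice of
queues.

/* Toy illustration of picking a queue from a candidate set at
 * set-pending time, while each queue keeps a single consumer. */
#include <stdio.h>

#define NR_VCPUS   4
#define QUEUE_LEN 64

struct toy_queue {
    int port[QUEUE_LEN];     /* stand-in for the FIFO's linked event words */
    int head, tail;
};

static struct toy_queue queues[NR_VCPUS];

/* Producer (Xen, at set-pending time): any queue in the candidate set. */
static void toy_set_pending(int port, int target_queue)
{
    struct toy_queue *q = &queues[target_queue];
    q->port[q->tail++ % QUEUE_LEN] = port;
}

/* Consumer (the guest vCPU that owns the queue): kept safe only because
 * no other vCPU ever dequeues from this queue. */
static int toy_dequeue(int vcpu)
{
    struct toy_queue *q = &queues[vcpu];
    if (q->head == q->tail)
        return -1;                        /* empty */
    return q->port[q->head++ % QUEUE_LEN];
}

int main(void)
{
    toy_set_pending(7, 1);                /* the policy chose vCPU 1's queue */
    printf("vCPU 1 dequeued port %d\n", toy_dequeue(1));
    return 0;
}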

I don't think this would result in balanced I/O between VCPUs, but the
opposite -- events would crowd onto the few VCPUs that are currently
running.

David



 

