
RE: [Xen-devel] [VTD][patch 0/5] HVM device assignment using vt-d



 

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx] 
> Sent: Sunday, June 03, 2007 4:30 PM
> To: Guy Zana; Kay, Allen M; Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] [VTD][patch 0/5] HVM device 
> assignment using vt-d
> 
> The sequence of interrupt injection doesn't actually matter, 
> since you can't wait and inject into the next domain only after 
> the previous one in the chain fails to handle it; that would be 
> very inefficient.
> 
> To me the unhandled irq issue (the 99,900 out of 100,000 case) is 
> inevitable. Say an irq is shared between 2 HVM domains, one 
> assigned a high-rate PCI device like a NIC and the other assigned 
> a low-rate PCI device like a UHCI controller; it's likely that 
> over 100,000 interrupts arrive from the NIC while the UHCI device 
> stays silent in a given period. Since, from Xen's point of view, 
> there's no way to know which HVM guest owns a given interrupt 
> instance, the same number of interrupts will be injected into 
> both HVM domains. 

Sharing an irq between two HVMs is surely not something we would want to handle 
right now.
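
Just to make the fan-out you describe concrete, this is roughly what injection 
on a shared GSI amounts to (a sketch with made-up names, not the actual Xen 
code): every HVM domain bound to the GSI gets the interrupt, because Xen has 
no way to tell whose device asserted the line.

/*
 * Sketch only: hypothetical types and helpers, not Xen's real code.
 */
struct hvm_domain;                                         /* opaque here */

void inject_virtual_irq(struct hvm_domain *dom, int gsi);  /* assumed helper */

struct gsi_binding {
    struct gsi_binding *next;
    struct hvm_domain  *dom;               /* guest bound to this GSI */
};

static void shared_gsi_fired(struct gsi_binding *bindings, int gsi)
{
    struct gsi_binding *b;

    /* Fan the interrupt out to every domain sharing the GSI,
     * since the actual source device is unknown. */
    for (b = bindings; b != NULL; b = b->next)
        inject_virtual_irq(b->dom, gsi);
}

With a busy NIC and a quiet UHCI controller behind the same GSI, the UHCI 
guest ends up servicing a stream of interrupts its device never raised, which 
is exactly what trips Linux's spurious-IRQ logic (it disables the line once 
99,900 of the last 100,000 interrupts go unhandled).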

> 
> We could force "noirqdebug"; however, that may not apply to all 
> Linux versions, nor to other OSes on the HVM side.
> 

Maybe we could add a PV dummy-driver to the HVM that registers on that IRQ 
and solves the 99,900:100,000 problem?
I also think that setting the HVM assert/deassert state only after giving dom0 
a chance to handle the IRQ would be more robust. 
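
For reference, the dummy driver could be as small as the sketch below 
(illustrative only: PT_IRQ is a placeholder, and unconditionally returning 
IRQ_HANDLED has much the same effect as noirqdebug, i.e. it hides the 
unhandled interrupts from the guest's spurious-IRQ accounting rather than 
eliminating them).

/*
 * Sketch of a PV "dummy" driver for the HVM guest: it registers a shared
 * handler on the passed-through IRQ and claims every interrupt, so the
 * guest kernel never reaches the 99,900-unhandled-out-of-100,000 threshold
 * that makes it disable the line.
 *
 * PT_IRQ is a placeholder; a real driver would discover the IRQ from the
 * passed-through device (or from a PV channel) instead.
 */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/interrupt.h>

#define PT_IRQ 11                       /* placeholder IRQ number */

static int pt_dummy_cookie;             /* dev_id cookie for the shared IRQ */

static irqreturn_t pt_dummy_handler(int irq, void *dev_id)
{
    /* Claim the interrupt unconditionally.  This suppresses the
     * "nobody cared" accounting, but it also means genuinely spurious
     * interrupts are never noticed by the guest. */
    return IRQ_HANDLED;
}

static int __init pt_dummy_init(void)
{
    return request_irq(PT_IRQ, pt_dummy_handler, IRQF_SHARED,
                       "pt-dummy", &pt_dummy_cookie);
}

static void __exit pt_dummy_exit(void)
{
    free_irq(PT_IRQ, &pt_dummy_cookie);
}

module_init(pt_dummy_init);
module_exit(pt_dummy_exit);
MODULE_LICENSE("GPL");

Linux disables an IRQ line once 99,900 of the last 100,000 interrupts on it 
go unhandled, so a stub that always claims the interrupt keeps the line alive; 
whether an equivalent exists for non-Linux guests is a separate question.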


> Actually there are more tricky things to consider for irq 
> sharing among domains. For example:
>       - A driver in one HVM domain may leave its device with 
> the interrupt line asserted while the related virtual wire 
> stays masked (e.g. after an unclean driver unload). 
> 
>       - When the OS first masks the PIC entry and then unmasks 
> the IOAPIC entry, an interrupt may occur in between (and the 
> IOAPIC doesn't record it as pending while masked), so the 
> pending indicator in the PIC is missed.
> 
>       Such rare cases, once they occur, can block the other 
> domain sharing the same irq. This heavily breaks the isolation 
> between domains, and it is a common issue whatever approach we 
> use to share an irq. 
> 
>       Maybe a better way is to use MSI instead, and we may then 
> avoid the above irq-sharing issue from the management-tool side. 
> For example, avoid assigning devices that share the same irq to 
> different domains when MSI cannot be used...

We can also disable the driver in dom0 :-)
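
And to spell out the ordering I suggested above (hypothetical helper names, a 
sketch of the idea only, not actual Xen or Neocleus code): dom0's ISR chain 
gets the first chance, and the HVM's virtual wire is only wiggled if nothing 
in dom0 claimed the interrupt.

/*
 * Sketch only: hypothetical helpers, not real code.
 * On every physical edge (both edges, given the polarity trick),
 * dom0's ISR chain runs first; only if no dom0 driver claims the
 * interrupt is the line state reflected into the HVM guest, with
 * the pass-through path acting as the last (default) handler.
 */
struct domain;                                 /* opaque here */

int  run_dom0_isr_chain(int gsi);              /* assumed: nonzero if handled */
void set_hvm_virtual_wire(struct domain *d, int gsi, int asserted);

static void handle_shared_gsi(struct domain *hvm, int gsi, int line_asserted)
{
    if (run_dom0_isr_chain(gsi))
        return;                 /* a dom0 driver owned this instance */

    /* Nobody in dom0 claimed it: treat it as the HVM device's and
     * mirror the physical line state onto the guest's virtual wire. */
    set_hvm_virtual_wire(hvm, gsi, line_asserted);
}

This is deliberately simplified; with a genuinely shared line both devices can 
be asserting at once, which is part of what we are still modeling.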

> 
> Thanks,
> Kevin
> 
> >-----Original Message-----
> >From: Guy Zana [mailto:guy@xxxxxxxxxxxx]
> >Sent: June 3, 2007 17:59
> >To: Kay, Allen M; Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx
> >Cc: Tian, Kevin
> >Subject: RE: [Xen-devel] [VTD][patch 0/5] HVM device assignment 
> >using vt-d
> >
> >Sort of... Our method might double the number of interrupts if both 
> >devices are connected to the same pin, but since all the devices are 
> >OR-wired, you might even "save" some *physical* interrupts from 
> >happening. I guess we'll get a decisive answer only after performing 
> >some profiling.
> >
> >Our method will not work "out of the box" if you're trying to use it 
> >when sharing a pin between dom0 and an HVM.
> >Consider the following scenario:
> >
> >HVM:
> >             ________________________
> >        ____|                        |__________________
> >
> >Dom0:
> >                   ____________________________________
> >        __________|
> >
> >Phys Line:
> >             __________________________________________
> >        ____|
> >
> >
> >        A    B         C                       D
> >
> >
> >At point B you changed the polarity. At points C and D you won't be 
> >getting any interrupts because of the polarity change, and the device 
> >that is allocated to dom0 will keep its line asserted until the dom0 
> >driver handles the interrupt, but it won't get a chance to do so; 
> >moreover, the hvm vline will still be kept asserted.
> >
> >We are currently modeling the problem; it seems to be a complicated 
> >concept regardless of the polarity change. For instance, an HVM 
> >running Linux will die if 99,900 interrupts out of 100,000 are not 
> >handled.
> >
> >From a logical POV, the aforementioned race is solved like this: we 
> >can hold a virtual assertion line for _dom0_ (which will be updated 
> >by the arrival of interrupts resulting from the polarity change) and 
> >concatenate the HVM's ISR chain with dom0's ISR chain, with dom0 
> >being the first to try to handle the interrupt (because of the 
> >99,900:100,000 problem). I guess that pass-through shared interrupts 
> >should probably be handled as the last (default) function in dom0's 
> >ISR chain.
> >
> >How exactly do you plan to provide interrupt sharing with your 
> >method?
> >Please provide your thoughts.
> >
> >Thanks,
> >Guy.
> >
> >> -----Original Message-----
> >> From: Kay, Allen M [mailto:allen.m.kay@xxxxxxxxx]
> >> Sent: Sunday, June 03, 2007 11:29 AM
> >> To: Guy Zana; Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx
> >> Subject: RE: [Xen-devel] [VTD][patch 0/5] HVM device assignment 
> >> using vt-d
> >>
> >> Based on my understanding of the Neocleus passthrough patch, it 
> >> seems all devices sharing that interrupt will get double the number 
> >> of interrupts.  This means that if an interrupt is shared between a 
> >> NIC device used by an HVM guest and a SATA device used by dom0, the 
> >> SATA driver in dom0 will also get twice the number of interrupts.  
> >> Am I correct?
> >>
> >> Allen
> >>
> >> >-----Original Message-----
> >> >From: Guy Zana [mailto:guy@xxxxxxxxxxxx]
> >> >Sent: Wednesday, May 30, 2007 11:05 PM
> >> >To: Keir Fraser; Kay, Allen M; xen-devel@xxxxxxxxxxxxxxxxxxx
> >> >Subject: RE: [Xen-devel] [VTD][patch 0/5] HVM device assignment 
> >> >using vt-d
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >> >> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> >> >> Keir Fraser
> >> >> Sent: Wednesday, May 30, 2007 10:56 PM
> >> >> To: Kay, Allen M; xen-devel@xxxxxxxxxxxxxxxxxxx
> >> >> Subject: Re: [Xen-devel] [VTD][patch 0/5] HVM device assignment 
> >> >> using vt-d
> >> >>
> >> >
> >> >>
> >> >> Actually I also know there are some other patches coming down 
> >> >> the pipeline to do pci passthrough to HVM guests without need 
> >> >> for hardware support (of course it is not so general; in 
> >> >> particular it will only work for one special hvm guest).
> >> >> However, they deal with this interrupt issue quite cunningly, 
> >> >> by inverting the interrupt polarity so that they get interrupts 
> >> >> on both +ve and -ve edges of the INTx line. This allows the 
> >> >> virtual interrupt wire to be 'wiggled' precisely according to 
> >> >> the behaviour of the physical interrupt wire.
> >> >> Which is rather nice, although of course it does double the 
> >> >> interrupt rate, which is not so great but perhaps acceptable 
> >> >> for the kind of low-interrupt-rate devices that most people 
> >> >> would want to hand off to an hvm guest.
> >> >>
> >> >
> >> >Just FYI.
> >> >
> >> >Neocleus' pass-through patches perform the "change polarity" 
> >> >trick. In changing the polarity, our motivation was to reflect 
> >> >the allocated device's assertion state to the HVM AS IS.
> >> >
> >> >Regarding performance: using a USB 2.0 storage device (working 
> >> >with DMA), a huge file copy was compared when working in 
> >> >pass-through and when working natively (on the same OS), and the 
> >> >time differences were negligible, so I'm not sure yet about the 
> >> >impact of doubling the number of interrupts. The advantage of 
> >> >changing the polarity is its simplicity.
> >> >
> >> >Anyway, we'll release some patches during the day so you can 
> >> >give your comments.
> >> >
> >> >Thanks,
> >> >Guy.
> >> >
> >>
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

