
Re: [Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.



On Wed, Dec 08, 2010 at 05:44:39PM +0100, Sander Eikelenboom wrote:
> Wednesday, December 8, 2010, 4:48:57 PM, you wrote:
> 
> > On Wed, Dec 08, 2010 at 03:05:50PM +0100, Sander Eikelenboom wrote:
> >> 
> >> Wednesday, December 8, 2010, 2:48:48 PM, you wrote:
> >> 
> >> > On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
> >> >> Hello Mark,
> >> 
> >> > Hi
> >> 
> >> >> 
> >> >> Just a recap:
> >> >>      you pass through:
> >> >>      - 3 physical nics/IGB
> >> >>      - 1 ISDN pci ISDN box
> >> 
> >> > The redfone box runs on 1 of the NICs - it's not separate. It converts
> >> > ISDN to TDMoE; see http://www.red-fone.com/
> >> 
> >> So the problem is probably with the igb's.
> >> Searching showed http://forums.virtualbox.org/viewtopic.php?f=7&t=32171 , 
> >> perhaps worth a try ?
> 
> > Tried this - doesn't help.
> 
> >> 
> >> Have you tried with just 1 IGB, and/or another simple 1gb NIC (non intel) 
> >> to see if it's due to any of the special offload features ?
> 
> > Haven't got any other NICs to try, unfortunately. Even if it did work
> > with 1, it would be no use to me as I need 3.
> 
> I understand, but simplifying the setup and trying to isolate the problem, 
> could clarify things.
> 
> I also read your previous thread, and I saw you hid 02:00.0 and 03:00.0 
> with xen-pciback (e1000e driver) there, but now you seem to be passing 
> through 08:00.0 and 08:00.1 (igb)?
> So I assume you have already tried 2 different NICs.
> 
> http://download.intel.com/design/network/specupdt/82574.pdf does show some 
> errata regarding MSI-X interrupts and timing issues, with workarounds, on the 
> 82574 (02:00.0 and 03:00.0) NICs, though.
> 

I was initially using the onboard NICs (e1000e) when I had the crashing
problem. To try to get around this, I disabled all the MSI-based features I
could find, which seemed to correct the crashing issue. To do this I needed
3 NICs, because bridging would not work at the same time as passthrough
(it seemed that not all of the passed-through devices would show up), hence
my starting to use the igb-based NIC card that's also in the machine.

Unfortunately the servers I've been testing on need to go into
production now, so I can't test any further (hence putting the VoIP stuff
on a physical box). Xen works really well for me when I don't use
PCI passthrough!

Regards,
Mark
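
For anyone wanting to put numbers on the suspected missing interrupts: a
minimal sketch, in Python, of how a per-IRQ rate could be measured by
diffing two copies of /proc/interrupts taken a known number of seconds
apart (for example once in the domU and once in dom0). The sample
counters, IRQ numbers and eth0/eth1 names below are made up for
illustration and are not taken from the machines in this thread.

```python
def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row looks like: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # rows such as ERR/MIS carry fewer per-CPU columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    first = parse_interrupts(before)
    second = parse_interrupts(after)
    return {irq: (second[irq] - first[irq]) / seconds
            for irq in second if irq in first}


# Hypothetical snapshots taken 5 seconds apart (illustrative values only).
SAMPLE_T0 = """\
           CPU0       CPU1
 28:       1000        500   PCI-MSI-edge      eth0
 29:         10          0   IO-APIC-fasteoi   eth1
"""
SAMPLE_T1 = """\
           CPU0       CPU1
 28:       9000       2500   PCI-MSI-edge      eth0
 29:         10          0   IO-APIC-fasteoi   eth1
"""

if __name__ == "__main__":
    for irq, rate in sorted(interrupt_rates(SAMPLE_T0, SAMPLE_T1, 5).items()):
        print(f"IRQ {irq}: {rate:.0f}/s")
```

Comparing the rate seen for the passed-through device in the guest with
the rate for the matching interrupt in dom0 would be one way to check
whether interrupts really go missing, assuming the two counters can be
matched up.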

> --
> Sander
> 
> 
> >> 
> >> 
> >> >>      - all using msi/msi-x interrupts ?
> >> 
> >> > I tried using msi/msi-x interrupts, but it caused the raid card to drop
> >> > off (after some use) and provided seemingly even worse performance than
> >> > pegging everything back to legacy.
> >> 
> >> >> 
> >> >> Have you tried using a PV domU instead of a HVM domU ?
> >> 
> >> > I initially tried PV but had issues with the igb NICs. There was
> >> > another thread somewhere about my issues with that.
> >> 
> >> 
> >> >> Have you tried passing through only the ISDN box, and let the network 
> >> >> run with the xen backend/frontend to rule out the IGB/network stuff ?
> >> >> 
> >> >> 
> >> >> --
> >> >> Sander
> >> >> 
> >> >> 
> >> >> 
> >> >> Wednesday, December 8, 2010, 1:58:55 PM, you wrote:
> >> >> 
> >> >> > Hi - Apologies for top-posting, but after a lot of testing, I believe
> >> >> > there must be an issue with IRQs going missing between domU and dom0.
> >> >> > Unfortunately I have no data to prove this!
> >> >> 
> >> >> > With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
> >> >> > grub config, all 3 NICs appear OK in the domU; however, I still had
> >> >> > issues with the red-fone ISDN box. The interrupts were showing correctly
> >> >> > (2000/s) in the domU, but communication with the device via the NIC was
> >> >> > still being interrupted (as shown in the asterisk console). Note that to
> >> >> > get the igb driver to allow this many interrupts, the
> >> >> > InterruptThrottleRate was set to 0. The same config (red-fone box,
> >> >> > asterisk etc) works fine with a physical server.
> >> >> 
> >> >> > There is also the additional issue that I could not get the passthrough
> >> >> > NICs to show correctly when I also had a bridge set up.
> >> >> 
> >> >> > Throughout my testing however, I could not get the machine to crash.
> >> >> 
> >> >> > Not sure where to go with this one. For now we are keeping our VoIP
> >> >> > servers physical when ISDN connections are required.
> >> >> 
> >> >> > Regards,
> >> >> > Mark
> >> >> 
> >> >> > On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
> >> >> >> > 
> >> >> >> > In my new test setup, I have seen some strange behaviour. 1 of the HVMs
> >> >> >> > (with identical config in dom0 and domU) suddenly would not allow the
> >> >> >> > igb driver to be loaded in domU, even though the device was visible in
> >> >> >> 
> >> >> >> Let's create a new thread for this other issue.
> >> >> >> 
> >> >> >> > lspci. Shutting the machine down, removing the power cord, waiting 5
> >> >> >> > seconds then plugging it in again corrected that issue - Is this
> >> >> >> > possibly a motherboard bug? I have also disabled the SR-IOV
> >> >> >> > functionality in the BIOS in case this is causing any issues.
> >> >> >> > 
> >> >> >> > In addition, to try to correct the MSI issue noted above, I have changed
> >> >> >> > my pci= line to the following:
> >> >> >> > 
> >> >> >> > pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
> >> >> >> 
> >> >> >> With msi_translate=1 turned on, the DomU HVM guests did work, right?
> >> >> >> 
> >> >> >> > 
> >> >> >> > This has stopped the "already in use on device" log, and the devices
> >> >> >> > appear to show correctly in the domU. Is it safe to disable
> >> >> >> > msitranslate? As I understand it, it's for allowing multifunction devices
> >> >> >> > to be seen as such in domU. Is that correct?
> >> >> >> > 
> >> >> >> > I haven't been able to reproduce the dropped raid issue yet, but I am
> >> >> >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause
> >> >> >> > this due to their very high interrupt usage (2000 per second).
> >> >> >> 
> >> >> >> OK.
> >> >> >> > 
> >> >> >> > In the meantime, I can see the following in the qemu-dm logs now with
> >> >> >> > msitranslate=0 set. Is it anything to worry about?
> >> >> >> 
> >> >> >> Well, the "Error" ones are pretty bad, though I am having a hard time
> >> >> >> understanding what they mean. Let's copy some of the QEMU folks on this.
> >> >> >> 
> >> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4]
> >> >> >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
> >> >> >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
> >> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4]
> >> >> >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
> >> >> >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
> >> >> >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
> >> >> >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
> >> >> >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
> >> >> >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
> >> >> >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
> >> >> >> > 
> >> >> >> > > 
> >> >> >> > > Not yet. I need a serial log of the Linux kernel and the Xen hypervisor
> >> >> >> > > from when your machine is toast. I mentioned the key sequences in the
> >> >> >> > > previous email - look on Google for how to pass in SysRq if you are
> >> >> >> > > using a serial concentrator.
> >> >> >> > 
> >> >> >> > I will do this when I can get the machine to crash.
> >> >> >> > 
> >> >> >> > Best Regards,
> >> >> >> > Mark
> >> >> >> > 
> >> >> >> > _______________________________________________
> >> >> >> > Xen-devel mailing list
> >> >> >> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> >> >> >> > http://lists.xensource.com/xen-devel
> >> >> 
> >> >> 
> >> >> 
> >> >> 
> >> >> 
> >> >> -- 
> >> >> Best regards,
> >> >>  Sander                            mailto:linux@xxxxxxxxxxxxxx
> >> >> 
> >> 
> >> 
> >> 
> >> -- 
> >> Best regards,
> >>  Sander                            mailto:linux@xxxxxxxxxxxxxx
> >> 
> >> 
> >> _______________________________________________
> >> Xen-users mailing list
> >> Xen-users@xxxxxxxxxxxxxxxxxxx
> >> http://lists.xensource.com/xen-users
> 
> 
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
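
The qemu-dm excerpt quoted in this thread is easier to reason about once
it is condensed. Below is a rough Python sketch that records which MSI-X
entries were mapped and counts how many later guest writes were rejected
for each entry. The two regular expressions are written only against the
lines quoted above; other qemu-dm versions may word these messages
differently, so treat the patterns as assumptions.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this thread only.
UPDATE_RE = re.compile(
    r"pt_msix_update_one: Update msix entry (\d+) with pirq (\w+) gvec (\w+)")
ERROR_RE = re.compile(
    r"pci_msix_writel: Error: Can't update msix entry (\d+)")


def summarize(log_text):
    """Return (mapped, rejected).

    mapped:   {entry: (pirq, gvec)} kept as the strings printed in the log
    rejected: Counter of rejected table writes per MSI-X entry
    """
    mapped = {}
    rejected = Counter()
    for line in log_text.splitlines():
        match = UPDATE_RE.search(line)
        if match:
            mapped[int(match.group(1))] = (match.group(2), match.group(3))
            continue
        match = ERROR_RE.search(line)
        if match:
            rejected[int(match.group(1))] += 1
    return mapped, rejected


# Shortened excerpt of the log from this thread, with wrapped lines rejoined.
SAMPLE_LOG = """\
pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
"""

if __name__ == "__main__":
    mapped, rejected = summarize(SAMPLE_LOG)
    for entry in sorted(mapped):
        pirq, gvec = mapped[entry]
        print(f"entry {entry}: pirq {pirq} gvec {gvec}, "
              f"{rejected[entry]} rejected writes")
```

Run against the full log, this would show at a glance whether every
mapped entry is followed by the same number of rejected writes, which is
the pattern visible in the excerpt.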


 

