
Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.



Wednesday, December 8, 2010, 2:48:48 PM, you wrote:

> On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
>> Hello Mark,

> Hi

>> 
>> Just a recap:
>>      you pass through:
>>      - 3 physical nics/IGB
>>      - 1 ISDN pci ISDN box

> The Red-Fone box runs on one of the NICs; it's not separate. It converts
> ISDN to TDMoE, see http://www.red-fone.com/

So the problem is probably with the igb NICs.
Searching turned up http://forums.virtualbox.org/viewtopic.php?f=7&t=32171,
perhaps worth a try?

Have you tried with just one igb NIC, and/or another simple 1 Gb NIC (non-Intel),
to see if it's due to any of the special offload features?
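
Since there is no data yet on whether interrupts really go missing, one
option would be to sample /proc/interrupts in the domU and on the physical
server and compare the per-second rates against the expected 2000/s. Below is
a minimal sketch in Python, untested on your setup. The function names and the
"eth" filter are assumptions for illustration only, not anything provided by
Xen or the igb driver.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (total_count, description)}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    result = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if not counts:
            continue
        description = " ".join(fields[len(counts):])
        result[label.strip()] = (sum(counts), description)
    return result


def interrupt_rates(before, after, seconds):
    """Return {irq_label: interrupts per second} between two parsed samples."""
    return {
        irq: (after[irq][0] - before[irq][0]) / seconds
        for irq in after
        if irq in before
    }


def report(path="/proc/interrupts", interval=5.0, match="eth"):
    """Print the rate of every IRQ whose description contains `match`.

    The "eth" default is an assumption; adjust it to the interface name
    that /proc/interrupts shows for the passed-through NIC.
    """
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    for irq, rate in sorted(interrupt_rates(before, after, interval).items()):
        description = after[irq][1]
        if match in description:
            print(f"{irq:>5} {rate:10.1f}/s  {description}")
```

Calling report() while the Red-Fone link is active, once in the domU and once
on the physical box, would at least show whether the guest sees a lower or
burstier interrupt rate than bare metal.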


>>      - all using msi/msi-x interrupts ?

> I tried using msi/msi-x interrupts, but it caused the raid card to drop
> off (after some use) and provided seemingly even worse performance than
> pegging everything back to legacy.

>> 
>> Have you tried using a PV domU instead of an HVM domU?

> I initially tried PV but had issues with the igb NICs. There was
> another thread somewhere about my issues with that.


>> Have you tried passing through only the ISDN box, and letting the network run
>> with the Xen backend/frontend to rule out the IGB/network stuff?
>> 
>> 
>> --
>> Sander
>> 
>> 
>> 
>> Wednesday, December 8, 2010, 1:58:55 PM, you wrote:
>> 
>> > Hi - Apologies for top-posting, but after a lot of testing, I believe
>> > there must be an issue with IRQs going missing between domU and dom0.
>> > Unfortunately I have no data to prove this!
>> 
>> > With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
>> > grub config, all 3 NICs appear OK in the domU. However, I still had
>> > issues with the Red-Fone ISDN box. The interrupts were showing correctly
>> > (2000/s) in the domU, but communication to the device via the NIC was
>> > still being interrupted (as shown in the Asterisk console). Note that to
>> > get the igb driver to allow this many interrupts, the
>> > InterruptThrottleRate was set to 0. The same config (Red-Fone box,
>> > Asterisk etc.) works fine with a physical server.
>> 
>> > There is also the additional issue that I could not get the passthrough
>> > NICs to show correctly when I also had a bridge set up.
>> 
>> > Throughout my testing however, I could not get the machine to crash.
>> 
>> > Not sure where to go with this one. For now we are keeping our VoIP
>> > servers physical when ISDN connections are required.
>> 
>> > Regards,
>> > Mark
>> 
>> > On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
>> >> > 
>> >> > In my new test setup, I have seen some strange behaviour. One of the HVMs
>> >> > (with identical config in dom0 and domU) suddenly would not allow the
>> >> > igb driver to be loaded in domU, even though the device was visible in
>> >> 
>> >> Let's create a new thread for this other issue.
>> >> 
>> >> > lspci. Shutting the machine down, removing the power cord, waiting 5
>> >> > seconds then plugging it in again corrected that issue - Is this
>> >> > possibly a motherboard bug? I have also disabled the SR-IOV
>> >> > functionality in the BIOS in case this is causing any issues.
>> >> > 
>> >> > In addition, to try to correct the MSI issue noted above, I have changed
>> >> > my pci= line to the following:
>> >> > 
>> >> > pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
>> >> 
>> >> With the msi_translate=1 turned on the DomU HVM guests did work, right?
>> >> 
>> >> > 
>> >> > This has stopped the "already in use on device" log, and the devices
>> >> > appear to show correctly in the domU. Is it safe to disable
>> >> > msitranslate? As I understand it, it's for allowing multifunction devices
>> >> > to be seen as such in domU. Is that correct?
>> >> > 
>> >> > I haven't been able to reproduce the dropped raid issue yet, but I am
>> >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to cause
>> >> > this due to their very high interrupt usage (2000 per second).
>> >> 
>> >> OK.
>> >> > 
>> >> > In the mean time, I can see the following in the qemu-dm logs now with
>> >> > the msitranslate=0 enabled. Is it anything to worry about?
>> >> 
>> >> Well, the "Error" ones are pretty bad, though I am having a hard time
>> >> understanding what they mean. Let's copy some of the QEMU folks on this.
>> >> 
>> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:05.0][Offset:14h][Length:4]
>> >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
>> >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
>> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused Base Address Register. [00:06.0][Offset:14h][Length:4]
>> >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
>> >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
>> >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
>> >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
>> >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
>> >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
>> >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
>> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
>> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is already function.
>> >> > 
>> >> > > 
>> >> > > Not yet. I need the serial log of the Linux kernel and the Xen
>> >> > > hypervisor from when your machine is toast. I mentioned the key
>> >> > > sequences in the previous email; look on Google for how to pass in
>> >> > > SysRq if you are using a serial concentrator.
>> >> > 
>> >> > I will do this when I can get the machine to crash.
>> >> > 
>> >> > Best Regards,
>> >> > Mark
>> >> > 
>> >> > _______________________________________________
>> >> > Xen-devel mailing list
>> >> > Xen-devel@xxxxxxxxxxxxxxxxxxx
>> >> > http://lists.xensource.com/xen-devel
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Best regards,
>>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>> 



-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx




 

