I'm having an issue where an Ethernet interface passed through to a PV Linux domU stops receiving packets and enters a state where every incoming packet is counted as a FIFO error (rx_fifo_errors). So far I have been unable to link the behavior to a particular volume or type of traffic, and none of my attempts to tweak the interface itself (e.g., disabling TOE, changing memory/buffers, etc.) have helped. I've also tried various combinations of the following Xen boot options:

    msi=1 iommu=1
    msi=1 iommu=1,no-amd-iommu-perdev-intremap


When the NIC goes into this state, the domU must be restarted to recover it. I see this in dom0 'xl dmesg':

    (XEN) [2015-08-31 22:44:21] AMD-Vi: IO_PAGE_FAULT: domain = 12, device id = 0x4302, fault address = 0x6c8b2d340, flags = 0x2
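
As a sanity check, the device id in that message can be decoded into a PCI bus:device.function address and compared against the NIC that is passed through. The sketch below assumes the id is the usual 16-bit PCI requester ID (bus in the high byte, then 5 bits of device and 3 bits of function); the function names are only illustrative.

```python
import re

def decode_bdf(device_id: int) -> str:
    """Split a 16-bit PCI requester ID into bus:device.function."""
    bus = (device_id >> 8) & 0xFF    # high byte
    dev = (device_id >> 3) & 0x1F    # next 5 bits
    func = device_id & 0x07          # low 3 bits
    return f"{bus:02x}:{dev:02x}.{func}"

def bdf_from_log(line: str):
    """Pull the 'device id = 0x....' field out of an AMD-Vi fault line."""
    match = re.search(r"device id = 0x([0-9a-fA-F]+)", line)
    return decode_bdf(int(match.group(1), 16)) if match else None

line = ("(XEN) [2015-08-31 22:44:21] AMD-Vi: IO_PAGE_FAULT: domain = 12, "
        "device id = 0x4302, fault address = 0x6c8b2d340, flags = 0x2")
print(bdf_from_log(line))  # 43:00.2
```

If that address matches the one lspci shows for the passed-through port in dom0, the faulting DMA is coming from the NIC itself rather than from some other device.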

However, the NIC does not fail permanently every time this error appears in dmesg; often it faults and then recovers. Eventually, though, it fails permanently and stays that way until the domU is restarted. The time to failure has varied from minutes to hours.
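
To pin down when the interface flips into the failed state, the counter can be polled inside the domU and its timestamps lined up with the ones in 'xl dmesg'. This is a minimal sketch that reads the standard sysfs statistics file; the interface name `eth1` and the function names are placeholders, not anything from my setup.

```python
import time
from pathlib import Path

def read_rx_fifo_errors(iface, root="/sys/class/net"):
    """Return the kernel's rx_fifo_errors counter for iface."""
    path = Path(root) / iface / "statistics" / "rx_fifo_errors"
    return int(path.read_text().strip())

def watch(iface, interval=1.0, samples=None, root="/sys/class/net"):
    """Print a timestamped line whenever the counter increases.

    samples=None polls forever; a number stops after that many polls.
    """
    last = read_rx_fifo_errors(iface, root)
    taken = 0
    while samples is None or taken < samples:
        time.sleep(interval)
        current = read_rx_fifo_errors(iface, root)
        if current > last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {iface} rx_fifo_errors "
                  f"+{current - last} (total {current})")
        last = current
        taken += 1

# Example (placeholder interface name): watch("eth1", interval=1.0)
```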

I'm running Xen 4.5.1, which I recently upgraded to because this problem also existed in 4.4.2. I've tried various kernel versions ranging from 3.11.x to 3.18.16, always with the same version on dom0 and domU.

The NIC is an Intel card using the igb driver, and the motherboard is a Supermicro board with an AMD CPU.

What's the best way to troubleshoot this issue?

