
Re: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"



I'm pretty sure the problem you're seeing is caused by broken firmware on
the chipset used for this Intel network card, not by the Xen/pv_ops kernel.
I've had the same problem under high load on some "semi-old" Supermicro
boxes that I administer.

There's an Intel utility that patches the firmware issue (i.e., it patches
the network controller EEPROM), but it no longer seems to be available
online: the last time I looked, I couldn't find it on the Intel site, where
it had been prominently featured when I first looked for it.

I'll try to retrieve it from the last machine I applied the patch to, but I
won't be able to do so until some time during the (European) day tomorrow.

--- Heiko.


-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jeremy
Fitzhardinge
Sent: Friday, 21 May 2010 01:01
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Cc: Stefan Kuhne
Subject: Re: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"

On 05/20/2010 03:58 PM, Stefan Kuhne wrote:
> On 21.05.2010 00:18, Jeremy Fitzhardinge wrote:
>
> Hello Jeremy,
>
>   
>> e1000e works fine for me.  However, I did have problems with my Ibex
>> Peak-based system and the integrated ethernet devices; they would drop
>> off the PCIe bus (lspci -vx would show all 0xff for the config space),
>> which turned out to be some problem with ALPM (PCIe active link power
>> management).  Could this be what you're seeing?
>>
>>     
> my "lspci -vx" output:
>
> 02:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper)
>         Subsystem: FIRST INTERNATIONAL Computer Inc Unknown device 4720
>         Flags: bus master, fast devsel, latency 0, IRQ 409
>         Memory at d0000000 (32-bit, non-prefetchable) [size=128K]
>         I/O ports at 2000 [size=32]
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
>         Capabilities: [e0] Express Endpoint IRQ 0
>         Capabilities: [100] Advanced Error Reporting
>         Capabilities: [140] Device Serial Number c6-a9-09-ff-ff-0b-14-00
> 00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
> 10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
> 30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
>
> and the complete dmesg output:
> [ 9620.997466] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9620.997469]   TDH                  <fc>
> [ 9620.997471]   TDT                  <1f>
> [ 9620.997473]   next_to_use          <1f>
> [ 9620.997475]   next_to_clean        <fc>
> [ 9620.997477] buffer_info[next_to_clean]:
> [ 9620.997479]   time_stamp           <8e2ec3>
> [ 9620.997481]   next_to_watch        <fc>
> [ 9620.997483]   jiffies              <8e3a25>
> [ 9620.997485]   next_to_watch.status <0>
> [ 9622.997490] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9622.997496]   TDH                  <fc>
> [ 9622.997500]   TDT                  <1f>
> [ 9622.997503]   next_to_use          <1f>
> [ 9622.997507]   next_to_clean        <fc>
> [ 9622.997511] buffer_info[next_to_clean]:
> [ 9622.997515]   time_stamp           <8e2ec3>
> [ 9622.997519]   next_to_watch        <fc>
> [ 9622.997522]   jiffies              <8e41f5>
> [ 9622.997526]   next_to_watch.status <0>
> [ 9624.997536] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9624.997541]   TDH                  <fc>
> [ 9624.997545]   TDT                  <1f>
> [ 9624.997549]   next_to_use          <1f>
> [ 9624.997553]   next_to_clean        <fc>
> [ 9624.997557] buffer_info[next_to_clean]:
> [ 9624.997561]   time_stamp           <8e2ec3>
> [ 9624.997565]   next_to_watch        <fc>
> [ 9624.997568]   jiffies              <8e49c5>
> [ 9624.997572]   next_to_watch.status <0>
> [ 9626.065848] eth0: port 1(peth0) entering disabled state
> [ 9629.910292] e1000e: peth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> [ 9629.910854] eth0: port 1(peth0) entering forwarding state
>   
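
For anyone trying to read the hang reports quoted above, here is a rough sketch of how the fields can be decoded. It parses the first report, works out how many descriptors sit between the head (TDH) and tail (TDT) of the transmit ring, and how many jiffies the oldest buffer has been waiting. The ring size of 256 is an assumption (the log does not state it), and the function names are only illustrative, not part of the e1000e driver.

```python
import re

# First hang report from the dmesg output above, copied verbatim
# (without the "> " quoting prefix).
REPORT = """\
[ 9620.997466] 0000:02:00.0: peth0: Detected Tx Unit Hang:
[ 9620.997469]   TDH                  <fc>
[ 9620.997471]   TDT                  <1f>
[ 9620.997473]   next_to_use          <1f>
[ 9620.997475]   next_to_clean        <fc>
[ 9620.997477] buffer_info[next_to_clean]:
[ 9620.997479]   time_stamp           <8e2ec3>
[ 9620.997481]   next_to_watch        <fc>
[ 9620.997483]   jiffies              <8e3a25>
[ 9620.997485]   next_to_watch.status <0>
"""

# ASSUMPTION: the Tx ring has 256 descriptors. The log does not say so;
# check the real value with "ethtool -g <interface>".
ASSUMED_RING_SIZE = 256

# Matches lines of the form "]   name   <hexvalue>".
FIELD = re.compile(r"\]\s+(\w[\w.]*)\s+<([0-9a-f]+)>")


def parse_hang_report(text):
    """Return the report's fields as a dict of name -> integer value."""
    return {name: int(value, 16) for name, value in FIELD.findall(text)}


def pending_descriptors(tdh, tdt, ring_size):
    """Descriptors handed to the NIC but not yet fetched (ring wraps)."""
    return (tdt - tdh) % ring_size


fields = parse_hang_report(REPORT)
pending = pending_descriptors(fields["TDH"], fields["TDT"], ASSUMED_RING_SIZE)
stalled_jiffies = fields["jiffies"] - fields["time_stamp"]

# Estimate HZ from two consecutive reports: the "jiffies" lines were logged
# at 9620.997483 s and 9622.997522 s with values 0x8e3a25 and 0x8e41f5.
estimated_hz = round((0x8E41F5 - 0x8E3A25) / (9622.997522 - 9620.997483))

print(f"pending descriptors: {pending}")
print(f"oldest buffer age:   {stalled_jiffies} jiffies")
print(f"estimated HZ:        {estimated_hz}")
```

With the values from the log this gives 35 pending descriptors (under the assumed ring size) and a buffer age of 2914 jiffies. TDH, time_stamp and next_to_watch are identical in all three reports, so the head pointer did not advance at all between them.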


OK, definitely different problem.  Does it happen immediately, or after
a while?  Under load?  Can you provide the full boot output, and cat
/proc/interrupts?
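
The reason this looks like a different problem can be checked mechanically: a device that has dropped off the PCIe bus reads back 0xff for every config-space byte, whereas the dump quoted above still contains a valid header. A minimal sketch, using the hex rows from that lspci output; the helper names are made up for illustration.

```python
# Config-space rows from the "lspci -vx" output above, copied verbatim.
HEXDUMP = """\
00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
"""


def parse_config_space(dump):
    """Turn 'offset: xx xx ...' rows into a bytes object."""
    data = bytearray()
    for line in dump.splitlines():
        _offset, _sep, rest = line.partition(":")
        data.extend(int(byte, 16) for byte in rest.split())
    return bytes(data)


def fell_off_bus(cfg):
    """True if every byte reads 0xff, i.e. the device no longer responds."""
    return len(cfg) > 0 and all(byte == 0xFF for byte in cfg)


def le16(cfg, offset):
    """Read a little-endian 16-bit register from the config space."""
    return int.from_bytes(cfg[offset:offset + 2], "little")


cfg = parse_config_space(HEXDUMP)
print("fell off bus:", fell_off_bus(cfg))
print(f"vendor:device = {le16(cfg, 0x00):04x}:{le16(cfg, 0x02):04x}")
print(f"subsystem     = {le16(cfg, 0x2C):04x}:{le16(cfg, 0x2E):04x}")
```

Here the vendor ID decodes to 8086 (Intel) and the subsystem ID to 4720, matching the text part of the lspci output, so the device was still answering config reads when the dump was taken.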

Thanks,
    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel




 

