[Xen-devel] "swiotlb buffer is full" problem with tg3 and kernel 3.16.0-4-686-pae on Xen 4.4.1
Hi,

After upgrading to Debian jessie, and consequently to the default Linux kernel 3.16.0-4-686-pae and Xen hypervisor 4.4.1-amd64 in that distribution, I'm having problems with the tg3 network driver under high load. Unfortunately this affects a production system that I administer. It usually happens during a DRBD sync. Here is one such event:

[ 4765.528635] block drbd0: Began resync as SyncSource (will sync 886784 KB [221696 bits set])
[ 4765.528654] block drbd0: updated sync UUID 09891C136111799E:F7FD1C0A50225596:F7FC1C0A50225596:F7FB1C0A50225596
[ 4768.992280] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4769.400296] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4770.216360] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4771.852283] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.120286] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.776027] tg3 0000:02:00.0: swiotlb buffer is full (sz: 32768 bytes)
[ 4775.778814] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.780995] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.783345] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.785097] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.988290] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4776.396285] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4777.212295] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4778.848298] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4781.664292] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4782.120285] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4788.672288] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4793.776046] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 6
[ 4794.752314] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4799.776046] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 5
[ 4801.760290] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4805.776040] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 4
[ 4811.776040] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 3
[ 4817.776050] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 2
[ 4823.776079] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 1
[ 4827.936300] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4829.776069] drbd base-disk: peer( Secondary -> Unknown ) conn( SyncSource -> Timeout )
[ 4829.776088] block drbd0: drbd_send_block() failed

Sometimes I also see the message "swiotlb_tbl_map_single: 8 callbacks suppressed" or similar between the "buffer full" messages. Sometimes the sync finishes, sometimes it stalls and fails completely.

The problem only occurs when running Linux 3.16.0-4-686-pae under Xen 4.4.1. It does NOT occur when booting the same kernel without Xen, or when booting the corresponding amd64 kernel (3.16.0-4-amd64) with or without Xen. There was no problem in Debian wheezy before the upgrade (kernel 3.2.0-4-686-pae and Xen Hypervisor 4.1.3-amd64). The problem also occurs when only dom0 is running (all domU VMs shut down).

I found the thread "tg3 NIC driver bug in 3.14.x under Xen" (http://www.spinics.net/lists/netdev/msg324124.html), which looks like a similar issue, but I don't understand exactly what is going on there or what I could do to fix or debug it further.

Shall I try to build a 3.16.0-4-686-pae kernel with "CONFIG_NEED_DMA_MAP_STATE=y"? Shall I try to set the 'iommu' and/or 'swiotlb' kernel parameters? To what values?

Any help or hints on how to fix or work around this issue would be very much appreciated. Hints on how to debug this further are also welcome.

Thanks,
Marco

P.S.
Here is some information that might help figuring out what's going on:

-------------------------------------------------------------------
kepler:~# ethtool -S eth0 | grep -v ': 0$'
NIC statistics:
     rx_octets: 42531865
     rx_ucast_packets: 582596
     rx_mcast_packets: 127
     rx_bcast_packets: 1
     tx_octets: 8692263469
     tx_ucast_packets: 5755264
     tx_mcast_packets: 10
-------------------------------------------------------------------

-------------------------------------------------------------------
kepler:~# ethtool -i eth0
driver: tg3
version: 3.137
firmware-version: 5722-v3.09, ASFIPMI v6.03
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
-------------------------------------------------------------------

-------------------------------------------------------------------
kepler:~# lspci -vvv -s 02:00.0
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit Ethernet PCI Express
        Subsystem: IBM IBM System x3350 (Machine type 4192)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 59
        Region 0: Memory at e8200000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at <ignored> [disabled]
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
                Product Name: Broadcom NetXtreme Gigabit Ethernet Controller
                Read-only fields:
                        [PN] Part number: BCM95722
                        [EC] Engineering changes: 106679-15
                        [SN] Serial number: 0123456789
                        [MN] Manufacture ID: 31 34 65 34
                        [RV] Reserved: checksum good, 28 byte(s) reserved
                Read/write fields:
                        [YA] Asset tag: XYZ01234567
                        [RW] Read-write area: 107 byte(s) free
                End
        Capabilities: [58] Vendor Specific Information: Len=78 <?>
        Capabilities: [e8] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee0200c  Data: 4121
        Capabilities: [d0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <4us, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [13c v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [160 v1] Device Serial Number 00-21-5e-ff-fe-4d-2c-13
        Capabilities: [16c v1] Power Budgeting <?>
        Kernel driver in use: tg3
-------------------------------------------------------------------
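
In case it helps when comparing runs, the "swiotlb buffer is full" messages can be tallied by request size with a few lines of Python. This is only a rough sketch: the regular expression is an assumption based on the message format in the kernel log excerpt quoted earlier in this mail, and the names tally_swiotlb and SAMPLE are made up for illustration; they are not part of any existing tool.

```python
import re
from collections import Counter

# Assumed message format, derived from the log lines quoted in this mail:
#   [ 4768.992280] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
SWIOTLB_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+ \S+: swiotlb buffer is full "
    r"\(sz: (?P<size>\d+) bytes\)"
)


def tally_swiotlb(log_text):
    """Return (Counter of request sizes, first timestamp, last timestamp)."""
    sizes = Counter()
    stamps = []
    for match in SWIOTLB_RE.finditer(log_text):
        sizes[int(match.group("size"))] += 1
        stamps.append(float(match.group("ts")))
    first = min(stamps) if stamps else None
    last = max(stamps) if stamps else None
    return sizes, first, last


# A few lines copied from the log excerpt above, for demonstration only.
SAMPLE = """\
[ 4768.992280] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.776027] tg3 0000:02:00.0: swiotlb buffer is full (sz: 32768 bytes)
[ 4775.778814] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4793.776046] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 6
"""

if __name__ == "__main__":
    sizes, first, last = tally_swiotlb(SAMPLE)
    for size, count in sorted(sizes.items()):
        print(f"{size} bytes: {count}")
    print(f"window: {first} .. {last}")
```

To run it against a real log, the output of dmesg could be saved to a file and its contents passed to tally_swiotlb in place of SAMPLE.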

-------------------------------------------------------------------
kepler:~# brctl show
bridge name     bridge id               STP enabled     interfaces
xenbrext0       8000.00215e4d2c14       no              eth1
xenbrint0       8000.00215e4d2c13       no              eth0

kepler:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:21:5e:4d:2c:13
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:582865 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5755690 errors:0 dropped:1153 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:42557655 (40.5 MiB)  TX bytes:8692339211 (8.0 GiB)
          Interrupt:16

kepler:~# ifconfig xenbrint0
xenbrint0 Link encap:Ethernet  HWaddr 00:21:5e:4d:2c:13
          inet addr:192.168.2.100  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: 2001:1620:206b:1::2:1/64 Scope:Global
          inet6 addr: fe80::221:5eff:fe4d:2c13/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:582461 errors:0 dropped:0 overruns:0 frame:0
          TX packets:329904 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:32044143 (30.5 MiB)  TX bytes:8330130321 (7.7 GiB)
-------------------------------------------------------------------

-------------------------------------------------------------------
kepler:~# cat /proc/version
Linux version 3.16.0-4-686-pae (debian-kernel@xxxxxxxxxxxxxxxx) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.7-ckt9-3~deb8u1 (2015-04-24)

kepler:~# grep -e SWIOTLB -e CONFIG_NEED_DMA_MAP_STATE /boot/config-*
/boot/config-3.16.0-4-686-pae:CONFIG_SWIOTLB=y
/boot/config-3.16.0-4-686-pae:CONFIG_SWIOTLB_XEN=y
/boot/config-3.16.0-4-amd64:CONFIG_NEED_DMA_MAP_STATE=y
/boot/config-3.16.0-4-amd64:CONFIG_SWIOTLB=y
/boot/config-3.16.0-4-amd64:CONFIG_SWIOTLB_XEN=y
-------------------------------------------------------------------

-------------------------------------------------------------------
kepler:~# xen info
host                   : kepler
release                : 3.16.0-4-686-pae
version                : #1 SMP Debian 3.16.7-ckt9-3~deb8u1 (2015-04-24)
machine                : i686
nr_cpus                : 2
max_cpu_id             : 1
nr_nodes               : 1
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 2400
hw_caps                : bfebfbff:20100800:00000000:00000900:0000e39d:00000000:00000001:00000000
virt_caps              :
total_memory           : 8189
free_memory            : 3999
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 4
xen_extra              : .1
xen_version            : 4.4.1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xff400000
xen_changeset          :
xen_commandline        : placeholder com1=115200,8n1 console=com1 dom0_mem=4096M,max:4096M
cc_compiler            : gcc (Debian 4.9.2-10) 4.9.2
cc_compile_by          : waldi
cc_compile_domain      : debian.org
cc_compile_date        : Mon Apr 6 19:49:18 UTC 2015
xend_config_format     : 4
-------------------------------------------------------------------

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel