
Re: [Xen-users] Re: Performance Issues: I/O Wait



Yeah, HT is off (I don't even know if you can turn it on in the PE1950s!).  I'm getting some interesting stuff from tcpdump:


09:04:14.639345 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42499521 win 5080 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639345 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42500969 win 5804 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639373 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42502417 win 6528 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639373 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42503865 win 7252 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639374 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42505313 win 7976 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639374 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42506761 win 8700 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639375 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42508209 win 9424 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639396 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42509657 win 10148 <nop,nop,timestamp 5108364 2431452318>

09:04:14.639647 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4194578178: reply ERR 1448

09:04:14.639657 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.1879243268: reply ERR 1448

09:04:14.639661 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4194609922: reply ERR 1448

09:04:14.639665 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4009949700: reply ERR 1448

09:04:14.639670 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4194630148: reply ERR 1448

09:04:14.639674 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.2533620228: reply ERR 1448

09:04:14.639720 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4194630148: reply ERR 1448


I've briefly looked at some Google results for "reply ERR 1448" but haven't come up with anything concrete.  I'm going to keep looking at that one to see if it leads somewhere.  I also get some "reply ERR 1084" messages sprinkled in there.  In the meantime, I've disabled tx checksums in domU and am running a couple more tests to see if I can reproduce the long I/O waits at all.  I'll let you know how that turns out.
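
To see how often each of those numbers shows up, something like the following could tally them from saved tcpdump text output. This is only a rough sketch: the function name and the regular expression are made up for illustration, and it assumes the capture was saved as plain text in the same format as the lines above.

```python
import re
from collections import Counter

# Matches the tail of lines such as
# "... alpha_0.seakr.com.nfs > swbox1.seakr.com.4194578178: reply ERR 1448"
REPLY_ERR = re.compile(r"reply ERR (\d+)\s*$")


def tally_reply_err(lines):
    """Count 'reply ERR <n>' lines, keyed by the trailing number <n>."""
    counts = Counter()
    for line in lines:
        match = REPLY_ERR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Three lines copied from the capture above; the first is a plain ACK
# and should be ignored by the tally.
sample = [
    "09:04:14.639396 IP swbox1.seakr.com.785 > alpha_0.seakr.com.nfs: . ack 42509657 win 10148 <nop,nop,timestamp 5108364 2431452318>",
    "09:04:14.639647 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.4194578178: reply ERR 1448",
    "09:04:14.639657 IP alpha_0.seakr.com.nfs > swbox1.seakr.com.1879243268: reply ERR 1448",
]

print(tally_reply_err(sample))  # Counter({1448: 2})
```

For a real capture, the same function could be fed an open file object instead of the sample list.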


I'll also try out some of the NFS settings to see if anything there helps and let you know.


Thanks for the help - much appreciated!


--Nick


>>> On Tue, Oct 23, 2007 at  9:17 PM, "Steve Senator (Senator Ent)" <sts@xxxxxxxxxxx> wrote:

Xen can exacerbate Linux SMP issues. Do you have hyperthreading turned 
on in your CPUs? If so, at least for testing, try turning it off.

Also, beyond turning off the TX offloading in both the dom0 and domU, 
is there any chance that there's another device attached to that 
bridge which would cause network delays? In particular, is there a 
device that may incorrectly see the domU IP as coming from the dom0 
due to an ARP conflict? I see that you've specified a fixed MAC 
address. Is there any chance that the same MAC address is used by the 
dom0? Perhaps the initrd is the one from dom0, and it has the MAC 
address set to the same one as the dom0?

Try tcpdumping from both domains and see if you see any 
retransmissions, or perhaps even a smoking gun like a system ARPing 
for itself when it should know better.

It's also possible that there's a transmission size problem. There 
have been reports of dom0<->domU traffic not honoring the MTU of the 
bridge or virtual device, which then forces retransmission when the 
receiving side cannot handle the larger buffer.

If you're using NFS, try changing from TCP to UDP or modifying the 
rsize and wsize buffering to fit within the MTU of your (virtual) 
ethernet devices.
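
As a rough illustration of what "fit within the MTU" means, here is a small sketch of the per-packet arithmetic. The 1500-byte MTU is an assumption (standard Ethernet; check the actual value on the bridge and vif), the helper names are made up for this example, and the header sizes assume IPv4 without IP options. RPC and NFS headers also take space inside each datagram, so treat the result as an upper bound.

```python
IP_HEADER = 20       # IPv4 header without options
TCP_HEADER = 20      # TCP header without options
TCP_TIMESTAMPS = 12  # timestamp option plus two NOP pads, as in the trace
UDP_HEADER = 8


def tcp_payload(mtu, timestamps=True):
    """Bytes of TCP payload that fit in one packet of the given MTU."""
    options = TCP_TIMESTAMPS if timestamps else 0
    return mtu - IP_HEADER - TCP_HEADER - options


def udp_payload(mtu):
    """Bytes of UDP payload that fit in one unfragmented packet."""
    return mtu - IP_HEADER - UDP_HEADER


def largest_kib_multiple(limit):
    """Largest multiple of 1024 not exceeding limit (rsize/wsize granularity)."""
    return (limit // 1024) * 1024


ASSUMED_MTU = 1500  # assumption, not taken from the original report

print(tcp_payload(ASSUMED_MTU))                        # 1448
print(udp_payload(ASSUMED_MTU))                        # 1472
print(largest_kib_multiple(udp_payload(ASSUMED_MTU)))  # 1024
```

Under those assumptions a full TCP segment carries 1448 bytes, which is the same number that appears in the "reply ERR 1448" lines, so that figure may simply be a segment length rather than an NFS error code. For UDP, rsize=1024 and wsize=1024 would be the largest values that could avoid IP fragmentation.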

Hope this helps,
-Steve Senator



Quoting Nick Couchman <Nick.Couchman@xxxxxxxxx>:

> Hi, again...haven't had any responses to this, yet.

>>>> Nick Couchman 10/18/07 11:05 AM >>>
> Hey, everyone,
> I'm having some issues with a Xen DomU right related to performance. 
>  ... The culprit seems to be  high I/O wait times related to the 
> network interface.
>
> The host machine is a Dell PowerEdge 1950 with 2 x Dual-Core Xeon  
> processors (Xeon 5150 @ 2.66GHz, 1333 FSB).  ...  Building these 
> Linux distributions on the physical system takes  70-80 minutes 
> (real time) - on the DomU system it takes 130-140  minutes.
> ...
> vif=[ 'mac=00:16:3e:75:0d:be,bridge=xenbr108', ]


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

 

