[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: mirage-www



That was my fault, sorry! This sort of thing should be fixed soon when
we can share C bindings more easily among the cross-compilation targets.

-anil

On 10 Sep 2012, at 22:31, David Scott <scott.dj@xxxxxxxxx> wrote:

> Right, problem solved: TCP checksum on xen was slightly broken. There
> was a fix made to Unix to cope with odd-length packets, this needed to
> be applied to xen as well. Many of the packets had valid checksums,
> and linux would ACK up to the sequence number in the last one of
> those.
> 
> Now that the TCP bug is squashed, I can get back to writing my blog
> post -- all just a day in the life of a mirage hacker :-)
> 
> On Sat, Sep 8, 2012 at 9:06 AM, Dave Scott <Dave.Scott@xxxxxxxxxxxxx> wrote:
>> 
>> 
>> On Sep 7, 2012, at 5:59 PM, "Anil Madhavapeddy" <anil@xxxxxxxxxx> wrote:
>> 
>>> On Fri, Sep 07, 2012 at 02:02:45PM +0100, David Scott wrote:
>>>> Hi,
>>>> 
>>>> I've built mirage-www for xen and modified the static IP to fit my
>>>> environment. (I tried DHCP first but this didn't work -- if I get the
>>>> time I'll try to debug).
>>>> 
>>>> I can ping the server fine, and it's certainly receiving a lot of
>>>> traffic on my (probably fairly busy) local network. When I try to
>>>> fetch a URL the TCP connection hangs. On the console I get:
>>>> 
>>>> Dispatch: dynamic URL /
>>>> ... irrelevant spam
>>>> TCP retransmission on timer seq = -889321980
>>>> 
>>>> I've attached a small tcpdump of the conversation. I started with ping
>>>> and then tried HTTP. According to tcpdump/wireshark it went like this:
>>>> 
>>>> $ tcpdump -r mirage.pcap -n
>>>> reading from file mirage.pcap, link-type EN10MB (Ethernet)
>>>> 13:47:07.543119 IP 10.80.2.32 > 10.80.239.140: ICMP echo request, id
>>>> 9300, seq 1, length 64
>>>> 13:47:07.543756 IP 10.80.239.140 > 10.80.2.32: ICMP echo reply, id
>>>> 9300, seq 1, length 64
>>>> 13:47:08.542112 IP 10.80.2.32 > 10.80.239.140: ICMP echo request, id
>>>> 9300, seq 2, length 64
>>>> 13:47:08.542422 IP 10.80.239.140 > 10.80.2.32: ICMP echo reply, id
>>>> 9300, seq 2, length 64
>>>> 13:47:09.541288 IP 10.80.2.32 > 10.80.239.140: ICMP echo request, id
>>>> 9300, seq 3, length 64
>>>> 13:47:09.541609 IP 10.80.239.140 > 10.80.2.32: ICMP echo reply, id
>>>> 9300, seq 3, length 64
>>>> 13:47:10.541286 IP 10.80.2.32 > 10.80.239.140: ICMP echo request, id
>>>> 9300, seq 4, length 64
>>>> 13:47:10.541580 IP 10.80.239.140 > 10.80.2.32: ICMP echo reply, id
>>>> 9300, seq 4, length 64
>>>> 13:47:11.541286 IP 10.80.2.32 > 10.80.239.140: ICMP echo request, id
>>>> 9300, seq 5, length 64
>>>> 13:47:11.541932 IP 10.80.239.140 > 10.80.2.32: ICMP echo reply, id
>>>> 9300, seq 5, length 64
>>>> 
>>>> -- so far so good, this is just my initial pings. Switching to 'wget
>>>> http://10.80.239.140/'
>>>> 
>>>> 13:47:14.241216 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [S], seq
>>>> 2284582709, win 5840, options [mss 1460,sackOK,TS val 909789846 ecr
>>>> 0,nop,wscale 6], length 0
>>>> 13:47:14.242365 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [S.],
>>>> seq 3536243828, ack 2284582710, win 65535, options [mss 1380,wscale
>>>> 2,eol], length 0
>>>> 13:47:14.242387 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 1, win 92, length 0
>>>> 
>>>> -- TCP connection established
>>>> 
>>>> 13:47:14.242417 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [P.],
>>>> seq 1:112, ack 1, win 92, length 111
>>>> 
>>>> -- HTTP GET / sent
>>>> 
>>>> 13:47:14.242869 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 1:18, ack 112, win 65535, length 17
>>>> 
>>>> -- HTTP/1.1 200 OK replied
>>>> 
>>>> 13:47:14.242880 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 18, win 92, length 0
>>>> 13:47:14.243369 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 18:68, ack 112, win 65535, length 50
>>>> 
>>>> -- "content-length..." replied
> 
> --- and this one had the invalid checksum.
> 
>>>> 
>>>> 13:47:14.243556 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 68:1448, ack 112, win 65535, length 1380
>>>> 
>>>> ... after this chunks of the blog post are transmitted
>>>> 
>>>> 13:47:14.243562 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 18, win 92, length 0
>>>> 13:47:14.243566 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 1448:2828, ack 112, win 65535, length 1380
>>>> 13:47:14.243571 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 18, win 92, length 0
>>>> 13:47:14.243575 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 2828:4164, ack 112, win 65535, length 1336
>>>> 13:47:14.243579 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 18, win 92, length 0
>>>> 13:47:14.243602 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 4164:5544, ack 112, win 65535, length 1380
>>>> 13:47:14.243608 IP 10.80.2.32.37158 > 10.80.239.140.80: Flags [.], ack
>>>> 18, win 92, length 0
>>>> 13:47:14.243776 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 18:68, ack 112, win 65535, length 50
>>>> 
>>>> -- fast retransmit of the "content-length" packet
>>>> 
>>>> 13:47:18.242606 IP 10.80.239.140.80 > 10.80.2.32.37158: Flags [P.],
>>>> seq 18:68, ack 112, win 65535, length 50
>>>> 
>>> 
>>> So the server is receiving the HTTP body ACKs (i.e. after the
>>> content-length) on the wire, but not passing them up to the TCP stack
>>> for some reason. The stack believes they weren't ACKed, and is correctly
>>> retransmitting (from its perspective).
>>> 
>>> Balraj saw the same problem a couple of months ago, and tracked it down to
>>> possible corruption of the body of packets.  Balraj, did you ever narrow
>>> this down further?
>>> 
>>> Iirc, the first ~50 bytes of some packets were overwritten near the very
>>> start. Dave, could you look at the packet bodies and see if they all look
>>> correct?  Unfortunately, I can't reproduce this on my setup (mirage-www
>>> works fine).   What is the TCP client / browser you are using?
>> 
>> I tried Linux chrome and when that failed, switched to wget.
>> 
>> I'll investigate the packet corruption possibility first. I'll take the 
>> scenic route: I noticed one of the cstruct examples is for pcap file 
>> parsing, I'll try to add the ability to log the packets directly to a local 
>> mirage block device. I wanted a similar thing for xenstore anyway.
>> 
>> If that fails I'll resort to printfs :) the problem always happens after a 
>> handful of packets so this should be easy.
>> 
>> Cheers,
>> Dave
>> 
> 
> 
> 
> -- 
> Dave Scott
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.