Re: [MirageOS-devel] wireshark capture of failed download from mirage-www on ARM



On 21 July 2014 20:56, Richard Mortier <Richard.Mortier@xxxxxxxxxxxxxxxx> wrote:
> [ context for list: thomas' observation of failed download, and lots of 
> retransmissions generally, while bringing up mirage-www on ARM ]
>
> On 21 Jul 2014, at 09:27, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>
>> On 21 July 2014 17:08, Richard Mortier <Richard.Mortier@xxxxxxxxxxxxxxxx> 
>> wrote:
>>
>
>>> On 21 Jul 2014, at 09:01, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>>>
>>>> Here's the wireshark capture of a failed download. It does indeed say
>>>> the TCP checksum is wrong. Any idea what's going on?
>>>>
>>>> Note that on ARM it uses a different function to calculate this (which
>>>> I took from mirage-unix). It's in the #else block here:
>>>>
>>>> https://github.com/talex5/mirage-tcpip/blob/checksum/lib/checksum_stubs.c
>>>
>>> ack; will take a look after breakfast :)
>>>
>>> just to be clear -- the ARM version is using the code from L247 marked 
>>> "generic implementation"?
>>
>> Yes. The x86 version crashes on ARM because the 64-bit values aren't aligned.
>>
>>> two immediate questions -- is the checksum field definitely treated as all 
>>> zeros in the computation across the header?  and is the segment padded with 
>>> zeros to be N*16 bits for the purposes of the computation (but the pad not 
>>> transmitted)?
>>
>> No idea. I haven't changed any code around there.
>
> this is weird-- wireshark says that the first transmission of that segment 
> (frame#13) has an invalid checksum while the retransmission (#17) has a valid 
> checksum. but the two checksums are the same!  however #13 appears to have 
> almost no valid data in it -- after the first 74 bytes (which are the same in 
> both #13 and #17), the payload in #13 is zeroed out.
>
> so i guess the cstruct buffer is being recycled too soon (after the checksum 
> calculation but before the data is actually transmitted) or something?
>
> anil, balraj (or anyone else!)-- has that part of the stack been changed 
> recently?

I'm seeing strange things using a simpler test case now:

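  (* Minimal test: accept on port 8000, log the peer, send "Hello", close. *)
  let start c s =
    S.listen_tcpv4 s ~port:8000 (fun flow ->
        (* remote address and port of the new connection *)
        let dst, dst_port = S.TCPV4.get_dest flow in
        C.log_s c (green "new tcp connection from %s %d"
                     (Ipaddr.V4.to_string dst) dst_port)
        >>= fun () ->
        (* 5-byte payload; this is the part that arrives corrupted *)
        let data = Cstruct.of_string "Hello" in
        S.TCPV4.write flow data
        >>= fun () ->
        S.TCPV4.close flow
      );
    S.listen s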

This is also failing. I added a hexdump to mirage-net-xen and got this
in Netif.writev:

f0 1f af 6a 9b 95 c0 ff ee c0 ff ee 08 00 45 00
00 2d 52 95 00 00 26 06 c0 c8 c0 a8 00 12 c0 a8
00 0b 1f 40 b4 ca 1a fe b5 69 5e 8c dd fe 50 18
ff ff 29 8a 00 00

48 65 6c 6c 6f

That looks correct. The first block is the headers (Ethernet, IPv4 and
TCP), the second is the payload. In Wireshark, the headers are identical
but the payload is different (20 00 00 00 08), which matches what you're
seeing.
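
As a sanity check, here is a minimal standalone sketch in C that recomputes
the TCP checksum over the pseudo-header, TCP header and payload from the
dump above. It is not the code from checksum_stubs.c; the array and
function names are made up for illustration. An odd trailing byte is
padded with a zero (the pad is not transmitted), and the checksum field is
left as sent, so a valid segment should fold to 0xffff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Headers from the Netif.writev dump: Ethernet (14) + IPv4 (20) + TCP (20). */
static const uint8_t frame[54] = {
    0xf0, 0x1f, 0xaf, 0x6a, 0x9b, 0x95, 0xc0, 0xff,
    0xee, 0xc0, 0xff, 0xee, 0x08, 0x00, 0x45, 0x00,
    0x00, 0x2d, 0x52, 0x95, 0x00, 0x00, 0x26, 0x06,
    0xc0, 0xc8, 0xc0, 0xa8, 0x00, 0x12, 0xc0, 0xa8,
    0x00, 0x0b, 0x1f, 0x40, 0xb4, 0xca, 0x1a, 0xfe,
    0xb5, 0x69, 0x5e, 0x8c, 0xdd, 0xfe, 0x50, 0x18,
    0xff, 0xff, 0x29, 0x8a, 0x00, 0x00
};
static const uint8_t sent_payload[5] = {0x48, 0x65, 0x6c, 0x6c, 0x6f}; /* "Hello" */
static const uint8_t wire_payload[5] = {0x20, 0x00, 0x00, 0x00, 0x08}; /* as seen in Wireshark */

/* Add len bytes to a running sum as big-endian 16-bit words.
   An odd trailing byte is treated as the high byte of a zero-padded word. */
static uint32_t sum16(uint32_t acc, const uint8_t *p, size_t len) {
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        acc += (uint32_t)((p[i] << 8) | p[i + 1]);
    if (len & 1)
        acc += (uint32_t)(p[len - 1] << 8);
    return acc;
}

/* Fold the carries back into the low 16 bits (ones-complement addition). */
static uint16_t fold(uint32_t acc) {
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/* Folded sum over pseudo-header + TCP header + payload.
   Summing the parts separately is fine because every part except the
   payload has an even length. */
static uint16_t tcp_verify_sum(const uint8_t *ip_hdr,
                               const uint8_t *tcp_hdr, size_t tcp_hdr_len,
                               const uint8_t *payload, size_t payload_len) {
    size_t tcp_len = tcp_hdr_len + payload_len;
    /* zero byte, protocol 6 (TCP), TCP length */
    uint8_t pseudo[4] = {0, 6, (uint8_t)(tcp_len >> 8), (uint8_t)tcp_len};
    uint32_t acc = 0;
    acc = sum16(acc, ip_hdr + 12, 8);        /* source and destination address */
    acc = sum16(acc, pseudo, 4);
    acc = sum16(acc, tcp_hdr, tcp_hdr_len);  /* includes the transmitted checksum */
    acc = sum16(acc, payload, payload_len);
    return fold(acc);
}

/* Call this from main() to check both variants of the segment. */
static void check_dump(void) {
    const uint8_t *ip = frame + 14, *tcp = frame + 34;
    assert(tcp_verify_sum(ip, tcp, 20, sent_payload, 5) == 0xffff);
    assert(tcp_verify_sum(ip, tcp, 20, wire_payload, 5) != 0xffff);
}
```

By my arithmetic the segment with the "Hello" payload folds to 0xffff, so
the checksum in the header (29 8a) is right for the data handed to
Netif.writev, while the payload Wireshark shows folds to 0x042e. That would
explain Wireshark flagging the frame even though the checksum field itself
is unchanged.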

So I guess there's some problem sending the second page to the ring.
Suggestions from people who know this code would be great! Could just
be a missing barrier or something.
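
To make the barrier guess concrete, here is a generic single-producer,
single-consumer ring in C11. It is only a sketch of the pattern, not the
mirage-net-xen or Xen ring code; the struct, sizes and function names are
hypothetical. The point is the release store on the producer index: without
it, the consumer could see the new index before the slot contents.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8    /* hypothetical size, must be a power of two */
#define SLOT_BYTES 64   /* hypothetical slot capacity */

struct ring {
    uint8_t     data[RING_SLOTS][SLOT_BYTES];
    size_t      len[RING_SLOTS];
    atomic_uint prod;   /* next slot the producer will fill */
    atomic_uint cons;   /* next slot the consumer will read */
};

/* Producer: fill the slot first, then publish the new index. */
static int ring_push(struct ring *r, const uint8_t *buf, size_t n) {
    assert(n <= SLOT_BYTES);
    unsigned p = atomic_load_explicit(&r->prod, memory_order_relaxed);
    unsigned c = atomic_load_explicit(&r->cons, memory_order_acquire);
    if (p - c == RING_SLOTS)
        return -1;                          /* ring is full */
    memcpy(r->data[p % RING_SLOTS], buf, n);
    r->len[p % RING_SLOTS] = n;
    /* Release: the slot writes above become visible before the index. */
    atomic_store_explicit(&r->prod, p + 1, memory_order_release);
    return 0;
}

/* Consumer: observe the index with acquire, then read the slot. */
static int ring_pop(struct ring *r, uint8_t *out, size_t *n) {
    unsigned c = atomic_load_explicit(&r->cons, memory_order_relaxed);
    unsigned p = atomic_load_explicit(&r->prod, memory_order_acquire);
    if (p == c)
        return -1;                          /* ring is empty */
    *n = r->len[c % RING_SLOTS];
    memcpy(out, r->data[c % RING_SLOTS], *n);
    /* Release: the slot may be reused only after we have copied it out. */
    atomic_store_explicit(&r->cons, c + 1, memory_order_release);
    return 0;
}
```

If the index were published with a plain store instead, a weakly ordered
CPU such as ARM would be allowed to reorder it ahead of the memcpy, and the
other end could read stale slot contents, which is at least the kind of
symptom we are seeing.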


-- 
Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
