
Re: TCP wait_for transmit question



My mistake: yes, User_buffer.write should block if the buffer is full,
but it doesn't; it always succeeds.  I can't remember why, but I put
the check for buffer fullness in Flow.write (maybe there was something
similar there earlier and I just modified it).  Flow.write calls
Tcp.Pcb.write, which then calls User_buffer.write.  For now, if you use
Flow.write it should work as required.  But I agree that the check
should be moved to User_buffer.write, which should then block (or fail)
if the buffer is full.
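
To make the "or fail" option concrete, here is a rough sketch of a
bounded buffer whose write refuses data once it is full.  This is not
the real User_buffer code: the record type and the create, available,
write and drain functions are hypothetical stand-ins, and payloads are
plain strings rather than Bitstring views.

```ocaml
(* Hypothetical stand-in for User_buffer: a byte-counted FIFO with a fixed
   capacity. None of these names come from the real Mirage module. *)
type t = {
  capacity : int;            (* maximum number of bytes held at once *)
  mutable used : int;        (* bytes currently queued *)
  chunks : string Queue.t;   (* queued payloads, oldest first *)
}

let create capacity = { capacity; used = 0; chunks = Queue.create () }

(* Bytes that can still be accepted. *)
let available t = t.capacity - t.used

(* Refuse the write instead of growing without bound. A blocking variant
   would wait here until [available t] is large enough. *)
let write t data =
  let len = String.length data in
  if len > available t then Error `Full
  else begin
    Queue.add data t.chunks;
    t.used <- t.used + len;
    Ok ()
  end

(* Called when a payload is handed to the network; frees its space. *)
let drain t =
  match Queue.take_opt t.chunks with
  | None -> None
  | Some data ->
    t.used <- t.used - String.length data;
    Some data
```

With the check in the buffer itself, a caller that loops on write
without going through Flow.write gets an error back instead of
exhausting memory.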

Here is the current write from flow.ml.  The check for whether the
buffer has room is Tcp.Pcb.write_available:

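  (* [>>] is the Lwt syntax extension's sequencing operator: run the
     left-hand thread, discard its result, then run the right-hand one. *)
  let rec write t view =
    (* length of the view in bytes *)
    let vlen = Bitstring.bitstring_length view / 8 in
    match Tcp.Pcb.write_available t with
    | 0 -> (* block for window to open *)
      Tcp.Pcb.write_wait_for t 1 >>
      write t view
    | len when len < vlen -> (* do a short write *)
      let v' = Bitstring.subbitstring view 0 (len * 8) in
      Tcp.Pcb.write t v' >>
      (* retry with the part that did not fit *)
      write t (Bitstring.subbitstring view (len * 8) ((vlen - len) * 8))
    | len -> (* full write *)
      Tcp.Pcb.write t view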
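
The three cases can also be looked at in isolation.  The following is
only a sketch on plain strings, without Lwt or Bitstring; the step
type and the classify and simulate functions are made up for
illustration and are not part of flow.ml.

```ocaml
(* Outcome of one pass through the three cases of Flow.write, modelled on
   plain strings instead of Bitstring views. All names are illustrative. *)
type step =
  | Wait                       (* no room: block until the window opens *)
  | Short of string * string   (* (sent now, remainder to retry) *)
  | Full of string             (* everything fits *)

(* [available] plays the role of Tcp.Pcb.write_available. A non-positive
   value is treated as "no room" (an assumption of this sketch). *)
let classify ~available data =
  let vlen = String.length data in
  if available <= 0 then Wait
  else if available < vlen then
    Short (String.sub data 0 available,
           String.sub data available (vlen - available))
  else Full data

(* Drive [classify] against a list of successive window sizes and return
   the chunks in the order they would be sent. If the list runs out before
   the data does, the unsent remainder is simply not reported. *)
let rec simulate windows data =
  match windows with
  | [] -> []
  | available :: rest ->
    match classify ~available data with
    | Wait -> simulate rest data
    | Short (now, later) -> now :: simulate rest later
    | Full all -> [all]
```

For example, simulate [0; 3; 0; 10] "abcdefgh" waits once, sends "abc",
waits again, and then sends "defgh" in a single full write.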



On Thu, Jul 12, 2012 at 1:08 AM, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
> On 12 Jul 2012, at 01:03, Haris Rotsos wrote:
>>>
>>> Haris, was your test case just calling Tcp.Pcb.write continuously and
>>> finding that it ran out of memory?
>>
>> yes. And by the way, because the code is written over the ns3
>> simulation platform, I think a call that pushes packets to a
>> network interface will never block. The simulation hasn't got this
>> requirement fixed yet.
>>
>> I guess that in the case of Xen or Unix this will be handled with
>> more care, as the write may block and naturally create a context
>> switch in the thread scheduler.
>
> Definitely; the Xen Netif has a fixed set of ring slots, and the Unix backend
> just uses (slow) blocking tuntap I/O.  Both apply backpressure as a result.
>
>>
>>>
>>> In that case, it may be a regression that is the same problem as the ARP
>>> race condition (the OS.Netif.write is now too asynchronous).
>>
>> why would that affect the arp functionality?
>
> Only because our ARP support is super-minimal, and doesn't have a retransmit
> timer or anything.  So you get one ARP query transmitted, and if it is lost
> for any reason, we never retransmit.  I saw a few cases where we got stuck
> as a result.  It's a quick job to add a retransmission timer to make it more
> RFC-compliant though.
>
> -anil



 

