
Re: TCP wait_for transmit question



That sounds right to me. I think we need to block the write call in the
TCP module if the output buffer is full, and not in the Channel module
(which can and will have other transport mechanisms such as vchan).
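
To make the intended behaviour concrete, here is a rough, self-contained
model of a bounded write buffer following the logic Balraj describes
below. It is only a sketch: the record fields, max_bytes, on_ack and the
callback-based "blocking" are all hypothetical stand-ins and not the
actual Tcp.Pcb or User_buffer code, and it uses plain stdlib queues
instead of Lwt so that it runs on its own.

```ocaml
(* Hypothetical model of a bounded TCP write buffer; not the Mirage API. *)
type t = {
  max_bytes : int;                   (* assumed cap on buffered bytes *)
  mutable bufbytes : int;            (* bytes currently buffered *)
  buffer : string Queue.t;           (* buffered packets, oldest first *)
  waiters : (unit -> unit) Queue.t;  (* writers parked until space frees up *)
  mutable window : int;              (* free bytes in the transmit window *)
  mutable sent : string list;        (* packets put on the wire, newest first *)
}

let create ~max_bytes ~window =
  { max_bytes; bufbytes = 0; buffer = Queue.create ();
    waiters = Queue.create (); window; sent = [] }

(* Put a packet on the wire, consuming transmit window. *)
let send t pkt =
  t.window <- t.window - String.length pkt;
  t.sent <- pkt :: t.sent

(* Append a packet to the write buffer. *)
let enqueue t pkt =
  Queue.add pkt t.buffer;
  t.bufbytes <- t.bufbytes + String.length pkt

(* [write t pkt k] calls [k ()] once [pkt] has been sent or buffered.
   A writer that finds the buffer full is parked and retried on ack. *)
let rec write t pkt k =
  let len = String.length pkt in
  if not (Queue.is_empty t.buffer) then begin
    if t.bufbytes + len > t.max_bytes then
      Queue.add (fun () -> write t pkt k) t.waiters
    else begin enqueue t pkt; k () end
  end
  else if len <= t.window then begin send t pkt; k () end
  else begin enqueue t pkt; k () end

(* An ack opens [acked] bytes of window: drain the buffer as far as the
   window allows, then let every parked writer retry its write. *)
let on_ack t acked =
  t.window <- t.window + acked;
  let rec drain () =
    match Queue.peek_opt t.buffer with
    | Some pkt when String.length pkt <= t.window ->
        ignore (Queue.pop t.buffer);
        t.bufbytes <- t.bufbytes - String.length pkt;
        send t pkt;
        drain ()
    | _ -> ()
  in
  drain ();
  let woken = Queue.create () in
  Queue.transfer t.waiters woken;
  Queue.iter (fun retry -> retry ()) woken
```

In the real stack the parked callback would presumably be an Lwt thread
waiting on something like an Lwt_condition that the ack path signals,
with the check living in the TCP module so that other transports behind
Channel are unaffected.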

-anil

On 12 Jul 2012, at 00:58, Haris Rotsos wrote:

> Looking at the code there is the following logic:
> 
> an application will call Channel.write_buffer or write_string to write
> data to the socket:
> 
> channel.ml:180
> let write_buffer t buf =
>  queue_obuf t;
>  t.obufq <- buf :: t.obufq
> 
> (* Queue the active write buffer onto the write queue, resizing the
> * view if necessary to the correct size. *)
> let queue_obuf t =
>  match t.obuf with
>  |None -> ()
>  |Some buf when Cstruct.len buf = t.opos -> (* obuf is full *)
>    t.obufq <- buf :: t.obufq;
>    t.obuf <- None
>  |Some buf when t.opos = 0 -> (* obuf wasnt ever used, so discard *)
>    t.obuf <- None
>  |Some buf -> (* partially filled obuf, so resize *)
>    let buf = Cstruct.sub buf 0 t.opos in
>    t.obufq <- buf :: t.obufq;
>    t.obuf <- None
> 
> Then, in order to push the data out to the network, the flush
> method should be called:
> 
> let rec flush t =
>  queue_obuf t;
>  let l = List.rev t.obufq in
>  t.obufq <- [];
>  Flow.writev t.flow l
> 
> The blocking call in flush is Flow.writev, which does the following:
> 
> let writev t views =
>   Tcp.Pcb.writev t views
> 
> let writev pcb data = User_buffer.Tx.write pcb.utx data
> 
> module Tx :
> let write t datav =
>  let l = lenv datav in
>  let mss = Int32.of_int (Window.tx_mss t.wnd) in
>  match Lwt_sequence.is_empty t.buffer &&
>    (l = mss || not (Window.tx_inflight t.wnd)) with
>  | false ->
>      t.bufbytes <- Int32.add t.bufbytes l;
>      List.iter (fun data -> ignore(Lwt_sequence.add_r data t.buffer)) datav;
>      if t.bufbytes < mss then
>        return ()
>      else
>        clear_buffer t
>  | true ->
>      let avail_len = available_cwnd t in
>      match avail_len < l with
>      | true ->
>          t.bufbytes <- Int32.add t.bufbytes l;
>          List.iter (fun data -> ignore(Lwt_sequence.add_r data t.buffer)) datav;
>          return ()
>      | false ->
>          let max_size = Window.tx_mss t.wnd in
>          transmit_segments ~mss:max_size ~txq:t.txq datav
> 
> So most of the time the write will end up in the false case of the
> first match. In that branch, if I constantly write 1460-byte
> segments, then the call will never block, right?
> 
> Maybe an additional blocking check on the buffer size should be added
> in order to cover this case?
> 
> 
> On 11 July 2012 14:06, Balraj Singh <balraj.singh@xxxxxxxxxxxx> wrote:
>> This is what the write buffer is supposed to do and it definitely
>> improves overall performance.  It allows the app and tcp to not be in
>> lock step.  The app can dump a chunk of data to be sent in the write
>> buffer, when tcp gets acks that open up the window it has data ready
>> to be sent.  This buffer is also needed to implement Nagle's algorithm.
>> 
>> This is what was implemented:
>> 
>> When the app does a write:
>> 
>>    if (write buffer is not empty) {
>>        if (adding pkt to buffer will make the total bytes buffered
>>            more than a configured max value) {
>>          block till buffer becomes available, then attempt a write again
>>        } else {
>>           add pkt to buffer and return
>>        }
>>    } else {
>>        if (window is available) {
>>            send pkt and return
>>        } else {
>>            add pkt to write buffer and return
>>        }
>>    }
>> 
>> 
>> and when an ack clears some window:
>> 
>>   if (there are pkts in write buffer) {
>>       remove and send pkts from buffer up to available window;
>>       wake up any threads waiting for buffer to become available
>>    }
>> 
>> 
>> 
>> On Wed, Jul 11, 2012 at 7:40 PM, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
>>> Haris just raised an interesting point on IRC.  He wants to limit the 
>>> amount of data sent on TCP write, which is normally done via 
>>> TCP.Pcb.write_wait_for <bytes>.
>>> 
>>> This function calls the User_buffer.TX, which clamps it to the TX transmit 
>>> window, which we ideally want to keep full at all times.
>>> 
>>> Do we need another function to wait for space to be free in the 
>>> TX.User_buffer (if it's greater than max_size), which will more closely 
>>> match the UNIX write(2) semantics?  That is, write works until the transmit 
>>> buffer is full, providing the TCP stack with more leeway to schedule 
>>> on-the-wire packets.
>>> 
>>> -anil
>> 
> 
> 
> 
> -- 
> Charalampos Rotsos
> PhD student
> The University of Cambridge
> Computer Laboratory
> William Gates Building
> JJ Thomson Avenue
> Cambridge
> CB3 0FD
> 
> Phone: +44-(0) 1223 767032
> Email: cr409@xxxxxxxxxxxx
> 




 

