
Re: TCP wait_for transmit question

Looking at the code, I see the following logic.

An application calls Channel.write_buffer or write_string to write
data to the socket:

let write_buffer t buf =
  queue_obuf t;
  t.obufq <- buf :: t.obufq

(* Queue the active write buffer onto the write queue, resizing the
 * view if necessary to the correct size. *)
let queue_obuf t =
  match t.obuf with
  |None -> ()
  |Some buf when Cstruct.len buf = t.opos -> (* obuf is full *)
    t.obufq <- buf :: t.obufq;
    t.obuf <- None
  |Some buf when t.opos = 0 -> (* obuf wasnt ever used, so discard *)
    t.obuf <- None
  |Some buf -> (* partially filled obuf, so resize *)
    let buf = Cstruct.sub buf 0 t.opos in
    t.obufq <- buf :: t.obufq;
    t.obuf <- None

Then, to push the data out to the network, flush should be
called:

let rec flush t =
  queue_obuf t;
  let l = List.rev t.obufq in
  t.obufq <- [];
  Flow.writev t.flow l
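
To make the queueing order concrete, here is a minimal stand-alone model of
the three functions above. It is only a sketch: Bytes stands in for Cstruct,
flush returns the list it would pass to Flow.writev instead of doing any I/O,
and the record type and the reset of opos are my assumptions, not the
Channel code.

```ocaml
(* Toy model of the Channel output path; not the real Channel module. *)
type chan = {
  mutable obuf : Bytes.t option;  (* active, possibly partial buffer *)
  mutable opos : int;             (* bytes used in obuf *)
  mutable obufq : Bytes.t list;   (* queued buffers, newest first *)
}

let queue_obuf t =
  (match t.obuf with
   | None -> ()
   | Some buf when Bytes.length buf = t.opos ->  (* full: queue as is *)
     t.obufq <- buf :: t.obufq
   | Some _ when t.opos = 0 -> ()                (* never used: discard *)
   | Some buf ->                                 (* partial: trim to opos *)
     t.obufq <- Bytes.sub buf 0 t.opos :: t.obufq);
  t.obuf <- None;
  t.opos <- 0  (* assumption: the model resets the position here *)

let write_buffer t buf =
  queue_obuf t;
  t.obufq <- buf :: t.obufq

(* Returns the buffers in the order they were written. *)
let flush t =
  queue_obuf t;
  let l = List.rev t.obufq in
  t.obufq <- [];
  l

let () =
  let t = { obuf = Some (Bytes.of_string "abcd...."); opos = 4; obufq = [] } in
  write_buffer t (Bytes.of_string "EFGH");
  let out = List.map Bytes.to_string (flush t) in
  assert (out = ["abcd"; "EFGH"]);
  assert (flush t = [])
```

The partially filled buffer is trimmed and queued ahead of the new one, and
the List.rev in flush restores write order because obufq is built newest
first.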

The blocking call in this path is Flow.writev, which does the following:

(* Flow *)
let writev t views =
  Tcp.Pcb.writev t views

(* Tcp.Pcb *)
let writev pcb data = User_buffer.Tx.write pcb.utx data

(* User_buffer.Tx *)
let write t datav =
  let l = lenv datav in
  let mss = Int32.of_int (Window.tx_mss t.wnd) in
  match Lwt_sequence.is_empty t.buffer &&
    (l = mss || not (Window.tx_inflight t.wnd)) with
  | false ->
      t.bufbytes <- Int32.add t.bufbytes l;
      List.iter (fun data -> ignore(Lwt_sequence.add_r data t.buffer)) datav;
      if t.bufbytes < mss then
        return ()
      else
        clear_buffer t
  | true ->
      let avail_len = available_cwnd t in
      match avail_len < l with
      | true ->
          t.bufbytes <- Int32.add t.bufbytes l;
          List.iter (fun data -> ignore(Lwt_sequence.add_r data t.buffer)) datav;
          return ()
      | false ->
          let max_size = Window.tx_mss t.wnd in
          transmit_segments ~mss:max_size ~txq:t.txq datav

So most of the time the write will end up in the false case of the
first match. In this code path, if I constantly write 1460-byte
segments, the call will never block, right?

Maybe an additional blocking check on the buffer size should be added
in order to cover this case?
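
Roughly what I have in mind, as a toy model rather than a patch. It follows
the pseudocode quoted below, including the "block till buffer becomes
available" step that I cannot find in Tx.write. Everything here is
hypothetical: the record, max_buf (the "configured max value"), and the
Must_wait result, which stands in for actually blocking the Lwt thread.

```ocaml
(* Toy model of a bounded write buffer; not the User_buffer.Tx code. *)
type tx = {
  mutable bufbytes : int;  (* bytes waiting in the write buffer *)
  mutable window : int;    (* bytes the send window allows right now *)
  max_buf : int;           (* assumed cap on buffered bytes *)
}

type outcome =
  | Sent       (* handed straight to the wire *)
  | Buffered   (* queued; the caller returns at once *)
  | Must_wait  (* caller should block until an ack frees space *)

let write t len =
  if t.bufbytes > 0 then begin
    if t.bufbytes + len > t.max_buf then Must_wait
    else (t.bufbytes <- t.bufbytes + len; Buffered)
  end
  else if t.window >= len then (t.window <- t.window - len; Sent)
  else (t.bufbytes <- t.bufbytes + len; Buffered)

(* An ack opens [freed] bytes of window; drain what fits from the buffer.
   A real version would also wake the writers that got Must_wait. *)
let on_ack t freed =
  t.window <- t.window + freed;
  let n = min t.window t.bufbytes in
  t.bufbytes <- t.bufbytes - n;
  t.window <- t.window - n

let () =
  let t = { bufbytes = 0; window = 1460; max_buf = 2920 } in
  assert (write t 1460 = Sent);       (* window open, buffer empty *)
  assert (write t 1460 = Buffered);   (* window closed, buffer empty *)
  assert (write t 1460 = Buffered);   (* buffer now at the cap *)
  assert (write t 1460 = Must_wait);  (* the case I think is missing *)
  on_ack t 1460;
  assert (write t 1460 = Buffered)    (* space freed, write proceeds *)
```

With the cap in place, a sender that keeps writing 1460-byte segments is
held back once max_buf is reached, instead of growing the buffer without
bound.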

On 11 July 2012 14:06, Balraj Singh <balraj.singh@xxxxxxxxxxxx> wrote:
> This is what the write buffer is supposed to do and it definitely
> improves overall performance.  It allows the app and tcp to not be in
> lock step.  The app can dump a chunk of data to be sent in the write
> buffer, when tcp gets acks that open up the window it has data ready
> to be sent.  This buffer is also needed to implement Nagle's algorithm.
> This is what was implemented:
> When the app does a write:
>     if (write buffer is not empty) {
>         if (adding pkt to buffer will make the total bytes buffered
> more than a configured max value) {
>           block till buffer becomes available, then attempt a write again
>         } else {
>            add pkt to buffer and return
>         }
>     } else {
>         if (window is available) {
>             send pkt and return
>         } else {
>             add pkt to write buffer and return
>         }
>     }
> and when an ack clears some window:
>    if (there are pkts in write buffer) {
>        remove and send pkts from buffer up to available window;
>        wake up any threads waiting for buffer to become available
>     }
> On Wed, Jul 11, 2012 at 7:40 PM, Anil Madhavapeddy <anil@xxxxxxxxxx> wrote:
>> Haris just raised an interesting point on IRC.  He wants to limit the amount 
>> of data sent on TCP write, which is normally done via TCP.Pcb.write_wait_for 
>> <bytes>.
>> This function calls the User_buffer.TX, which clamps it to the TX transmit 
>> window, which we ideally want to keep full at all times.
>> Do we need another function to wait for space to be free in the 
>> TX.User_buffer (if it's greater than max_size), which will more closely 
>> match the UNIX write(2) semantics?  That is, write works until the transmit 
>> buffer is full, providing the TCP stack with more leeway to schedule 
>> on-the-wire packets.
>> -anil

Charalampos Rotsos
PhD student
The University of Cambridge
Computer Laboratory
William Gates Building
JJ Thomson Avenue

Phone: +44-(0) 1223 767032
Email: cr409@xxxxxxxxxxxx


