RE: [Xen-users] AoE (Was: iscsi vs nfs for xen VMs)

  • To: Simon Hobson <linux@xxxxxxxxxxxxxxxx>, <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: Jeff Sturm <jeff.sturm@xxxxxxxxxx>
  • Date: Thu, 27 Jan 2011 18:52:36 -0500
  • Cc:
  • Delivery-date: Thu, 27 Jan 2011 15:53:38 -0800
  • List-id: Xen user discussion <xen-users.lists.xensource.com>
  • Thread-index: Acu+O/c+dWGuTCElT7+OEMCVh5x8zQADpZqg
  • Thread-topic: [Xen-users] AoE (Was: iscsi vs nfs for xen VMs)

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-
> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Simon Hobson
> Subject: Re: [Xen-users] AoE (Was: iscsi vs nfs for xen VMs)
> Getting somewhat off-topic, but I'm interested to know how AoE handles
> errors ? I assume there is some handshake to make sure packets were
> received rather than just "fire and forget" !

The Linux aoe open-source driver from Coraid (with which I am the most
familiar) implements a congestion avoidance and control algorithm,
similar to TCP/IP.  If a response exceeds twice the average round-trip
time plus 8 times the average deviation, the request is retransmitted
(based on the aoe6-75 sources; earlier sources may differ).
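
To make the retransmit rule above concrete, here is a rough sketch in
Python rather than the driver's C.  Only the threshold (twice the average
RTT plus 8 times the average deviation) comes from my reading of the
sources.  The class name, the method names, and the smoothing gains of
1/8 and 1/4 (borrowed from TCP's classic estimator) are assumptions for
illustration and are not taken from the aoe driver.

```python
from dataclasses import dataclass


@dataclass
class RttEstimator:
    """Toy round-trip estimator; names and gains are assumptions."""
    avg: float                # smoothed round-trip time, in ms
    dev: float = 0.0          # smoothed mean deviation, in ms
    avg_gain: float = 0.125   # assumed gain, borrowed from TCP (1/8)
    dev_gain: float = 0.25    # assumed gain, borrowed from TCP (1/4)

    def sample(self, rtt_ms: float) -> None:
        """Fold one measured round trip into the running averages."""
        err = rtt_ms - self.avg
        self.avg += self.avg_gain * err
        self.dev += self.dev_gain * (abs(err) - self.dev)

    def timeout(self) -> float:
        """Retransmit threshold as described above: 2*avg + 8*dev."""
        return 2.0 * self.avg + 8.0 * self.dev

    def should_retransmit(self, waited_ms: float) -> bool:
        """True once a request has waited longer than the threshold."""
        return waited_ms > self.timeout()
```

Because each sample covers disk time as well as network time, one slow
response widens the deviation term and pushes the threshold up quickly.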

What's interesting about aoe vs. TCP is that a round-trip measures both
network and disk latency, not just network latency.  A normal read
request will send a request packet, after which the target performs a
disk read and returns a response packet with the disk sector contents.
A normal write request will send a request with the sector contents,
upon which the target performs a disk write and returns a status packet.
Disk latency is orders of magnitude greater than network latency, and
more variable.  We typically see an RTT of 5-10ms under light usage.

Under heavy disk I/O, this time can climb, possibly to tenths of a
second, leading to apparent packet loss and an RTT adjustment by the
driver.  So it's not uncommon for a target to receive and process a
duplicate request, which is okay because each request is idempotent.
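
A toy model may help show why a duplicate is harmless.  This is not the
AoE wire protocol, just an in-memory stand-in: ToyTarget, SECTOR_SIZE and
the method names are made up for illustration.  The point is that each
request names an absolute sector, so replaying it leaves the target in
the same state.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


class ToyTarget:
    """In-memory stand-in for a target; hypothetical, not real AoE."""

    def __init__(self):
        self.sectors = {}  # maps sector number (LBA) to sector bytes

    def write(self, lba, data):
        # The request carries an absolute sector number and the full
        # sector contents, so a replay stores the same bytes again.
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector")
        self.sectors[lba] = data
        return "ok"

    def read(self, lba):
        # Reads never change state; unwritten sectors read as zeros
        # in this toy model.
        return self.sectors.get(lba, bytes(SECTOR_SIZE))


target = ToyTarget()
payload = b"\xab" * SECTOR_SIZE
target.write(7, payload)
before = dict(target.sectors)
target.write(7, payload)  # retransmitted duplicate of the same request
unchanged = target.sectors == before
```

Contrast this with a request like "append this data", which would not be
safe to process twice.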

Lossage of 0.1% to 0.2% is common in our environment, but this does not
have a significant impact overall on aoe performance.

That said, the aoe protocol also supports an asynchronous write
operation, which I suppose really is "fire and forget", unlike normal
reads and writes.  I haven't used an aoe driver that implements
asynchronous writes however, and I'm not sure I would if I had the
option since you have no guarantee that the writes succeed.

