
[Xen-devel] Re: [PATCH RFC 3/3] Virtio draft III: example block driver



Rusty Russell wrote:
> On Mon, 2007-06-18 at 12:09 +0300, Avi Kivity wrote:
>> Rusty Russell wrote:
>>> On Sun, 2007-06-17 at 17:25 +0300, Avi Kivity wrote:
>>>> Rusty Russell wrote:
>>>>> +       /* Set up for reply. */
>>>>> +       vblk->sg[0].page = virt_to_page(&vbr->in_hdr);
>>>>> +       vblk->sg[0].offset = offset_in_page(&vbr->in_hdr);
>>>>> +       vblk->sg[0].length = sizeof(vbr->in_hdr);
>>>>> +       num = blk_rq_map_sg(q, vbr->req, vblk->sg+1);
>>>>> +       vbr->out_hdr.id = vblk->vdev->ops->add_inbuf(vblk->vdev, vblk->sg,
>>>>> +                                                    1+num, vbr);
>>>>> +       if (IS_ERR_VALUE(vbr->out_hdr.id))
>>>>> +               goto full;
>>>>> +
>>>>> +       vblk->sg[0].page = virt_to_page(&vbr->out_hdr);
>>>>> +       vblk->sg[0].offset = offset_in_page(&vbr->out_hdr);
>>>>> +       vblk->sg[0].length = sizeof(vbr->out_hdr);
>>>>> +
>>>>> +       vbr->out_id = vblk->vdev->ops->add_outbuf(vblk->vdev, vblk->sg, 1,
>>>>> +                                                 vbr);
>>>> This strikes me as wasteful. Why not set up a single descriptor
>>>> which contains both placement and the data itself?
>>> We could actually do this for write, but not for read (where the
>>> length & sector need to be sent to the other end, and the data comes
>>> from the other end).
>> So you map the first sg entry for output, and the rest for input.
>>
>> Less pretty, but more efficient.

> At the moment there are two kinds of sgs: input and output.  You're
> proposing instead that we have a single kind, and a flag on each segment
> which indicates direction.  That may make block a little more efficient,
> depending on the particular implementation of virtio.
>
> Yet network really wants in and out separate.  I'm not even sure how
> we'd implement a mixed-sg scheme efficiently for that case.  At the
> moment it's very straightforward.
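
For concreteness, the mixed-sg scheme would look something like this
(a sketch only; the per-segment flag, the enum, and the names are
illustrative, not from the draft):

+       /* One kind of buffer; each segment carries its own direction.
+        * virtio_dir and the dir[] argument are illustrative only. */
+       enum virtio_dir { VIRTIO_OUT, VIRTIO_IN };
+
+       unsigned long (*add_buf)(struct virtio_device *vdev,
+                                const struct scatterlist sg[],
+                                const enum virtio_dir dir[],
+                                int num, void *data);

A block read would then queue the out_hdr segment as VIRTIO_OUT and the
data plus in_hdr segments as VIRTIO_IN in a single call, instead of the
separate add_inbuf/add_outbuf pair above.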


I think the differences are deeper than that.  Network wants two queues,
disk wants one queue.  The directionality of the DMA transfer is
orthogonal to that.

For a disk, the initiator is always the guest; you issue a request and
expect a completion within a few milliseconds.  Read and write requests
are similar in nature; you want them in a single queue so that the
elevator runs on both read and write requests.  When idling, the queue
is empty.

For a network interface, the guest initiates transmits, but the host
initiates receives.  Yes, the guest preloads the queue with rx
descriptors, but it can't expect a response within any given time, hence
the need for two queues.  When idling, the tx queue is empty and the rx
queue is full.

So I'd suggest having a struct virtio_queue to represent a queue (so a
network interface would have two of them), with the DMA direction
specified separately:

+       unsigned long (*add_buf)(struct virtio_queue *q,
+                                const struct scatterlist out_sg[], int out_nr,
+                                const struct scatterlist in_sg[], int in_nr,
+                                void *data);
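
As a usage sketch (the vq handles, MAX_SG, the header layouts, and
calling add_buf directly rather than through an ops table are
placeholders, not part of the draft):

	struct scatterlist out_sg[1], in_sg[1+MAX_SG];
	int in_nr;

	/* Block read: the request header goes out; the data and the
	 * reply header come back in, all on the device's one queue. */
	out_sg[0].page   = virt_to_page(&vbr->out_hdr);
	out_sg[0].offset = offset_in_page(&vbr->out_hdr);
	out_sg[0].length = sizeof(vbr->out_hdr);

	in_nr = blk_rq_map_sg(q, vbr->req, in_sg);
	in_sg[in_nr].page   = virt_to_page(&vbr->in_hdr);
	in_sg[in_nr].offset = offset_in_page(&vbr->in_hdr);
	in_sg[in_nr].length = sizeof(vbr->in_hdr);

	id = add_buf(vblk->vq, out_sg, 1, in_sg, in_nr + 1, vbr);

	/* Net: two queues.  Transmit on one, preload receive buffers
	 * on the other; when idle the tx queue is empty and the rx
	 * queue is full. */
	add_buf(vnet->tx_vq, tx_sg, tx_nr, NULL, 0, skb);
	add_buf(vnet->rx_vq, NULL, 0, rx_sg, rx_nr, buf);

The block elevator still sees a single queue, and a write would simply
move the data segments to the out_sg side.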


I think this fits better with the Xen model (which has one ring for block, two each for network and console, if memory serves).

> Of course, I'm carrying two implementations and two drivers, so it's
> natural for me to try to resist changes 8)

Life is rough on the frontier.  But all the glory will be yours!

--
error compiling committee.c: too many arguments to function

