[Xen-devel] Block ring protocol (segment expansion, multi-page, etc).
Please correct me if I got something wrong.

About two or three years ago Citrix (and Red Hat, I think?) posted a multi-page ring extension protocol (max-ring-page-order, max-ring-pages, ring-page-order, and ring-pages), which never went upstream (it only needed to be rebased on the driver that went into the kernel, I think). Then about a year ago SpectraLogic started enhancing the FreeBSD variant of blkback and realized what Ronghui also did: just doing a multi-page extension is not enough. The issue was that if one simply expanded to a ring composed of two pages, 1/4 of the page was wasted because the number of segments per request is constrained to 11.

Justin (SpectraLogic) came up with a protocol enhancement in which the existing blkif request structure stays the same, but BLKIF_MAX_SEGMENTS_PER_REQUEST is negotiated via max-request-segments. There is also max-request-size, which rolls together the segment count and the size of the ring to give you an idea of the biggest I/O you can fit on the ring in a single transaction. This solves the wastage problem and expands the ring.

Ronghui did something similar, but instead of re-using the existing blkif structure he split it in two: one ring is for blkif_request_header (the request with the segments ripped out), and the other is just for blkif_request_segments. That also solves the wastage and allows the ring to be expanded.

The three major outstanding issues with the current protocol that I know of are:
 - We split up I/O requests. This ends up eating a lot of CPU cycles.
 - We might have huge I/O requests. Justin mentioned 1MB single I/Os - to fit that on a ring in a single request it has to be able to carry 256 segments. Jan mentioned 256kB for SCSI, since the protocol extensions here could very well be carried over.
 - Concurrent usage. If we have more than 4 VBDs, blkback suffers when it tries to get a page, as there is a "global" pool shared across all guests instead of something 'per guest' or 'per VBD'.

So..
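The ring-sizing and maximum-I/O arithmetic above can be sketched as follows. This is illustrative only: the struct sizes are my assumptions approximating Xen's public blkif headers (a 112-byte request carrying 11 inline segments, a 64-byte shared-ring bookkeeping header), not copied from the real code, and the helper names are made up for this sketch.

```python
PAGE_SIZE = 4096
REQ_SIZE = 112        # assumed sizeof(blkif_request_t) with 11 segments inline
SRING_HDR = 64        # assumed producer/consumer bookkeeping at ring start

def round_down_pow2(n):
    """The shared-ring macros round the entry count down to a power of two."""
    p = 1
    while p * 2 <= n:
        p *= 2
    return p

def ring_entries(pages):
    """Request slots available in a ring spanning `pages` shared pages."""
    return round_down_pow2((pages * PAGE_SIZE - SRING_HDR) // REQ_SIZE)

def wasted_bytes(pages):
    """Bytes of the shared pages holding neither bookkeeping nor slots."""
    return pages * PAGE_SIZE - SRING_HDR - ring_entries(pages) * REQ_SIZE

def max_io_bytes(max_request_segments):
    """Each segment maps one 4 KiB granted page, bounding a single I/O."""
    return max_request_segments * PAGE_SIZE
```

With these assumed sizes, a one-page ring holds 32 requests and a two-page ring 64, so doubling the pages still strands bytes to the power-of-two rounding; max_io_bytes(256) is 1 MiB, matching Justin's figure, while the baseline 11 segments cap a single request at 44 KiB.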
Ronghui - I am curious why you chose the path of making two separate rings. Was the mechanism that Justin came up with not good enough, or was this just easier to implement? Thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel