
Re: [Xen-devel] [RFC PATCH 00/10] Multi-queue support for xen-block driver

On 02/19/2015 02:22 AM, Felipe Franciosi wrote:
>> -----Original Message-----
>> From: Bob Liu [mailto:bob.liu@xxxxxxxxxx]
>> Sent: 15 February 2015 08:19
>> To: xen-devel@xxxxxxxxxxxxx
>> Cc: David Vrabel; linux-kernel@xxxxxxxxxxxxxxx; Roger Pau Monne;
>> konrad.wilk@xxxxxxxxxx; Felipe Franciosi; axboe@xxxxxx; hch@xxxxxxxxxxxxx;
>> avanzini.arianna@xxxxxxxxx; Bob Liu
>> Subject: [RFC PATCH 00/10] Multi-queue support for xen-block driver
>> This patchset converts the Xen PV block driver to the multi-queue block layer
>> API by sharing and using multiple I/O rings between the frontend and backend.
>> History:
>> It's based on the result of Arianna's internship for GNOME's Outreach Program
>> for Women, in which she was mentored by Konrad Rzeszutek Wilk. I also
>> worked on this patchset with her at that time, and have now fully taken over
>> this task.
>> I've got her authorization to "change authorship or SoB to the patches as you
>> like."
>> A few words on block multi-queue layer:
>> The multi-queue block layer greatly improves block scalability by splitting
>> the single request queue into per-processor software queues and hardware
>> dispatch queues. The Linux blk-mq API handles the software queues, while each
>> block driver must deal with the hardware queues.
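
The split between per-processor software queues and hardware dispatch queues can be pictured with a small model. This is only a sketch: the function name is hypothetical, and the round-robin rule (cpu modulo number of hardware queues) is an assumption made for illustration, not the mapping blk-mq actually computes.

```python
def map_cpus_to_hw_queues(nr_cpus, nr_hw_queues):
    """Assign each per-CPU software queue to one hardware queue.

    Hypothetical helper: uses a plain round-robin rule (an assumption),
    so CPU n feeds hardware queue n % nr_hw_queues.
    """
    if nr_cpus < 1 or nr_hw_queues < 1:
        raise ValueError("need at least one CPU and one hardware queue")
    # One list of CPU numbers per hardware queue.
    mapping = {queue: [] for queue in range(nr_hw_queues)}
    for cpu in range(nr_cpus):
        mapping[cpu % nr_hw_queues].append(cpu)
    return mapping
```

With 16 CPUs and 16 hardware queues, as in the test setup quoted further down, every CPU gets its own hardware queue in this model; with fewer hardware queues than CPUs, several software queues share one.
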
> IIUC, the main motivation for the blk-mq work was locking issues on a block
> device's request queue when accessed concurrently from different NUMA
> nodes. I believe we are not stressing enough the main benefit of taking
> such an approach on Xen.
> Many modern storage systems (e.g. NVMe devices) will respond much better 
> (especially when it comes to IOPS) to a high number of outstanding requests. 
> That can be achieved by having a single thread sustaining a high IO depth 
> _and/or_ several different threads issuing requests at the same time. The 
> former approach is often limited by CPU capacity; that is, we can suffer from 
> only being able to handle so many interrupts being delivered to the (v)CPU 
> that the single thread is running on (also simply observable by 'top' showing 
> the thread smoking at 100%). The latter approach is more flexible, given that 
> many threads can run over several different (v)CPUs. I have a lot of data 
> around this topic and am happy to share if people are interested.
> We can therefore use the multi-queue block layer in a guest to have more than 
> one request queue associated with block front. These can be mapped over 
> several rings to the backend, making it very easy for us to run multiple 
> threads on the backend for a single virtual disk. I believe this is why Bob 
> is seeing massive improvements when running 'fio' in a guest with an 
> increased number of jobs.

Yes, exactly. I will add this information to the commit log.


> In my opinion, this should be highlighted as the motivation behind the
> blk-mq adoption by Xen.
> Thanks,
> Felipe
>> The xen/block implementation:
>> 1) Convert to blk-mq api with only one hardware queue.
>> 2) Use more rings to act as multiple hardware queues.
>> 3) Negotiate the number of hardware queues, the same as the xen-net driver.
>> The backend advertises "multi-queue-max-queues" to the frontend, then the
>> frontend writes the final number back to "multi-queue-num-queues".
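
The negotiation in step 3 can be sketched as a toy model. The two key names come from the cover letter; everything else is an assumption for illustration: a plain dict stands in for xenstore, the function names are hypothetical, and falling back to a single queue when the key is missing is a guess at sensible behaviour rather than something the patches are known to do.

```python
MAX_KEY = "multi-queue-max-queues"  # written by the backend (name from the cover letter)
NUM_KEY = "multi-queue-num-queues"  # written by the frontend (name from the cover letter)


def backend_advertise(store, backend_max):
    """Backend publishes the largest number of queues it will serve."""
    store[MAX_KEY] = str(backend_max)


def frontend_negotiate(store, frontend_wanted):
    """Frontend picks min(wanted, advertised max) and writes it back.

    Assumption: a missing key means the backend has no multi-queue
    support, so the frontend falls back to one queue.
    """
    backend_max = int(store.get(MAX_KEY, "1"))
    chosen = max(1, min(frontend_wanted, backend_max))
    store[NUM_KEY] = str(chosen)
    return chosen


def backend_read_choice(store):
    """Backend reads the final count and rejects out-of-range values."""
    backend_max = int(store.get(MAX_KEY, "1"))
    chosen = int(store.get(NUM_KEY, "1"))
    if not 1 <= chosen <= backend_max:
        raise ValueError(f"frontend asked for {chosen} queues, limit is {backend_max}")
    return chosen
```

In this model a frontend that wants 16 queues against a backend advertising 8 ends up with 8, and both sides agree on that number before any rings are set up.
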
>> Test result:
>> fio's IOmeter emulation on a 16-CPU domU with a null_blk device; the number
>> of hardware queues was 16.
>> nr_fio_jobs   IOPS(before)   IOPS(after)    Diff
>>           1            57k           58k      0%
>>           4            95k          201k   +210%
>>           8            89k          372k   +410%
>>          16            68k          284k   +410%
>>          32            65k          196k   +300%
>>          64            63k          183k   +290%
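
When reading the Diff column, note that for every row except the first the posted figures sit closer to the after/before ratio than to the relative increase. For example, 95k to 201k is a ratio of about 212% but an increase of about 112%. The sketch below recomputes both from the rounded IOPS values in the table, so its results are approximate; the names are hypothetical.

```python
# (nr_fio_jobs, IOPS before, IOPS after) in thousands, copied from the table.
RESULTS = [
    (1, 57, 58),
    (4, 95, 201),
    (8, 89, 372),
    (16, 68, 284),
    (32, 65, 196),
    (64, 63, 183),
]


def ratio_percent(before, after):
    """After/before expressed as a percentage (100 means unchanged)."""
    return 100.0 * after / before


def increase_percent(before, after):
    """Relative increase over the baseline (0 means unchanged)."""
    return 100.0 * (after - before) / before


def summarize(results):
    """Return (jobs, rounded ratio %, rounded increase %) for each row."""
    return [
        (jobs, round(ratio_percent(b, a)), round(increase_percent(b, a)))
        for jobs, b, a in results
    ]
```

Calling `summarize(RESULTS)` gives both readings side by side, so the posted Diff values can be compared against either definition.
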
>> More results are coming; there was also a big improvement in both write IOPS
>> and latency.
>> Any comments or suggestions are welcome.
>> Thank you,
>> -Bob Liu
>> Bob Liu (10):
>>   xen/blkfront: convert to blk-mq API
>>   xen/blkfront: drop legacy block layer support
>>   xen/blkfront: reorg info->io_lock after using blk-mq API
>>   xen/blkfront: separate ring information to an new struct
>>   xen/blkback: separate ring information out of struct xen_blkif
>>   xen/blkfront: pseudo support for multi hardware queues
>>   xen/blkback: pseudo support for multi hardware queues
>>   xen/blkfront: negotiate hardware queue number with backend
>>   xen/blkback: get hardware queue number from blkfront
>>   xen/blkfront: use work queue to fast blkif interrupt return
>>  drivers/block/xen-blkback/blkback.c | 370 ++++++++-------
>>  drivers/block/xen-blkback/common.h  |  54 ++-
>>  drivers/block/xen-blkback/xenbus.c  | 415 +++++++++++------
>>  drivers/block/xen-blkfront.c        | 894 +++++++++++++++++++++---------------
>>  4 files changed, 1018 insertions(+), 715 deletions(-)
>> --
