
Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands



On 29/02/16 16:05, Konrad Rzeszutek Wilk wrote:
> On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
>> On 29/02/16 04:37, Bob Liu wrote:
>>> 1) What is this patch about?
>>> This patch introduces a new block operation (BLKIF_OP_EXTRA_FLAG).
>>> A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
>>> extra request which is used to pass through SCSI commands.
>>> This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
>>> It can be extended easily to transmit other per-request/bio data from
>>> frontend to backend, e.g. Data Integrity Field per bio.
>>>
>>> 2) Why do we need this?
>>> Currently only raw data segments are transmitted from blkfront to blkback,
>>> which means some advanced features are lost.
>>>  * Guest knows nothing about features of the real backend storage.
>>>     For example, on bare metal the SCSI INQUIRY command can be used
>>>     to query storage device information. If it's an SSD or flash device we
>>>     can have the option to use the device as a fast cache.
>>>     But this can't happen in current domU guests, because blkfront only
>>>     knows it's just a normal virtual disk.
>>>
>>>  * Failover Clusters in Windows
>>>     Failover clusters require SCSI-3 persistent reservation target disks,
>>>     but now this can't work in domU.
>>>
>>> 3) Known issues:
>>>  * Security issues, how to 'validate' this extra request payload.
>>>    E.g. SCSI operates on a per-LUN basis (the whole disk) while we really
>>>    just want to operate on partitions.
>>
>> It's not only validation: some operations just affect the whole LUN
>> (e.g. Reserve/Release). And what about "multi-LUN" commands like
>> "report LUNs"?
> 
> Don't expose them. Bob and I want to get an idea of what would be a good
> compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?)
> type of commands to go through the PV driver.
> 
> Would it be better if it was through XenBus? But that may not work for some
> that are tied closely to requests, such as DIF/DIX.
> 
> However the 'DISCARD' for example worked out - it is an umbrella for both
> SCSI UNMAP and ATA DISCARD operation and hides the complexity of the low level
> protocol. Could there be an 'INQ' ? Since the SCSI VPD is the most exhaustive
> in terms of details it may make sense to base it on that..?
> 
>>
>>>  * Can't pass SCSI commands through if the backend storage driver is
>>>    bio-based instead of request-based.
>>>
>>> 4) Alternative approach: Using PVSCSI instead:
>>>  * Doubt PVSCSI can support as many types of backend storage devices as
>>>    Xen-block.
>>
>> pvSCSI won't need to support all types of backends. It's enough to
>> support those where passing through SCSI commands makes sense.
>>
>> Seems to be a similar issue as the above mentioned problem with
>> bio-based backend storage drivers.
> 
> In particular the ones we care about are:
>  - Multipath over FibreChannel devices.
>  - Linear mapping (LVM) over the multipath.
>  - And then potentially a filesystem on top of that
>  - .. and a raw file on the filesystem.
> 
> Having SCSI VPD 0x83 page sent to frontend for that would be good.
> 
> Not sure about SCSI reservations. Perhaps those are more of .. unique
> in that the implementation would have to make sure that the guest
> owns the whole LUN. But that is implementation question.
> 
> This is about the design - how would you envision cramming in
> SCSI commands or DIF/DIX commands or ATA commands via the PV block layer?

Have some kind of abstraction which can be mapped to SCSI commands
easily, but don't stick to the SCSI definitions. Use your own
structures, commands etc. and build SCSI commands from those in the
backend. This way you avoid having to emulate SCSI. Instead of
naming it "SCSI passthrough" call it "special commands". :-)
I would add only those operations you really need. Add an inquiry
operation which returns the supported "special commands". The
availability of the inquiry can be reported via Xenstore.
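
As a rough sketch of this idea (all names below are hypothetical and are
not part of blkif.h or any existing Xen interface), the frontend would see
only abstract command codes, and the backend would translate them into
SCSI where the storage supports it. The INQUIRY CDB layout is the standard
6-byte one; mapping a "device id" command to VPD page 0x83 follows the
suggestion earlier in the thread, while the "media info" command and its
mapping to VPD page 0xB1 are assumptions added for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "special command" codes; deliberately not SCSI opcodes. */
enum blkif_special_cmd {
    BLKIF_SPECIAL_QUERY      = 0,  /* inquiry: which commands are supported */
    BLKIF_SPECIAL_DEVICE_ID  = 1,  /* unique identifier of the backing device */
    BLKIF_SPECIAL_MEDIA_INFO = 2,  /* e.g. rotational vs. solid state */
    BLKIF_SPECIAL_MAX        = 32  /* bitmap below is 32 bits wide */
};

/* Hypothetical request/response layouts, not an existing ring format. */
struct blkif_special_request {
    uint64_t id;         /* echoed back in the response */
    uint8_t  cmd;        /* one of enum blkif_special_cmd */
};

struct blkif_special_response {
    uint64_t id;
    int16_t  status;     /* 0 on success, -1 if the command is unsupported */
    uint32_t supported;  /* bitmap, only meaningful for BLKIF_SPECIAL_QUERY */
};

/* Bit representing one command in the "supported" bitmap. */
static uint32_t special_cmd_bit(unsigned int cmd)
{
    assert(cmd < BLKIF_SPECIAL_MAX);
    return (uint32_t)1 << cmd;
}

/* Backend only: build a 6-byte SCSI INQUIRY CDB requesting one VPD page. */
static void build_inquiry_vpd_cdb(uint8_t cdb[6], uint8_t page,
                                  uint16_t alloc_len)
{
    memset(cdb, 0, 6);
    cdb[0] = 0x12;                       /* INQUIRY opcode */
    cdb[1] = 0x01;                       /* EVPD bit: return a VPD page */
    cdb[2] = page;                       /* VPD page code */
    cdb[3] = (uint8_t)(alloc_len >> 8);  /* allocation length, MSB */
    cdb[4] = (uint8_t)(alloc_len & 0xff);/* allocation length, LSB */
}

/*
 * Backend only: translate a special command into a SCSI CDB.
 * Returns the CDB length, or -1 if the command has no SCSI equivalent
 * (BLKIF_SPECIAL_QUERY is answered by the backend itself).
 */
static int special_cmd_to_cdb(unsigned int cmd, uint8_t cdb[6],
                              uint16_t alloc_len)
{
    switch (cmd) {
    case BLKIF_SPECIAL_DEVICE_ID:
        build_inquiry_vpd_cdb(cdb, 0x83, alloc_len); /* Device Identification */
        return 6;
    case BLKIF_SPECIAL_MEDIA_INFO:
        build_inquiry_vpd_cdb(cdb, 0xB1, alloc_len); /* Block Device Characteristics */
        return 6;
    default:
        return -1;
    }
}

/* Backend only: answer a request against the set of commands it offers. */
static void handle_special(uint32_t supported_mask,
                           const struct blkif_special_request *req,
                           struct blkif_special_response *rsp)
{
    rsp->id = req->id;
    rsp->supported = 0;
    if (req->cmd >= BLKIF_SPECIAL_MAX ||
        !(supported_mask & special_cmd_bit(req->cmd))) {
        rsp->status = -1;                /* never forwarded to the device */
        return;
    }
    if (req->cmd == BLKIF_SPECIAL_QUERY)
        rsp->supported = supported_mask;
    rsp->status = 0;
}
```

With something like this, a frontend would first send the query command
and then use only the commands whose bits are set. A backend sitting on a
raw file could answer the same commands without building any SCSI command
at all, which is the point of not exposing SCSI structures in the protocol.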

> 
>>
>>>  * Much longer path:
>>>    ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend ->
>>>    PVSCSI-backend -> Target framework(LIO?) ->
>>>
>>>    With xen-block we only need:
>>>    ioctl() -> blkfront -> blkback ->
>>
>> I'd like to see performance numbers before making a decision.
> 
> For SCSI INQ? <laughs>
> 
> Or are you talking about raw READ/WRITE?

READ/WRITE. I'm quite sure pvSCSI will be slower than pvblk, but I don't
know how much slower.

>>
>>>  * xen-block has existed for many years, is widely used and is more stable.
>>
>> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
>> (that's the reason I used the Target Framework). Why do you think it
>> will be accepted for pvblk?
>>
>> This is not my personal opinion, just a heads up from someone who had a
>> try already. ;-)
> 
> Right. So SCSI passthrough out. What about mediated access for
> specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it
> knowing the SCSI maintainers much better than we do?

Don't call it SCSI command passthrough, or use the target framework.
As stated above: I think renaming the whole feature is better. You'll
avoid much trouble and weird configuration problems.

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

