[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH] Data integrity extension support for xen-block



On 08/04/16 03:24, Bob Liu wrote:
> 
> On 04/07/2016 11:55 PM, Juergen Gross wrote:
>> On 07/04/16 12:00, Bob Liu wrote:
>>> * What's data integrity extension and why?
>>> Modern filesystems feature checksumming of data and metadata to protect 
>>> against
>>> data corruption.  However, the detection of the corruption is done at read 
>>> time
>>> which could potentially be months after the data was written.  At that 
>>> point the
>>> original data that the application tried to write is most likely lost.
>>>
>>> The solution in Linux is the data integrity framework which enables 
>>> protection
>>> information to be pinned to I/Os and sent to/received from controllers that
>>> support it. struct bio has been extended with a pointer to a struct bip 
>>> which
>>> in turn contains the integrity metadata. The bip is essentially a trimmed 
>>> down
>>> bio with a bio_vec and some housekeeping.
>>>
>>> * Issues when xen-block get involved.
>>> xen-blkfront only transmits the normal data of struct bio while the 
>>> integrity
>>> metadata buffer(struct bio_integrity_payload in each bio) is ignored.
>>>
>>> * Proposal of transmitting bio integrity payload.
>>> Adding an extra request following the normal data request, this extra 
>>> request
>>> contains the integrity payload.
>>> The xen-blkback will reconstruct an new bio with both received normal data 
>>> and
>>> integrity metadata.
>>>
>>> Welcome any better ideas, thank you!
>>>
>>> [1] http://lwn.net/Articles/280023/
>>> [2] https://www.kernel.org/doc/Documentation/block/data-integrity.txt
>>>
>>> Signed-off-by: Bob Liu <bob.liu@xxxxxxxxxx>
>>> ---
>>>  xen/include/public/io/blkif.h |   50 
>>> +++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 50 insertions(+)
>>>
>>> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
>>> index 99f0326..3d8d39f 100644
>>> --- a/xen/include/public/io/blkif.h
>>> +++ b/xen/include/public/io/blkif.h
>>> @@ -635,6 +635,28 @@
>>>  #define BLKIF_OP_INDIRECT          6
>>>  
>>>  /*
>>> + * Recognized only if "feature-extra-request" is present in backend xenbus 
>>> info.
>>> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is 
>>> followed
>>> + * in the shared ring buffer.
>>> + *
>>> + * By this way, extra data like bio integrity payload can be transmitted 
>>> from
>>> + * frontend to backend.
>>> + *
>>> + * The 'wire' format is like:
>>> + *  Request 1: xen_blkif_request
>>> + * [Request 2: xen_blkif_extra_request]    (only if request 1 has 
>>> BLKIF_OP_EXTRA_FLAG)
>>> + *  Request 3: xen_blkif_request
>>> + *  Request 4: xen_blkif_request
>>> + * [Request 5: xen_blkif_extra_request]    (only if request 4 has 
>>> BLKIF_OP_EXTRA_FLAG)
>>> + *  ...
>>> + *  Request N: xen_blkif_request
>>> + *
>>> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* 
>>> create the
>>> + * "feature-extra-request" node!
>>> + */
>>> +#define BLKIF_OP_EXTRA_FLAG (0x80)
>>> +
>>> +/*
>>>   * Maximum scatter/gather segments per request.
>>>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>>>   * NB. This could be 12 if the ring indexes weren't stored in the same 
>>> page.
>>> @@ -703,6 +725,34 @@ struct blkif_request_indirect {
>>>  };
>>>  typedef struct blkif_request_indirect blkif_request_indirect_t;
>>>  
>>> +enum blkif_extra_request_type {
>>> +   BLKIF_EXTRA_TYPE_DIX = 1,               /* Data integrity extension 
>>> payload.  */
>>> +};
>>> +
>>> +struct bio_integrity_req {
>>> +   /*
>>> +    * Grant mapping for transmitting bio integrity payload to backend.
>>> +    */
>>> +   grant_ref_t *gref;
>>> +   unsigned int nr_grefs;
>>> +   unsigned int len;
>>> +};
>>
>> How does the payload look like? It's structure should be defined here
>> or a reference to it's definition in case it is a standard should be
>> given.
>>
> 
> The payload is also described using struct bio_vec(the same as bio).
> 
> /*
>  * bio integrity payload
>  */
> struct bio_integrity_payload {
>       struct bio              *bip_bio;       /* parent bio */

And e.g. Windows guests know how those look like? And BSDs? And others?
All Linux versions with all possible kernel configurations have the
same layout?

I don't think so.

> 
>       struct bvec_iter        bip_iter;
> 
>       bio_end_io_t            *bip_end_io;    /* saved I/O completion fn */
> 
>       unsigned short          bip_slab;       /* slab the bip came from */
>       unsigned short          bip_vcnt;       /* # of integrity bio_vecs */
>       unsigned short          bip_max_vcnt;   /* integrity bio_vec slots */
>       unsigned short          bip_flags;      /* control flags */
> 
>       struct work_struct      bip_work;       /* I/O completion */
> 
>       struct bio_vec          *bip_vec;
>       struct bio_vec          bip_inline_vecs[0];/* embedded bvec array */
> };
> 
>>> +
>>> +/*
>>> + * Extra request, must follow a normal-request and a normal-request can
>>> + * only be followed by one extra request.
>>> + */
>>
>> Wouldn't it be possible to include this in the original request (e.g.
>> it could be the first or last indirect segment) ?
>>
> 
> Yes, that's also an option.
> 
> The common way for block layer/driver to handle integrity metadata is using 
> two scatterlists.
> One containing the data as usual, and one containing the integrity metadata.
> The block driver should transmit both two scatterlists.
> 
> I'm worry about you may think embedding the metadata scatterlist in the 
> original request is too hacky.

I think you have to specify a clean OS agnostic interface for the
integrity data first. Then we can see how it fits into the request.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.