
Re: [PATCH] blkif: reconcile protocol specification with in-use implementations



On Wed, Sep 04, 2024 at 09:35:40AM +0000, Anthony PERARD wrote:
> On Wed, Sep 04, 2024 at 11:11:51AM +0200, Roger Pau Monné wrote:
> > On Wed, Sep 04, 2024 at 09:39:17AM +0100, Paul Durrant wrote:
> > > On 04/09/2024 09:21, Roger Pau Monné wrote:
> > > > > In the absence of that I'm afraid it is a little harder to
> > > > > judge whether the proposal here is the best we can do at this point.
> > > >
> > > > While I don't mind looking at what we can do to better handle 4K
> > > > sector disks, we need IMO to revert to the specification before
> > > > 67e1c050e36b, as that change switched the hardcoded sector based units
> > > > from 512 to 'sector-size', thus breaking the existing ABI.
> > > >
> > >
> > > But that's the crux of the problem. What *is* the ABI? We apparently
> > > don't have one that all OSes subscribe to.
> >
> > At least prior to 67e1c050e36b the specification in blkif.h and (what
> > I consider) the reference implementation in Linux blk{front,back}
> > matched.  Previous to 67e1c050e36b blkif.h stated:
> >
> > /*
> >  * NB. first_sect and last_sect in blkif_request_segment, as well as
> >  * sector_number in blkif_request, are always expressed in 512-byte units.
> >  * However they must be properly aligned to the real sector size of the
> >  * physical disk, which is reported in the "physical-sector-size" node in
> >  * the backend xenbus info. Also the xenbus "sectors" node is expressed in
> >  * 512-byte units.
> >  */
> >
> > I think it was quite clear, and does in fact match the implementation
> > in Linux.
> 
> That's wrong, Linux doesn't match the specification before 67e1c050e36b,
> in particular for "sectors":
> 
>     sectors
>          Values:         <uint64_t>
> 
>          The size of the backend device, expressed in units of its logical
>          sector size ("sector-size").

This was a bug introduced in 2fa701e5346d.  The 'random' comment that
you mention, which notes that 'sectors' is unconditionally expressed in
512-byte units, was added long before that, in d05ae13188231.  The
improved documentation added by 2fa701e5346d failed to correctly reflect
the units of the 'sectors' node.
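
To make the mismatch concrete, here is a minimal sketch of the two
readings of the 'sectors' node.  The helper and macro names are made up
for illustration and are not part of blkif.h or of any implementation:

```c
#include <stdint.h>

/* Hypothetical name for the fixed 512-byte unit of the older comment. */
#define BLKIF_UNIT_BYTES 512u

/*
 * Reading 1: 'sectors' is always counted in 512-byte units, as stated by
 * the comment added in d05ae13188231.
 */
static inline uint64_t capacity_bytes_512(uint64_t sectors)
{
    return sectors * BLKIF_UNIT_BYTES;
}

/*
 * Reading 2: 'sectors' is counted in units of 'sector-size', as stated by
 * the documentation added in 2fa701e5346d.
 */
static inline uint64_t capacity_bytes_logical(uint64_t sectors,
                                              uint32_t sector_size)
{
    return sectors * (uint64_t)sector_size;
}
```

With 'sectors' = 2097152 and 'sector-size' = 4096 the first reading gives
1 GiB and the second 8 GiB.  The two only agree when 'sector-size' is
512.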

> 
> The only implementation that matches this specification is MiniOS (and
> OVMF).
> 
> Oh, I didn't notice that the random comment you quoted, which comes from
> the middle of the header, has a different definition for "sectors" ...
> 
> Well, the specification doesn't match with the specification ... and the
> only possible way to implement the specification is to only ever set
> "sector-size" to 512...
> 
> No wonder that there are so many different interpretations of the
> protocol.

My opinion is that there was a bug introduced in the specification in
2fa701e5346d, and that bug was extended by 67e1c050e36b to even more
fields.

Implementations should be fixed to adhere to the specification as it
was prior to 2fa701e5346d, because that works correctly with
'sector-size' != 512, and is the one implemented in Linux blkfront and
blkback.
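
As a rough sketch of what that older wording asks of a request: the
fields stay in 512-byte units, and only their alignment depends on the
sector size advertised by the backend.  The function names below are
hypothetical and this is not code taken from blkfront or blkback:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: sector_number and nr_512 (the request length) are
 * both expressed in 512-byte units; sector_size is the backend's sector
 * size in bytes and is assumed to be a multiple of 512.
 */
static inline bool blkif_req_aligned(uint64_t sector_number,
                                     uint64_t nr_512,
                                     uint32_t sector_size)
{
    uint64_t units;

    /* Reject sizes that cannot be expressed in whole 512-byte units. */
    if (sector_size < 512 || sector_size % 512 != 0)
        return false;

    /* Number of 512-byte units per real sector, e.g. 8 for 4K disks. */
    units = sector_size / 512;

    return sector_number % units == 0 && nr_512 % units == 0;
}

/* Byte offset of a request: always sector_number * 512. */
static inline uint64_t blkif_req_offset(uint64_t sector_number)
{
    return sector_number << 9;
}
```

On a 4K disk a request starting at sector_number 8 is aligned and maps
to byte offset 4096, while one starting at sector_number 9 is not.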

There's no need to make this more complicated than it is.  We
introduced bugs in blkif.h, and those need to be fixed.  It's sad that
those bugs propagated into implementations, or that bugs from
implementations propagated into blkif.h.

I don't see an option where we get to keep our current diverging
implementations and still support 4K logical sector disks without
specification and code changes.  We could introduce a new way to
signal 4K logical sector sizes, but as that will require modifications
to every frontend and backend we might as well just fix the existing
mess and modify the implementations as required.

Thanks, Roger.



 

