
Re: [Xen-devel] [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"



On Tue, Jun 23, 2015 at 06:46:08PM +0200, Jan Kara wrote:

> Looks good to me. BTW also ext4 (with BIGALLOC feature) and OCFS2 can have
> block allocation unit (called cluster) larger than page size. However the
> block size of both filesystems is still <= page size. So at least ext4
> handles fun with partially initialized clusters by just marking parts
> of the cluster as uninitialized in the extent tree. But the code is still
> pretty messy to be honest.

Well, with UFS there's no place on disk to store such "this block is
uninitialized" marks - it uses a bog-standard Unix inode structure.

There are two units - fragments and blocks.  Block is an aligned group
of adjacent fragments; normal ratio is 8:1.  Block is at least 4KB
(and always a power of two), fragment is at least one sector and
block:fragment ratio is at most 8:1.  Inode structure is normal for a Unix
filesystem (12 direct + indirect + double indirect + triple indirect).
Each reference covers a block worth of file offsets and almost all
of them point to full blocks.  Indirects are full blocks as well.
Reference to a block is represented as the number of the first fragment
in it (i.e. with normal parameters bits 0..2 are clear).  Block bitmap is
actually a fragment bitmap (i.e. bit per fragment).  The only situation
when a reference is *not* to a full block is the last reference in a
file shorter than 12*block size (i.e. not requiring indirects at all).
In that case the last direct reference points to less than a full block
(unless the size in fragments is a multiple of block:fragment ratio,
that is).  One unusual thing is that holes can't extend to EOF - the last
byte *must* be allocated.  (BTW, the only difference between UFS2 and
UFS1 in that area is that fragment numbers are 64-bit now.  There had been
talk about turning block:fragment ratio into a per-inode value, but so far
nobody has implemented that - ->di_blksize is there, but it's never used).
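
As a rough illustration of the arithmetic above (a sketch only: the
constants and function names are made up for this example and are not
taken from fs/ufs), assuming 512-byte fragments and the normal 8:1 ratio:

```python
# Hypothetical parameters, chosen to match the "normal" layout described above.
FRAG_SIZE = 512                            # assumed: one sector per fragment
FRAGS_PER_BLOCK = 8                        # normal block:fragment ratio
BLOCK_SIZE = FRAG_SIZE * FRAGS_PER_BLOCK   # 4096 bytes
NDIRECT = 12                               # direct references in the inode


def is_full_block_ref(frag_no):
    """A full-block reference is the number of the block's first fragment,
    so with an 8:1 ratio bits 0..2 of it are clear."""
    return frag_no % FRAGS_PER_BLOCK == 0


def tail_fragments(size):
    """Fragments covered by the last reference of a file of `size` bytes.

    Assumes the last byte is allocated (holes can't extend to EOF).
    """
    if size == 0:
        return 0
    if size > NDIRECT * BLOCK_SIZE:
        # File needs indirects, so every reference is to a full block.
        return FRAGS_PER_BLOCK
    frags = (size + FRAG_SIZE - 1) // FRAG_SIZE   # file size in fragments
    rem = frags % FRAGS_PER_BLOCK
    # A multiple of the ratio means the last reference is a full block too.
    return rem if rem else FRAGS_PER_BLOCK
```

Under these assumptions a 5000-byte file is one full block plus a
2-fragment tail, while anything longer than 12 blocks ends in a full block.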



 

