
Re: [Xen-devel] Re: Getting xen to recognise large disks



On Tue, Nov 21, 2006 at 10:41:41PM +0000, Daniel P. Berrange wrote:
> On Tue, Nov 21, 2006 at 09:11:18PM +0000, Daniel P. Berrange wrote:
> > On Tue, Nov 21, 2006 at 11:34:45AM +0000, Keir Fraser wrote:
> > > On 21/11/06 11:21, "Robin Bowes" <robin-lists@xxxxxxxxxxxxxx> wrote:
> > > 
> > > > Keir Fraser wrote:
> > > >> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@xxxxxxxxxxxxxx> wrote:
> > > >> 
> > > >> I'll make a patch today.
> > > >> 
> > > > 
> > > > Thanks Keir, looking forward to testing it.
> > > 
> > > If you don't mind using the xen-unstable source repository, it's changeset
> > > 12496:0c0ef61de06b. It probably hasn't reached the public repository just
> > > yet (should very shortly though).
> > 
> > I've tested that changeset with the following
> > 
> >  - phy:  against a 5 TB partition
> >  - file: against a 7.3 TB file
> > 
> > In both cases the # of sectors matches in Dom0 vs DomU. For good measure
> > I also ran Stephen Tweedie's verify-data tool in the DomU to verify no
> > data I/O wraparound issues elsewhere in the code & it passed without
> > trouble.
> > 
> > Blktap, however, is a different story - it still shows the disk size
> > wrapping around at the 2 TB mark. The userspace blktap tools have totally
> > inconsistent data types: sometimes int, sometimes long, sometimes
> > unsigned long & sometimes uint64. I'm working on a patch which makes it
> > 
> >  - 'unsigned long long'  for # sectors
> >  - 'unsigned long'       for sector size
> >  - 'unsigned int'        for info
> > 
> > This makes it match the data types used in blkfront/blkback exactly.
> > With this patch applied, the DomU sees correct disk size, however,
> > the verify-data tool is showing nasty data consistency issues when
> > writing/reading to such a disk. So I think there is 32-bit wrap
> > around somewhere in the I/O codepath for blktap. I'll get back when
> > I've found out more info...
> 
> It turns out that blktap wasn't (directly) at fault here. I was storing my
> file based disk images on an XFS formatted partition in the host. Well it
> appears that XFS doesn't play nice with the async I/O + O_DIRECT options
> that blktap likes so all your data goes to /dev/null :-)
> 
> I re-tested blktap + large file backed disks on ext3 & GFS and everything
> is working as expected. So stay away from an XFS+blktap combo if you like
> your data :-)
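
For anyone wondering why the blktap size wraparound quoted above shows up
at exactly 2 TB, here is a rough sketch of the arithmetic. It assumes
512-byte sectors and a sector count that passes through an unsigned 32-bit
variable somewhere; the thread does not say which variable was at fault,
and the names SECTOR_SIZE and reported_size_bytes are made up for
illustration only.

```python
SECTOR_SIZE = 512          # bytes per sector (assumed)
TIB = 1 << 40              # one tebibyte in bytes


def truncate_to_u32(value: int) -> int:
    """Simulate storing a value in an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


def reported_size_bytes(real_size_bytes: int) -> int:
    """Disk size seen if the sector count is squeezed through 32 bits."""
    real_sectors = real_size_bytes // SECTOR_SIZE
    return truncate_to_u32(real_sectors) * SECTOR_SIZE


if __name__ == "__main__":
    for size in (1 * TIB, 2 * TIB, 5 * TIB):
        seen = reported_size_bytes(size)
        print(f"real {size // TIB} TiB -> seen {seen / TIB:g} TiB")
```

Under those assumptions 2^32 sectors of 512 bytes is 2 TiB, so any disk at
or above that size is reported modulo 2 TiB. A 64-bit sector count, such as
the 'unsigned long long' proposed in the quoted patch, does not hit this
limit.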

FYI, in case anyone else out there is reading the archives: it turns out
there is a kernel bug which caused the data corruption problems with XFS
in this case - it wasn't a Xen or blktap issue. The root cause was that
if you used O_DIRECT + async I/O on a sparse file, XFS ended up writing
data into the wrong region of the file! So if you're using XFS for storing
file-backed images, make sure they're not sparse images, or use the old
loopback driver which avoids the O_DIRECT+AIO codepaths. Gory details in

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098
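
One rough way to check whether an existing image file is sparse is to
compare its apparent size with the space actually allocated for it. The
sketch below is only a heuristic: it assumes a Unix-like host where
st_blocks is reported in 512-byte units, and filesystems with compression
or preallocation can skew the result. The helper names and the example
path are hypothetical.

```python
import os
import sys

STAT_BLOCK_UNIT = 512  # st_blocks is counted in 512-byte units on Linux


def looks_sparse(apparent_size: int, allocated_blocks: int) -> bool:
    """True if fewer bytes are allocated than the file's apparent size."""
    return allocated_blocks * STAT_BLOCK_UNIT < apparent_size


def is_sparse(path: str) -> bool:
    """Heuristic sparseness check for the file at 'path'."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)


if __name__ == "__main__":
    # Usage: python check_sparse.py /path/to/disk.img  (path is an example)
    for image in sys.argv[1:]:
        state = "sparse" if is_sparse(image) else "fully allocated"
        print(f"{image}: {state}")
```

If an image shows up as sparse, it would need to be fully allocated before
being used with blktap on an affected XFS host.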

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel