
Re: [Xen-users] Understanding sparse-files



The over-committed disk space can also be a double-edged sword.  While it lets you commit disk space you don't actually have, if you create a lot of domUs with a lot of sparse disk files, you can easily lose track of how much data is actually on the filesystem vs. how much disk space you've promised to guests.  So, if you have a 10 GB filesystem and you create 4 x 5 GB sparse files, each guest only has to fill half of its disk (2.5 GB) to exhaust that 10 GB without you realizing what you've done.  It is a pro, but you have to make sure that you're somehow keeping track of how much disk space you're actually using.  If you pre-allocate the disk files, the df command will tell you fairly easily when your filesystem is close to capacity or completely out of space.
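
One way to keep track is to compare what the image files claim to be with what they really occupy.  Here is a rough sketch, assuming GNU du (for --apparent-size) and a directory that holds only disk images; the function name overcommit_report and the demo directory are made up for illustration:

```shell
#!/bin/sh
# Report promised vs. allocated space for a directory of disk images.
# Assumes GNU du (--apparent-size) and POSIX df -P.
overcommit_report() {
    dir=$1
    # Apparent size: what the guests have been promised, in KiB.
    promised_kib=$(du -sk --apparent-size "$dir" | cut -f1)
    # Allocated size: what the images occupy on disk right now, in KiB.
    allocated_kib=$(du -sk "$dir" | cut -f1)
    # Space still free on the filesystem that holds the directory, in KiB.
    free_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    # Space the guests may still claim that has not been allocated yet.
    unbacked_kib=$((promised_kib - allocated_kib))

    echo "promised:  ${promised_kib} KiB"
    echo "allocated: ${allocated_kib} KiB"
    echo "free:      ${free_kib} KiB"
    if [ "$unbacked_kib" -gt "$free_kib" ]; then
        echo "WARNING: guests can still claim ${unbacked_kib} KiB, more than is free"
    fi
}

# Demo on scratch files: four sparse 64 MiB images in a temporary directory.
demo_dir=$(mktemp -d)
for n in 1 2 3 4; do
    dd bs=1M seek=64 count=0 if=/dev/zero of="$demo_dir/guest$n.img" 2>/dev/null
done
overcommit_report "$demo_dir"
```

Run from cron against the real image directory, something like this would at least warn you before the guests find out the hard way.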

-Nick


-----Original Message-----
From: John Haxby <john.haxby@xxxxxxxxxx>
To: Rustedt, Florian <Florian.Rustedt@xxxxxxxxxxx>
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Understanding sparse-files
Date: Tue, 16 Dec 2008 14:37:33 +0000

Rustedt, Florian wrote:
> What exactly is the advantage of sparse-files against "normal" files
> with fixed length?
>
>   
There are both advantages and disadvantages.

> First i thought this is something like an auto-increasing file. But if i
> take a 2GB partition and add two sparse-files with 1GB each, i can't add
> an additional one, the disk is full?
>
>   
No, that's not it.

> So what about this mystic advantage? Is it only the faster creation of
> that file with dd, because it is not completely filled?
> That's all?
>   

If you create yourself a nice big sparse file like this

    dd bs=1M seek=10240 count=0 if=/dev/zero of=huge

and then look at what you've got with "ls -lh", you'll see you have a 10G 
file that was created almost instantly.  On the other hand, "ls -sh" 
will show that the file is actually occupying no space at all (well, 
almost no space).  You can make this file bigger like this:

    dd bs=1M seek=20480 count=0 if=/dev/zero of=huge

and this will make it 20GB, still without occupying much space.

I suspect you already know this, but if you didn't, you do now :-)

The advantage of this 20GB file is precisely that it occupies next to no 
space on the disk that holds it.  I can start writing data into it (that 
is, use it as a guest's disk) and the blocks needed will be allocated as 
they are used.  In fact, I could have a 200GB guest disk image even 
though the disk I have at the moment is only 120GB and I'm using quite a 
lot of it -- it would only be a problem if the guest actually wanted to 
use all that space.
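
If you want to watch the allocation happen, here is a small sketch using the same dd idiom, scaled down to 64 MiB and run on a scratch file.  It assumes GNU stat and GNU dd; the temporary path and the helper allocated_bytes are only for illustration:

```shell
#!/bin/sh
# Scratch file in a temporary directory (not a real guest disk).
img="$(mktemp -d)/huge"

# Bytes actually allocated on disk: block count times block unit (GNU stat).
allocated_bytes() {
    echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
}

# Nothing is written here; the file is simply extended to 64 MiB.
dd bs=1M seek=64 count=0 if=/dev/zero of="$img" 2>/dev/null
apparent=$(stat -c %s "$img")
before=$(allocated_bytes "$img")

# Write 1 MiB of real data 10 MiB into the file, as a guest might.
# conv=notrunc keeps the file at 64 MiB; fsync flushes before we look.
dd bs=1M seek=10 count=1 conv=notrunc,fsync if=/dev/urandom of="$img" 2>/dev/null
after=$(allocated_bytes "$img")

echo "apparent size:             $apparent bytes"
echo "allocated while empty:     $before bytes"
echo "allocated after one write: $after bytes"
```

The apparent size never changes; only the allocated figure grows, and only by roughly what was written.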

There are some problems with sparse files: they compress beautifully (gzip 
reports 99.9%), but it takes a while to read the empty space, and when you 
uncompress the file you discover that it now actually occupies disk 
space: there's no good way to distinguish between an unallocated block 
and a block full of zeroes.   This also means that you need to be 
careful how you back these files up: you need something a little 
cleverer than gzip.
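
For what it's worth, two tools that are a little cleverer are GNU tar with --sparse and GNU cp with --sparse=always.  A minimal sketch on a scratch 16 MiB image follows, assuming GNU tar, cp, stat and dd; the file names and the helper allocated_bytes are invented for the example:

```shell
#!/bin/sh
work=$(mktemp -d)

# Bytes actually allocated on disk (GNU stat).
allocated_bytes() {
    echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
}

# A 16 MiB sparse image with 1 MiB of real data in the middle.
dd bs=1M seek=16 count=0 if=/dev/zero of="$work/disk.img" 2>/dev/null
dd bs=1M seek=4 count=1 conv=notrunc if=/dev/urandom of="$work/disk.img" 2>/dev/null

# Naive round trip through gzip: the holes come back as real zeroes.
gzip -c "$work/disk.img" | gzip -dc > "$work/naive.img"

# tar --sparse records where the holes are and restores them on extraction.
tar --sparse -czf "$work/disk.tar.gz" -C "$work" disk.img
mkdir "$work/restore"
tar -xzf "$work/disk.tar.gz" -C "$work/restore"

# cp --sparse=always turns runs of zeroes back into holes while copying.
cp --sparse=always "$work/naive.img" "$work/resparsed.img"

for f in disk.img naive.img restore/disk.img resparsed.img; do
    echo "$f: $(allocated_bytes "$work/$f") bytes allocated"
done
```

All four files have identical contents; on most filesystems only the naive copy occupies the full 16 MiB.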

Another problem with sparse files, especially when using them as domU 
disks, is that blocks that are contiguous in the file are not necessarily 
contiguous on the disk.   That means that if, in the guest, you just "dd 
if=/dev/xvda of=/dev/null" then the underlying disk will be seeking back 
and forth all over the place to return the blocks in the order that 
they're being asked for.   
You don't need xen for this --  when I downloaded the DVD image of 
Fedora 10 using transmission (a bittorrent client) a checksum on the 
resulting file only managed to read it at about 4MB/s.  On the other 
hand, when I copied the file the checksum on the copy ran at closer to 
100MB/s -- bittorrent clients like transmission really ought to 
pre-allocate the disk space so that you get something contiguous and 
also not embarrassingly run out of space half way through.
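
Pre-allocating an image might look something like the sketch below.  It assumes fallocate(1) from util-linux is installed and falls back to writing real zeroes with dd when it is missing or the filesystem refuses; the function name preallocate and the scratch path are made up.  Note that pre-allocation only gives the filesystem the chance to lay the file out contiguously, it doesn't guarantee it.

```shell
#!/bin/sh
# preallocate FILE SIZE_MIB: reserve real disk blocks for FILE up front.
preallocate() {
    file=$1
    mib=$2
    # fallocate asks the filesystem to reserve the blocks without writing
    # them; not every filesystem supports this.
    if ! fallocate -l "$((mib * 1024 * 1024))" "$file" 2>/dev/null; then
        # Fallback: write real zeroes. Slower, but works everywhere.
        dd if=/dev/zero of="$file" bs=1M count="$mib" 2>/dev/null
    fi
}

# Demo on a scratch file: a 16 MiB image that is fully backed by disk blocks.
img="$(mktemp -d)/guest.img"
preallocate "$img" 16

apparent=$(stat -c %s "$img")
allocated=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))
echo "apparent size: $apparent bytes"
echo "allocated:     $allocated bytes"
```

If the filesystem is too full, this fails at creation time rather than half way through a guest's write, and df reflects the space as used straight away.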

In a nutshell, though:

pros: over-committed disk space

cons: performance

jch






_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

 

