Xen project Mailing List

Re: [Xen-API] How snapshot work on LVMoISCS SR

To: Anthony Xu <anthony@xxxxxxxxx>

From: Julian Chesterfield <julian.chesterfield@xxxxxxxxxxxxx>

Date: Wed, 27 Jan 2010 11:03:24 +0000

Cc: "xen-api@xxxxxxxxxxxxxxxxxxx" <xen-api@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Wed, 27 Jan 2010 03:16:24 -0800

List-id: Discussion of API issues surrounding Xen <xen-api.lists.xensource.com>

The GC code is included in the distro. It is the
/opt/xensource/sm/cleanup.py module. Kick off a manual SR scan
(xe sr-scan uuid=<SR UUID>) and then check the log file
(/var/log/SMlog) to see what is happening. You'll see some lines
such as:

[23271] 2010-01-27 10:59:13.769991 LVHDSR.scan for 2cfaea72-8772-1c00-6e47-879140fa1db2
............
[23271] 2010-01-27 10:59:13.824614 Kicking GC
<23271> 2010-01-27 10:59:13.824690 === SR 2cfaea72-8772-1c00-6e47-879140fa1db2: gc ===
...........
<23826> 2010-01-27 11:01:02.741333 -- SR 2cfa ('Local storage') has 3 VDIs (1 VHD trees) --
<23826> 2010-01-27 11:01:02.741496 *3dd5c107[VHD](1.00G//8.00M|n)
<23826> 2010-01-27 11:01:02.741574 d3a89874[VHD](1.00G//8.00M|n)
<23826> 2010-01-27 11:01:02.741651 2f6a87b8[VHD](1.00G//1.01G|n)

and then the GC should proceed depending on whether there are
coalescable nodes.
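
As a rough aid for reading that output, here is a minimal Python sketch that scans a chunk of SMlog text for the "Kicking GC" line and the per-VDI summary lines. The regular expression is inferred only from the excerpt above, not taken from cleanup.py, and treating the leading `*` as a hidden-node marker is an assumption. The names `summarize_gc`, `NODE_RE` and `SAMPLE` are hypothetical.

```python
import re

# Pattern inferred from the log excerpt in this message (an assumption,
# not the format definition used by cleanup.py):
#   <pid> date time [*]shortuuid[TYPE](detail)
NODE_RE = re.compile(
    r"^[\[<](?P<pid>\d+)[\]>]\s+\S+\s+\S+\s+"
    r"(?P<star>\*?)(?P<uuid>[0-9a-f]{8})\[(?P<type>\w+)\]\((?P<detail>.*)\)$"
)

def summarize_gc(log_text):
    """Return (gc_kicked, nodes) for a chunk of SMlog text."""
    gc_kicked = False
    nodes = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if "Kicking GC" in line:
            gc_kicked = True
        m = NODE_RE.match(line)
        if m:
            nodes.append({
                "uuid": m.group("uuid"),
                "type": m.group("type"),
                # Assumption: a leading '*' marks a hidden node.
                "hidden": m.group("star") == "*",
                # Kept as raw text; the field meanings are not given here.
                "detail": m.group("detail"),
            })
    return gc_kicked, nodes

# Lines copied from the excerpt above.
SAMPLE = """\
[23271] 2010-01-27 10:59:13.824614 Kicking GC
<23271> 2010-01-27 10:59:13.824690 === SR 2cfaea72-8772-1c00-6e47-879140fa1db2: gc ===
<23826> 2010-01-27 11:01:02.741333 -- SR 2cfa ('Local storage') has 3 VDIs (1 VHD trees) --
<23826> 2010-01-27 11:01:02.741496 *3dd5c107[VHD](1.00G//8.00M|n)
<23826> 2010-01-27 11:01:02.741574 d3a89874[VHD](1.00G//8.00M|n)
<23826> 2010-01-27 11:01:02.741651 2f6a87b8[VHD](1.00G//1.01G|n)
"""

kicked, vdis = summarize_gc(SAMPLE)
print("GC kicked:", kicked)
for v in vdis:
    print(v["uuid"], "hidden" if v["hidden"] else "visible", v["detail"])
```

On a real host the same function could be fed the contents of /var/log/SMlog after running the SR scan.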

- Julian

Anthony Xu wrote:

Hi Julian,

Thanks for your detailed explanation.

I'd like to have the GC. I'm using XenServer 5.5, and there seems to be
no GC running in the background: I just checked, and there are still 16
VHDs in the chain after one night. Where can I get the GC for
XenServer 5.5?

Anthony




On Tue, 2010-01-26 at 02:34 -0800, Julian Chesterfield wrote:

Hi Anthony,

Anthony Xu wrote:
Hi all,

Basically, snapshots on an LVMoISCSI SR work well. They provide thin
provisioning, so they are fast and disk-space efficient.


But I still have the concern below.

Each snapshot adds one more VHD to the chain: if I create 16 snapshots,
there are 16 VHDs in the chain. That means that when a VM accesses a
disk block, it may need to check 16 VHD LVM volumes one by one before it
finds the right block, which makes VM disk access slow. However, this is
understandable; it is part of snapshotting, IMO.
The depth and speed of access will depend on the write pattern to the
disk. In XCP we add an optimisation called a BATmap which stores one bit
per BAT entry. This is a fast lookup table that is cached in memory
while the VHD is open, and tells the block device handler whether a
block has been fully allocated. Once the block is fully allocated (all
logical 2MB written) the block handler knows that it doesn't need to
read or write the Bitmap that corresponds to the data block, it can go
directly to the disk offset. Scanning through the VHD chain can
therefore be very quick, i.e. the block handler reads down the chain of
BAT tables for each node until it detects a node that is allocated with
hopefully the BATmap value set. The worst case is a random disk write
workload which causes the disk to be fragmented and partially allocated.
Every read or write will therefore potentially incur a bitmap check at
every level of the chain.
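
To make the lookup path concrete, here is a toy Python model of the walk described above. It is only a sketch under stated assumptions, not the actual block handler: each node keeps a plain dict of written sectors, the BAT and BATmap are derived from that dict, and a 512-byte sector size (4096 sectors per 2MB block) is assumed. The names `VhdNode` and `locate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Assumption: 2 MB blocks of 512-byte sectors.
SECTORS_PER_BLOCK = (2 * 1024 * 1024) // 512

@dataclass(eq=False)
class VhdNode:
    name: str
    parent: Optional["VhdNode"] = field(default=None, repr=False)
    # block index -> set of sector indexes written in this node
    written: Dict[int, Set[int]] = field(default_factory=dict)

    def bat_allocated(self, block: int) -> bool:
        """Toy BAT: the block has an entry in this node."""
        return block in self.written

    def batmap_full(self, block: int) -> bool:
        """Toy BATmap bit: every sector of the block is written here."""
        return len(self.written.get(block, ())) == SECTORS_PER_BLOCK

def locate(leaf: VhdNode, block: int, sector: int) -> Tuple[Optional[str], int]:
    """Walk from the leaf towards the root.

    Returns (name of the node holding the sector, bitmap checks needed).
    """
    bitmap_checks = 0
    node = leaf
    while node is not None:
        if node.bat_allocated(block):
            if node.batmap_full(block):
                # Fully allocated: go straight to the data, no bitmap check.
                return node.name, bitmap_checks
            bitmap_checks += 1  # partially allocated: consult the bitmap
            if sector in node.written[block]:
                return node.name, bitmap_checks
        node = node.parent
    return None, bitmap_checks  # never written anywhere in the chain

# A three-node chain: a fully written base plus two sparse overlays.
base = VhdNode("base", written={0: set(range(SECTORS_PER_BLOCK))})
mid = VhdNode("mid", parent=base, written={0: {5}})
leaf = VhdNode("leaf", parent=mid, written={0: {9}})

print(locate(leaf, 0, 9))  # found in the leaf after one bitmap check
print(locate(leaf, 0, 7))  # falls through two partial nodes to the base
print(locate(leaf, 1, 0))  # block 1 is not allocated in any node
```

The second lookup shows the cost described above: each partially allocated node on the way down adds a bitmap check, while the fully allocated base is resolved from the BATmap alone.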
But after I delete all 16 snapshots, there are still 16 VHDs in the
chain and disk access is still slow, which is neither understandable nor
reasonable, given that there may be only several KB of difference
between each snapshot.
There is a mechanism in XCP called the GC coalesce thread which gets
kicked asynchronously following a VDI deletion event. It queries the VHD
tree, and determines whether there is any coalescable work to do.
Coalescable work is defined as:
'a hidden child node that has no siblings'
Hidden nodes are non-leaf nodes that reside within a chain. When the
snapshot leaf node is deleted therefore, it will leave redundant links
in the chain that can be safely coalesced. You can kick off a coalesce
by issuing an SR scan, although it should kick off automatically within
30 seconds of deleting the snapshot node, handled by XAPI. If you look
in the /var/log/SMlog file you'll see a lot of debug information
including tree dependencies which will tell you a) whether the GC thread
is running, and b) whether there is coalescable work to do. Note that
deleting snapshot nodes does not always mean that there is coalescable
work to do since there may be other siblings, e.g. VDI clones.
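
The coalescable rule above can be illustrated with a small Python sketch. This is a simplified model of the stated definition ('a hidden child node that has no siblings'), not the logic of cleanup.py; the `Node` class, the `coalescable` function and the node names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    name: str
    hidden: bool = False
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

def coalescable(root: Node) -> List[str]:
    """Names of hidden child nodes that have no siblings."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        is_child = node.parent is not None
        if node.hidden and is_child and len(node.parent.children) == 1:
            found.append(node.name)
        stack.extend(node.children)
    return found

# Tree with one snapshot still present:
#   base (hidden) -> snapshot
#                 -> mid (hidden) -> active
base = Node("base", hidden=True)
snap = base.add(Node("snapshot"))
mid = base.add(Node("mid", hidden=True))
mid.add(Node("active"))

before = coalescable(base)   # mid still has a sibling (the snapshot)

# Deleting the snapshot leaf leaves mid as an only child.
base.children.remove(snap)
after = coalescable(base)

print("before delete:", before)
print("after delete:", after)
```

Before the deletion nothing qualifies, which matches the note that siblings such as clones can block coalescing; after the deletion the hidden middle node becomes a candidate.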
Is there any way to reduce the depth of the VHD chain after deleting
snapshots, to get the VM back to normal disk performance?
The coalesce thread handles this, see above.
And I notice that useless VHD volumes still exist after deleting
snapshots. Can we delete them automatically?
No. I do not recommend deleting VHDs manually since they are almost
certainly referenced by something else in the chain. If you delete them
manually you will break the chain, it will become unreadable, and you
potentially lose critical data. VHD chains must be correctly coalesced
in order to maintain data integrity.
Thanks,
Julian
- Anthony




_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xen-api

