Re: [Xen-users] Cheap IOMMU hardware and ECC support importance
On 07/06/2014 01:20 PM, lee wrote:

> What if I need to access a file that's in the snapshot: do I need to restore the snapshot first?

Usually you can "cd .zfs"; that directory contains subdirectories named after your snapshots, and inside those directories you have complete datasets just like the ones you took the snapshots of. No rollback/restoring/mounting is necessary.

> And does that also work when the file system the snapshot was created from doesn't exist anymore, or when the disks with the FS the snapshot was made from have become inaccessible, provided that the snapshot was made to different disks?

Oversimplifying: yes.

> So it's as good as a backup? What's the difference then? Is it like the difference between a picture and a picture?

By your analogy, it might as well be like the difference between a disk and a photo of a disk.

Yes, I take snapshots all the time. This way it's easy for me to revert VMs to previous states, clone them, etc. The same goes for my regular data. And I replicate them a lot.

> Hm, what for? The VMs I have are all different, so there's no point in cloning them. And why would I clone my data? I don't even have the disk capacity for that and am glad that I can make a backup.

I tend to clone "production" VMs before I start fiddling with them, so that I can test potentially dangerous ideas without any consequences. Clones are "free" - they only start using more space when you introduce some difference between the clone and the original dataset. You can always 'promote' them so they become independent from the original dataset (using more space as required). Cloning is just a tool that you might or might not find useful.

> I see --- and I'd find that useful. I have the VMs in an LVM volume group with one logical volume for each VM. Each VM has two partitions, one for a root file system and another one for swap. How would that translate to ZFS? Where's this additional space taken from?

From the pool.
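To make that concrete, here is a rough sketch of the workflow I'm describing; the pool, dataset and size names below are made up for the example:

  # snapshot a VM's dataset before fiddling with it
  zfs snapshot tank/vm-web@before-upgrade

  # browse the snapshot read-only via the hidden .zfs directory
  ls /tank/vm-web/.zfs/snapshot/before-upgrade/

  # clone it writable; no extra space is used until the clone diverges
  zfs clone tank/vm-web@before-upgrade tank/vm-web-test

  # later, make the clone independent of the original dataset
  zfs promote tank/vm-web-test

  # the rough equivalent of an LVM logical volume is a zvol
  zfs create -V 20G tank/vm-web-disk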
> So you would be running ZFS on unreliable disks, with the errors being corrected and going unnoticed, until either, without TLER, the system goes down, or, with TLER, the errors aren't recoverable anymore and become noticeable only when it's too late.

ZFS tells you it had problems ("zpool status"). ZFS can also check the entire pool for defects ("zpool scrub"; you should do that periodically).

> You're silently losing more and more redundancy.

I'm not sure what you mean by losing redundancy.

> You don't know whether the data has been written correctly before you read it. The more errors there are, the more redundancy you lose, because you have more data that can be read from only a part of the disks. If there is an error on another disk with that same data, you don't know until you try to read it and perhaps find out that you can't. How many errors it takes for that data depends on the level of redundancy.

I don't understand your point here. Do you know that all your data has been written correctly with any other form of RAID, without reading it back?

> No, but I know that the RAID controller does scrubbing and fails a disk eventually. There is no in-between like there seems to be with ZFS.

I suspect most RAID controllers hide the in-between stuff. Software RAID exposes mismatches on scrubs, but its ability to fix them in n+1 redundancy cases is limited.

> My point is that you can silently lose redundancy with ZFS. RAID controllers aren't exactly known to silently lose redundancy, are they?

Define "silently". I don't see any difference between the two cases.

> And how do you know when to replace a disk? When there's one error, or when there are 50, or 50000, or when the disk has been disconnected?

I believe it's up to you to interpret the data you're presented with and make the right decision. I really wish I could formulate a condition that evaluates to true or false telling me what I should do with a disk.

> RAID controllers make that easy for you --- not necessarily better, but easier.

You mean by doing the deciding for you?

> What is the actual rate of data corruption or loss prevented or corrected by ZFS due to its checksumming in daily usage?

I have experienced data corruption due to hardware failures in the past.

> Hardware failures like?

The typical ones: bad sectors, failed flash memory banks, failed RAM modules.

> And only ZFS detected them?

It won't help much with duff RAM. http://www.zdnet.com/blog/storage/dram-error-rates-nightmare-on-dimm-street/638

> Yes, I've seen that. It's about RAM, not disk errors detected through ZFS checksumming. And RAM has nothing to do with the data on the disks.

That depends: a flipped bit in the write cache will propagate to the disk.
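For reference, a rough sketch of the monitoring and replacement commands in question; the pool and device names are placeholders:

  # show pool health plus per-device read/write/checksum error counters
  zpool status -v tank

  # re-read every block in the pool and verify it against its checksum
  zpool scrub tank

  # once you decide a disk is bad, replace it and let the pool resilver
  zpool replace tank ada2 ada3

  # or, after a transient problem, reset the error counters
  zpool clear tank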
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users