
Re: [Xen-users] Cheap IOMMU hardware and ECC support importance


  • To: xen-users@xxxxxxxxxxxxx
  • From: Gordan Bobic <gordan@xxxxxxxxxx>
  • Date: Sat, 05 Jul 2014 12:27:41 +0100
  • Delivery-date: Sat, 05 Jul 2014 11:28:02 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

On 07/05/2014 11:33 AM, Kuba wrote:

any other FS. I have the same feeling about ZFS as Gordan - once you
start using it, you cannot imagine making do without it.

Why exactly is that?  Are you modifying your storage system all the time
or making snapshots all the time?

Yes, I take snapshots all the time. This way it's easy for me to
revert VMs to previous states, clone them, etc. Same goes with my
regular data. And I replicate them a lot.

Hm, what for?  The VMs I have are all different, so there's no point in
cloning them.  And why would I clone my data?  I don't even have the
disk capacity for that and am glad that I can make a backup.

I tend to clone "production" VMs before I start fiddling with them, so
that I can test potentially dangerous ideas without any consequences.
Clones are "free" - they only start using more space when you introduce
some difference between the clone and the original dataset. You can
always 'promote' them so they become independent from the original
dataset (using more space as required). Cloning is just a tool that you
might or might not find useful.
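The snapshot, clone and promote steps described above can be sketched as follows. This is only an illustration: the dataset names (`tank/vm/prod`, `tank/vm/prod-test`) and the snapshot name are made up, and the helper `clone_workflow` is hypothetical. It only builds the `zfs snapshot`, `zfs clone` and `zfs promote` command lines and prints them; it does not run anything against a pool.

```python
import shlex


def clone_workflow(dataset, snap_name, clone_name, promote=False):
    """Build the zfs commands for snapshot -> clone -> optional promote.

    All names passed in are placeholders chosen by the caller.
    """
    snapshot = f"{dataset}@{snap_name}"
    commands = [
        # A clone is always created from a snapshot, so take one first.
        ["zfs", "snapshot", snapshot],
        # The clone shares blocks with the snapshot until the two diverge.
        ["zfs", "clone", snapshot, clone_name],
    ]
    if promote:
        # Promoting reverses the dependency between clone and origin.
        commands.append(["zfs", "promote", clone_name])
    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in clone_workflow("tank/vm/prod", "before-test",
                               "tank/vm/prod-test", promote=True):
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a real system you would review them and run them as root.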

This is, indeed, an excellent point about a genuinely useful application of cloning in ZFS. Thank you for pointing it out. :)

Once is often enough for me and it happened more than once. If I
hadn't done the checksumming myself, I probably wouldn't even have
known about it. Since I started using it, ZFS detected data corruption
several times for me (within a few years). But I don't own a data
center :) Actual error rates might depend on your workload, hardware,
probabilities and lots of other things. Here's something you might
find interesting:

Sure, the more data we collect about failures detected by checksumming,
the better the conclusions we can draw from it.  Since
we don't have much data, it's still interesting to know what failure
rates you have seen.  Is it more like 1 error in 50TB read or more like
1 error in 500TB or like 20 in 5TB?

I don't count them, but I'd say about 1 in 10TB. That's not professional,
research-grade statistical data, though, so you shouldn't base decisions on it.

That there's a statistical rate of failure doesn't mean that these
statistical failures are actually seen in daily applications.

http://www.zdnet.com/blog/storage/dram-error-rates-nightmare-on-dimm-street/638


Yes, I've seen that.  It's for RAM, not disk errors detected through ZFS
checksumming.

And RAM has nothing to do with the data on the disks.

Unless the bit flip happens in the write-out buffer before it is committed to disk.
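
To illustrate why that matters, here is a minimal sketch in Python. It is a toy model, not ZFS code: SHA-256 stands in for whatever checksum the filesystem uses, and the function names are made up. The point is only the ordering. A checksum catches corruption that happens after the checksum was computed; if the bit flips in RAM before the checksum is computed, the corrupted data is written with a checksum that matches it, and nothing downstream can tell.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in for a filesystem block checksum (assumption: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def flip_after_checksum_detected(data: bytes, bit_index: int) -> bool:
    """Corruption occurs after the checksum was taken (e.g. on disk)."""
    stored_sum = checksum(data)           # checksum of the good data
    on_disk = flip_bit(data, bit_index)   # data is damaged later
    return checksum(on_disk) != stored_sum


def flip_before_checksum_detected(data: bytes, bit_index: int) -> bool:
    """Corruption occurs in the RAM buffer before the checksum is taken."""
    in_buffer = flip_bit(data, bit_index)  # bit flips in memory first
    stored_sum = checksum(in_buffer)       # checksum of already-bad data
    on_disk = in_buffer
    return checksum(on_disk) != stored_sum


if __name__ == "__main__":
    block = b"important data " * 8
    print("flip after checksum detected: ",
          flip_after_checksum_detected(block, 13))
    print("flip before checksum detected:",
          flip_before_checksum_detected(block, 13))
```

The first case reports the mismatch; the second does not, which is the argument for ECC RAM even with a checksumming filesystem.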

