Re: [Xen-users] Cheap IOMMU hardware and ECC support importance
On 07/06/2014 11:11 AM, lee wrote:

>>>>> Does ZFS do that? Since it's about keeping the data safe, it might
>>>>> have a good deal of protection against user errors.
>>>>
>>>> I don't think it's possible to guard against user errors. If you're
>>>> concerned about user errors, get someone else to manage your
>>>> machines and not give you the root password.
>>>
>>> It is possible to guard. It's not possible to prevent them.
>>
>> Sacrificing productivity for hand holding is not the *nix paradigm.
>> It's competence or bust. I for one don't want every command I type to
>> ask "are you sure" before it does what I told it to. All it achieves
>> is desensitizing you to the question, and you end up saying "y"
>> automatically after a while, without considering what it even said.
>
> All the commands you issue are such that they destroy whole file
> systems?

Of course not, but there are few if any commands that have the ability
to destroy a FS which ask for confirmation before doing so.

>>>> You don't have to rebuild a pool. The existing pool is modified in
>>>> place, and that usually takes a few seconds. Typically the pool
>>>> version headers get a bump, and from there on ZFS knows it can put
>>>> additional metadata in place. Something similar happens when you
>>>> toggle deduplication on a pool: it puts the deduplication hash
>>>> table headers in place. Even if you remove the volume that has been
>>>> deduplicated and don't have any deduplicated blocks afterwards, the
>>>> headers will remain in place. But that doesn't break anything and
>>>> it doesn't require rebuilding of a pool.
>>>
>>> Then it should be easy to turn features off later.
>>
>> You can, but for example disabling compression on a compressed pool
>> won't decompress all the data. It will only make sure the data
>> written from that point on isn't compressed. If you want to actually
>> decompress the data, you'll have to copy it to an uncompressed file
>> system on the same pool, then destroy the old, compressed file
>> system.
>
> Why isn't there a command to uncompress the compressed data?

Why would there be? If you want to uncompress it, copy the files to a
new directory, remove the original directory, then rename the new
directory. You could probably write a one line script to do that for
you if it's such a big problem.
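Something along these lines, say - an untested sketch, with "tank/data"
standing in for the real dataset name, and assuming the pool has enough
free space to hold a second copy of the data while both copies exist:

    # Rewrite compressed data into a new, uncompressed file system on
    # the same pool. "tank/data" is a placeholder name.
    zfs create -o compression=off tank/data.new

    # Copy everything across, preserving hard links, ACLs and xattrs;
    # the copy lands uncompressed because of the property set above.
    rsync -aHAX /tank/data/ /tank/data.new/

    # Drop the old, compressed file system (-r also removes any
    # snapshots of it) and take over its name and mountpoint.
    zfs destroy -r tank/data
    zfs rename tank/data.new tank/data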
>> systems. ZFS is the only file system I have used for the data on
>> which I never had to reach for backups.
>
> One of the reasons for this may very well be that you know ZFS so
> well. I don't know it at all.

I knew other file systems at least as well if not better, yet it
didn't help.

> We probably have a very different understanding or use of file
> systems. I have some data I need stored, so I create a file system
> (since I can't very well store the data without one) and store data
> on it, and that's all. I'm not "using" the file system for anything
> more than that. The simpler and the more reliably that works, the
> better.
>
> You create a file system and use it for making snapshots, adding
> features and whatever other complicated things there are, besides
> storing some data. You're using an exotic file system which isn't
> widely supported on Linux and run into bugs and problems which you
> were lucky to be able to recover from by talking to developers and
> supporters.
>
> If I was to switch to this file system, I'd much more likely than you
> make user errors because I don't know this complicated file system,
> and I might run into problems or bugs I might not be able to recover
> from because I don't have access to the developers or supporters. The
> unknown file system would have the advantage that it could prevent
> silent data corruption, which is a problem I haven't noticed yet.
> Such a switch isn't very appealing, as great as this file system
> might be.

And without a file system that detects said corruption for you, you
will never notice it either.

>>> Perhaps nothing of what you're saying about ZFS is true ;)
>>
>> OK, you got me - I confess: it's all a part of my hidden agenda to
>> waste my time debating the issue with someone who hasn't progressed
>> to using software that isn't in their distribution's package
>> repository.
>
> If you think I haven't, it must be true.

You are the one that implied that not having the package in the
distribution repository was such a big deal.

> At how many errors do you need to replace a disk?

Depends on the disk. One of my Seagates reports the following line in
SMART:

  5 Reallocated_Sector_Ct   0x0033   063   063   036   Pre-fail  Always   -   1549

So it has 1549 reallocated sectors at the moment. The stats start at
100, and the threshold for the disk needing to be replaced is 36.
AFAIK, these are percentages. So, 1549 sectors = 100 - 63 = 37% of the
spare sectors used. That would imply that this particular disk has
approximately 4100 spare sectors, and that it should be replaced when
the number of them remaining falls below 36%.

> Are sectors that had errors being re-used,

You'll have to ask your disk manufacturer that. WD and Samsung might.
Or they could just be lying about the number of reallocated sectors
they have. WD and Samsung drives seem to have a feature where a
pending sector doesn't convert into a reallocated sector, which
implies either that the reallocation count is lying or that the
previously failed sectors are being re-used if the data sticks to them
within the limit of ECC's ability to recover.

> So the SMART numbers don't really give you an answer, unless the disk
> manufacturer told you exactly what's actually going on.

SMART numbers SHOULD give you the answer, unless the manufacturer has
deliberately made the firmware lie about it in the interest of
reducing warranty return rates.

> I was thinking of errors detected by ZFS, though. What if you see a
> few: do you replace the disk, or do you wait until there are so many?

Depends on the rate at which they are showing up. If every week the
same disk throws a few errors, then yes, it is a good candidate for
replacing. But usually there are other indications in SMART and
syslog, e.g. command timeouts, bus resets, and similar.

> or does ZFS keep a list of sectors not to use anymore?

As I said before, disks handle their own defects, and have done for
the past decade or two. File systems have long had no place in keeping
track of duff sectors on disks.

> So ZFS may silently lose redundancy (and in a bad case data),
> depending on what the disks do. And there isn't any way around that,
> other than increasing redundancy.

How do you define "silently"? How would you detect disk failure with
any traditional (hardware or software) RAID arrangement? You have to
configure some kind of monitoring, appropriate to your system. ZFS is
no different.
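As a rough illustration - untested, with the pool name "tank", the
device /dev/sda and the threshold of 100 all placeholders to adjust
for your own system - a script run from cron covers both the ZFS and
the SMART side:

    #!/bin/sh
    # Minimal monitoring sketch; names and thresholds are examples.

    # A periodic scrub makes ZFS read and verify every block, so
    # silent corruption and dying disks show up as read/checksum
    # errors in "zpool status" instead of going unnoticed. (The scrub
    # runs in the background; anything it finds is reported by the
    # status check on a later run.)
    zpool scrub tank

    # "zpool status -x" prints "all pools are healthy" unless a pool
    # has errors or degraded devices; mail the full status otherwise.
    zpool status -x | grep -q "all pools are healthy" || \
        zpool status | mail -s "zpool problem on $(hostname)" root

    # Watch the raw reallocated sector count as well. On the Seagate
    # above the raw value is 1549 with the normalised value at 063
    # against a threshold of 036, i.e. roughly 37% of the spare
    # sectors already used.
    REALLOC=$(smartctl -A /dev/sda | \
        awk '$2 == "Reallocated_Sector_Ct" { print $10 }')
    if [ "${REALLOC:-0}" -gt 100 ]; then
        echo "Reallocated sectors on /dev/sda: $REALLOC" | \
            mail -s "SMART warning on $(hostname)" root
    fi

None of this is ZFS specific beyond the two zpool calls - you would
wire up the same kind of cron-and-mail reporting for mdadm or a
hardware RAID controller.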
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users