Re: [Xen-users] Cheap IOMMU hardware and ECC support importance
Gordan Bobic <gordan@xxxxxxxxxx> writes:

>> On 07/02/2014 11:45 PM, lee wrote:
>>> On Linux there is for all intents and purposes one implementation.
>>
>> Where is this implementation?  Is it available by default?  I only
>> saw that there's a Debian package for ZFS which involves fuse.
>
> http://lmgtfy.com/?q=zfs+linux&l=1

Funny --- how many results do you get?  A couple million?  'apt-cache
search zfs' is *much* more relevant.

>> Does ZFS do that?  Since it's about keeping the data safe, it might
>> have a good deal of protection against user errors.
>
> I don't think it's possible to guard against user errors. If you're
> concerned about user errors, get someone else to manage your
> machines and not give you the root password.

It is possible to guard against them.  It's just not possible to
prevent them.

> You don't have to rebuild a pool. The existing pool is modified in
> place, and that usually takes a few seconds. Typically the pool
> version headers get a bump, and from there on ZFS knows it can put
> additional metadata in place.
>
> Something similar happens when you toggle deduplication on a pool.
> It puts the deduplication hash table headers in place. Even if you
> remove the volume that has been deduplicated and don't have any
> deduplicated blocks afterwards, the headers will remain in place.
> But that doesn't break anything, and it doesn't require rebuilding
> the pool.

Then it should be easy to turn features off again later.

>> You might enable a new feature and find that it causes problems,
>> but you can't downgrade ...
>
> You don't have to _use_ a feature that causes problems just because
> it's available. And features that are broken are rare, and
> non-critical.

And what do I do then?  Rebuild the pool to somehow downgrade to a
previous version of ZFS?
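For reference, the in-place upgrade and dedup mechanics described
above come down to roughly the following commands.  This is a sketch,
not a transcript from the thread; the pool "tank" and dataset
"tank/data" are placeholders:

    # Hypothetical pool "tank" on a ZFS-on-Linux box.
    zpool get version tank       # legacy version; shows "-" once feature flags are in use
    zpool upgrade -v             # list the pool features this ZFS build supports
    zpool upgrade tank           # enable them in place; quick, but one-way
    zpool get all tank | grep feature@   # see which features are enabled/active

    zfs set dedup=on tank/data   # new writes to this dataset get deduplicated
    zfs set dedup=off tank/data  # turning dedup off leaves existing DDT entries behind
    zpool status -D tank         # -D prints the dedup table (DDT) statistics

The one-way part is what the question above is about: once "zpool
upgrade" has run, older ZFS releases that lack the enabled features
can no longer import the pool.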
>>>> It seems that ZFS isn't sufficiently mature yet to use it. I
>>>> haven't learned much about it yet, but that's my impression so
>>>> far.
>>>
>>> As I said above - you haven't done your research very thoroughly.
>>
>> I haven't, yet all I've been reading so far makes me very careful.
>> When you search for "zfs linux mature", you find more sources
>> saying something like "it is not really mature" and not many, if
>> any, that would say something like "of course you should use it, it
>> works perfectly".
>
> There's a lot of FUD out there, mostly coming from people who have
> neither tried it nor know what they are talking about. Whatever
> next? "It must be true because I read it on the internet"?

Perhaps nothing of what you're saying about ZFS is true ;)

>> IIRC, when I had the WD20EARS in software RAID-5, I got messages
>> about barriers being disabled.  I tried to find out what that was
>> supposed to tell me, and it didn't seem to be too harmful, and
>> there wasn't anything I could do about it anyway.  What if I use
>> them as JBOD with ZFS and get such messages?
>
> No idea, I don't see any such messages. It's probably a feature of
> your RAID controller driver.

I didn't even have a RAID controller when that happened.

>> So the wikipedia article about SATA is wrong?  Or how does that
>> work when any of the involved devices does not support some
>> apparently substantial parts of the SATA protocol?
>
> You misunderstand. When I say "FIS" I am talking about FIS based
> switching, as opposed to command based switching. Perhaps a lack of
> clarity on my part, apologies for that.

No problem --- the article doesn't say much.

>> As far as I've seen, that doesn't happen.  Instead, the system goes
>> down, trying to access the unresponsive disk indefinitely.
>
> I see a disk get kicked out all the time. Most recent occurrence was
> 2 days ago.

You seem to have a lot of disks failing.

> "zpool status" shows you the errors on each disk in the pool. This
> should be monitored along with regular SMART checks. Using ZFS
> doesn't mean you no longer have to monitor for hardware failure,
> any more than you could skip monitoring for disk failures in a
> hardware RAID array.

At how many errors do you need to replace a disk?  Are sectors that
had errors re-used, or does ZFS keep a list of sectors not to use
anymore?

>>>> Or how unreliable is a disk that spends significant amounts of
>>>> time on error correction?
>>>
>>> Exactly - 7 seconds is about 840 read attempts. If the sector read
>>> failed 840 times in a row, what are the chances that it will ever
>>> succeed?
>>
>> Isn't the disk supposed not to use the failed sector once it has
>> been discovered, meaning that the disk might still be usable?
>
> When a sector becomes unreadable, it is marked as "pending". Read
> attempts on it will return an error. The next write to it will cause
> it to get reallocated from the spare sectors the disk comes with. As
> far as I can tell, some disks try to re-use the sector when a write
> for it arrives, and see if the data sticks within the ability of the
> sector's ECC to recover. If it sticks, it's kept; if it doesn't,
> it's reallocated.

That would mean that a disk which was failed because its error
correction took too long may still be fine.
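The monitoring routine being described amounts to something like the
following.  Again a sketch: "tank" and /dev/sda are placeholders, and
the smartctl lines assume smartmontools and a drive that implements
SCT ERC:

    zpool status tank              # per-device READ/WRITE/CKSUM error counters
    zpool status -x                # terse health summary across all pools
    smartctl -A /dev/sda           # SMART attributes: watch 197 Current_Pending_Sector
                                   # and 5 Reallocated_Sector_Ct growing over time
    smartctl -l scterc /dev/sda    # show the drive's error recovery (TLER/ERC) timeout
    smartctl -l scterc,70,70 /dev/sda   # cap recovery at 7.0 seconds, if supported

As for the question above: ZFS only counts and reports errors per
device; remapping bad sectors remains the drive firmware's job, which
is what the "pending"/reallocation behaviour just described is about.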
>>>> You seem to like the HGST ones a lot.  They seem to cost more
>>>> than the WD reds.
>>>
>>> I prefer them for a very good reason:
>>> http://blog.backblaze.com/2014/01/21/what-hard-drive-should-i-buy/
>>
>> Those guys don't use ZFS.  They must have very good reasons not to.
>
> I don't know what they use.

They're using ext4.

--
Knowledge is volatile and fluid.  Software is power.