
Re: [Xen-users] Cheap IOMMU hardware and ECC support importance

Gordan Bobic <gordan@xxxxxxxxxx> writes:

> On 07/04/2014 07:03 PM, lee wrote:
>> Gordan Bobic <gordan@xxxxxxxxxx> writes:
>>> http://lmgtfy.com/?q=zfs+linux&l=1
>> Funny --- how many findings do you get?  A couple million?
> It's an "I'm feeling lucky" link, so the number of findings you get is
> 1, and it even forwards you straight to it.

I understood your intention to insult me before the animation finished,
so I didn't wait.  Anyway, picking one random result out of a great many
isn't any more helpful than displaying the great many.

>> 'apt-cache search zfs' is *much* more relevant.
> I think that by saying this you have just demonstrated that the rest
> of us that have participated in this thread have largely been wasting
> our time.

If you think so.

>>>> Does ZFS do that?  Since it's about keeping the data safe, it might have
>>>> a good deal of protection against user errors.
>>> I don't think it's possible to guard against user errors. If you're
>>> concerned about user errors, get someone else to manage your machines
>>> and not give you the root password.
>> It is possible to guard.  It's not possible to prevent them.
> Sacrificing productivity for hand-holding is not the *nix
> paradigm. It's competence or bust. I for one don't want every command
> I type to ask "are you sure" before it does what I told it to. All it
> achieves is desensitizing you to the question, and you end up saying y
> automatically after a while, without considering what it even said.

All the commands you issue are such that they destroy whole file
systems?

>>> You don't have to rebuild a pool. The existing pool is modified in
>>> place and that usually takes a few seconds. Typically the pool version
>>> headers get a bump, and from there on ZFS knows it can put additional
>>> metadata in place.
>>> Similar happens when you toggle deduplication on a pool. It puts the
> deduplication hash table headers in place. Even if you remove the volume
>>> that has been deduplicated and don't have any deduplicated blocks
>>> afterwards, the headers will remain in place. But that doesn't break
>>> anything and it doesn't require rebuilding of a pool.
>> Then it should be easy to turn features off later.
> You can, but for example disabling compression on a compressed pool
> won't decompress all the data. It will only make sure the data written
> from that point on isn't compressed. If you want to actually
> decompress the data, you'll have to copy it to an uncompressed file
> system on the same pool, then destroy the old, compressed file system.

Why isn't there a command to uncompress the compressed data?
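For reference, the copy-and-destroy shuffle described above can be spelled out like this (the pool and dataset names are made up, and this is only a sketch of the procedure Gordan outlines, not a tested recipe):

```shell
# Assumed: a pool "tank" with a compressed dataset "tank/data".
zfs create -o compression=off tank/data-plain   # new, uncompressed dataset
rsync -a /tank/data/ /tank/data-plain/          # rewriting the files stores them uncompressed
zfs destroy tank/data                           # drop the old, compressed dataset
zfs rename tank/data-plain tank/data            # take over the old name
```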

> systems. ZFS is the only file system I have used for the data on which
> I never had to reach for backups.

One of the reasons for this may very well be that you know ZFS so well.
I don't know it at all.

We probably have a very different understanding or use of file systems.
I have some data I need stored, so I create a file system (since I can't
very well store the data without one) and store data on it, and that's
all.  I'm not "using" the file system for anything more than that.  The
simpler and the more reliably that works, the better.

You create a file system and use it for making snapshots, adding
features and whatever other complicated things there are, besides
storing some data.  You're using an exotic file system which isn't
widely supported on Linux and run into bugs and problems which you were
lucky to be able to recover from by talking to developers and

If I was to switch to this file system, I'd much more likely than you
make user errors because I don't know this complicated file system, and
I might run into problems or bugs I might not be able to recover from
because I don't have access to the developers or supporters.  The
unknown file system would have the advantage that it could prevent
silent data corruption, which is a problem I haven't noticed yet.  Such
a switch isn't very appealing, as great as this file system might be.

>> Perhaps nothing of what you're saying about ZFS is true ;)
> OK, you got me - I confess: it's all a part of my hidden agenda to
> waste my time debating the issue with someone who hasn't progressed
> beyond using software that isn't in their distribution's package
> repository.

If you think I haven't, it must be true.

>> You seem to have a lot of disks failing.
> I do, but it's slowing down dramatically as I'm running out of Seagates.

Hm, so Seagates haven't changed over the last 20 years ...  Fortunately,
I have two spares for the two I have.

>> At how many errors do you need to replace a disk?
> Depends on the disk.
> One of my Seagates reports the following line in SMART:
>   5 Reallocated_Sector_Ct   0x0033   063   063   036    Pre-fail  Always       -       1549
> So it has 1549 reallocated sectors at the moment. The stats start at
> 100, and the threshold for the disk needing to be replaced is
> 36. AFAIK, these are percentages. So, 1549 sectors = 100-63=37% of
> spare sectors used. That would imply that this particular disk has
> approximately 4100 spare sectors, and that it should be replaced when
> the number of them remaining falls below 36%.
>> Are sectors that had errors being re-used,
> You'll have to ask your disk manufacturer that. WD and Samsung
> might. Or they could just be lying about the number of reallocated
> sectors they have. WD and Samsung drives seem to have a feature where
> a pending sector doesn't convert into a reallocated sector, which
> implies either the reallocation count is lying or the previously
> failed sectors are being re-used if the data sticks to them within the
> limit of ECC's ability to recover.

So the smart numbers don't really give you an answer, unless the disk
manufacturer told you exactly what's actually going on.
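For what it's worth, the percentage arithmetic quoted above does check out; here it is as a quick shell sanity check, using just the three numbers from that SMART line:

```shell
raw=1549    # Reallocated_Sector_Ct raw value from the SMART line above
value=63    # normalized VALUE (starts at 100 on a fresh disk)
thresh=36   # THRESHold below which the attribute counts as failed

used_pct=$((100 - value))               # 37% of the spare pool consumed
total_spares=$((raw * 100 / used_pct))  # ~4186 spare sectors in total
margin=$((value - thresh))              # 27 points left before "replace me"

echo "used: ${used_pct}%  total spares: ~${total_spares}  margin: ${margin}"
```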

I was thinking of errors detected by ZFS, though.  What if you see a
few: do you replace the disk right away, or do you wait until there are
more?
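To put numbers on that question: ZFS keeps per-device READ/WRITE/CKSUM counters, so the usual inspection loop looks something like this (the pool name is hypothetical, and this is a sketch rather than a tested procedure):

```shell
zpool status -v tank    # READ/WRITE/CKSUM columns show per-device error counts
zpool scrub tank        # force a full verification pass against the checksums
zpool status -v tank    # re-check once the scrub has finished
zpool clear tank        # reset the counters if you decide the disk stays
```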

>> or does ZFS keep a list of sectors not to use anymore?
> As I said before, disks handle their own defects, and have done so for
> the past decade or two. File systems have long had no place in keeping
> track of duff sectors on disks.

So ZFS may silently lose redundancy (and in a bad case data), depending
on what the disks do.  And there isn't any way around that, other than
increasing redundancy.

>> That would mean that a disk which has been failed due to error
>> correction taking too long may still be fine.
> Yes. Most disks have some reallocated sectors after a while.

Hm, the failed disk I have must have quite a few of those, because it
was failed repeatedly.  It might be interesting to look at the smart
numbers, and perhaps I should use it to try out ZFS.  It would be a pity
to just waste it.
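A scratch pool on that disk is indeed a cheap way to experiment.  Something along these lines would do, with made-up device and pool names, and assuming the ZFS-on-Linux packages are installed (a sketch, not a tested recipe):

```shell
# WARNING: destroys whatever is on /dev/sdX -- use the expendable disk only.
zpool create testpool /dev/sdX     # single-disk pool, no redundancy
zfs create testpool/scratch        # a dataset to play with
cp -r /some/test/data /testpool/scratch/
zpool scrub testpool               # let ZFS verify every block's checksum
zpool status -v testpool           # any CKSUM errors show up here
```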

Knowledge is volatile and fluid.  Software is power.

Xen-users mailing list
