
Re: [Xen-users] how to start VMs in a particular order



Joost Roeleveld <joost@xxxxxxxxxxxx> writes:

> On Tuesday 01 July 2014 23:48:51 lee wrote:
>> Joost Roeleveld <joost@xxxxxxxxxxxx> writes:
>> > Check the howtos for smartctl, they explain how to interpret the data.
>> > I'd recommend:
>> > http://www.smartmontools.org/
>> 
>> Ok, if I get to see the numbers, I can look there.  I never believed in
>> this SMART thing ...
>
> You just wait for disks to die suddenly?

yes

You stock up on new disks just because SMART might tell you that your
disks will die eventually?
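
That said, when I do get around to looking at the numbers, something
like this little Python sketch should pull out the handful of
attributes people usually watch.  It assumes smartmontools is
installed and needs root; /dev/sda and the attribute names are only
examples:

    #!/usr/bin/env python3
    # Rough sketch: grep a few interesting SMART attributes out of
    # "smartctl -A".  Needs root and smartmontools; /dev/sda and the
    # attribute names below are only examples.
    import subprocess

    WATCH = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
             "Offline_Uncorrectable", "UDMA_CRC_Error_Count")

    out = subprocess.run(["smartctl", "-A", "/dev/sda"],
                         capture_output=True, text=True,
                         check=True).stdout

    for line in out.splitlines():
        if any(attr in line for attr in WATCH):
            print(line)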

>> You have seen three (or more) disks going bad all at the same time just
>> because they were connected to a different controller?
>
> Yes,

And SMART didn't tell you they would go bad? ;)

> it was a cheap controller, but it did actually kill any disk I
> connected to it.

Hm, now that really sucks and is rather unexpected.

> I was working at a computer shop at the time and the owner wanted us to try
> different disks even though the first 2(!) died and those wouldn't work on
> any other system anymore.

I'll keep this in mind ... and in the future, I might as well connect
defective disks to unknown controllers before good ones to see if the
controller kills them.

>> They really aren't the greatest disks one can imagine.  I'd say they are
>> ok for what they are and better than their reputation, considering the
>> price --- you could get them for EUR 65 new a few years ago, maybe even
>> less, before all disk prices increased.  I'll replace them with
>> something suitable when they fail.
>
> For twice that, I got 3TB WD Red drives a few years ago, after the factories 
> came back online.

Are they twice as good?  I know they're quite a bit faster.  However,
when I bought the WD20EARS, there weren't any Red ones, only RE ones,
which, IIRC, cost about 4 times as much as the WD20EARS.  That was just
too much.

>> Systems would go down all the time if exceeding their I/O capacity
>> would make them crash.
>
> It depends on how big the capacity is and how the underlying hardware handles 
> it.

The I/O capacity is either exceeded, or it isn't.  It doesn't matter how
big it is or how the hardware handles it.

Just copy some data from /dev/zero to a file, and you'll exceed the I/O
capacity of your system.  Does it crash?

Start an application like SeaMonkey (with a hundred tabs open).  When
you have a fast CPU and a slow I/O system, doing so will exceed the I/O
capacity of your system.  Does it crash?

Boot some version of MS windoze from an HDD.  That exceeds the I/O
capacity of your system, or otherwise people wouldn't see huge
improvements from booting from SSDs.  When it crashes, is it because the
I/O capacity was exceeded?
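
If anyone wants to try the first experiment, here is a rough Python
sketch of it.  The path and the 4 GiB size are arbitrary, it just
needs that much free space under /tmp; the box should get slow, not
crash:

    #!/usr/bin/env python3
    # Rough sketch of the "copy zeros to a file" example: write a few
    # GiB of zeros to a scratch file to saturate disk I/O.  Path and
    # size are placeholders.
    import os

    CHUNK = b"\0" * (4 * 1024 * 1024)      # 4 MiB of zeros per write
    TOTAL = 4 * 1024 * 1024 * 1024         # 4 GiB in total

    with open("/tmp/io-stress.bin", "wb") as f:
        written = 0
        while written < TOTAL:
            f.write(CHUNK)
            written += len(CHUNK)
        f.flush()
        os.fsync(f.fileno())               # force it out to the disk

    os.remove("/tmp/io-stress.bin")        # clean up the scratch file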

>> without a backplane in the way.  It is probably true that IBM --- and/or
>> Adaptec
>
> I believe you are using an IBM RAID controller, not an Adaptec part.  At
> least, I can't see Adaptec in any of the documentation I saw online.

It's an IBM when you go by the labels and documentation.  Apparently
Adaptec made it (for IBM).

It's rather weird because it's a card that plugs into a special slot,
with apparently some/most of the controller integrated into the board.
Without the board, that card is useless.

>> --- ran into problems with SATA drives connected to the
>> controller they couldn't really solve, for otherwise there wouldn't be a
>> need to implement different PHY settings and even a utility in the
>> controllers' BIOS to let users change them.
>
> The backplane used in these systems, from my understanding, has a port
> multiplier built in. I think it is that part causing the problem.

Hm, did you find any documentation about it?  It would appear to be an
IBM-ESXS VSC7160 enclosure, and I haven't found any documentation for
it.  Apparently there are various drivers for it --- why would those be
needed?

>> The documentation speaks of "different SATA channels" and claims that
>> improvements have been made to the PHY settings, apparently hiding
>> what's actually going on.
>
> SAS and SATA controllers often talk about SATA channels. My RAID controller
> even still calls them IDE channels. It's just a name.

It's obfuscating --- a better explanation would be much more helpful.

>> Anyway, server uptime is 3 days, 9 hours now.  That's a great
>> improvement :)
>> 
>> So, for what it's worth:  For WD20EARS on a ServeRAID 8k, try different
>> PHY settings.  PHY 2 seems to work much better than 0, 1 and 5.
>
> That is useful news, especially if that keeps the system running. Maybe post
> that online somewhere, including on that page?

That was my intention :)  There are archives of this mailing list,
aren't there?

>> > True, but, SATA drives don't always work when used with port multipliers,
>> > which from the above, I think you are actually using.
>> 
>> Hm, I doubt it.  The drive slots are numbered 0--5, and I can set a PHY
>> setting for each drive individually.  Would I be able to do that if a
>> PMP was used?
>
> Yes, the question is, does the PMP used handle that correctly?

How would the RAID controller know which disk is in which slot if they
were all behind a PMP?  Yet it does know that.
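
One way to eyeball whether a PMP is actually in the path is to resolve
the sysfs path of each block device and see which host and port it
hangs off.  Just a sketch, and whether a port multiplier shows up as
an extra link in that path depends on the driver, so treat it as a
hint, not proof:

    #!/usr/bin/env python3
    # Rough sketch: print the resolved sysfs path of each SCSI/SATA
    # block device so you can see which host/port it is attached to.
    import glob
    import os

    for dev in sorted(glob.glob("/sys/block/sd*")):
        print(os.path.basename(dev), "->", os.path.realpath(dev))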

>>  And can a single port keep up with 6 SAS drives?
>
> How many drives do you know of that can provide a sustained datastream of 
> 3Gb/s?
> Or, in the case of 6 drives, 500Mb/s?
> Assuming you have a drive that can sustain 200Mb/s, that still means a single 
> port can theoretically handle 3000 / 200 = 15 disks.
> With SSDs the picture is slightly different. With a sustained read speed of 
> 550Mb/s, you would get nearly 5.5 disks.
>
> So, yes, a single port can easily keep up with 6 SAS drives.

Aren't you confusing Gbit/sec with MB/sec?

3 Gbit/sec divided by 8 gives you GB/sec, i.e. 0.375.  That's 375
MB/sec.  There's some protocol overhead, so you can keep up with three,
perhaps four disks, but not with six.
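
As a back-of-the-envelope check, assuming SATA's 8b/10b line coding
(10 bits on the wire per payload byte) and a guessed ~100 MB/sec
sustained for drives of this class:

    #!/usr/bin/env python3
    # Rough bandwidth budget for one 3 Gbit/s port.  The 8b/10b factor
    # is SATA's line coding; the per-disk rate is only an assumption.
    line_rate_gbit = 3.0                      # SATA II signalling rate
    usable_mb_s = line_rate_gbit * 1000 / 10  # 8b/10b -> ~300 MB/s payload
    per_disk_mb_s = 100                       # assumed sustained throughput

    print(f"usable bandwidth : {usable_mb_s:.0f} MB/s")
    print(f"disks it can feed: {usable_mb_s / per_disk_mb_s:.1f}")

That lands at roughly three disks going flat out, which is about the
same estimate.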

>> Yes --- I have two PHY settings left I can try if I have to.  If that
>> doesn't help, I can look into disabling power saving.
>
> I hope setting 2, as you mentioned above, keeps it stable.

It still hasn't crashed yet :)  I wonder if 3 or 4 might be better ...


-- 
Knowledge is volatile and fluid.  Software is power.

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

