
Re: [Xen-users] how to start VMs in a particular order

Joost Roeleveld <joost@xxxxxxxxxxxx> writes:

> On Sunday 29 June 2014 17:35:17 lee wrote:
>> "J. Roeleveld" <joost@xxxxxxxxxxxx> writes:
>> > Try to read the SMART-values of the disk.
>> I'm not sure how to do that, and what would they tell me?
> Either connect the disks directly to a sata port on a mainboard (normal
> desktop would suffice). Disabling the raid-functionality of the card might
> also suffice.
> Then use (assuming the disk is /dev/sda)
> # smartctl --all /dev/sda

IIRC, there is a way to somehow display the smart info, probably with
arcconf.  I'd rather not use that if it might cause problems.
Connecting them to SATA ports would be going to lengths.  In any case,
I'd get some numbers that won't tell me anything, and that three disks
would suddenly go bad only because they are connected to a different
controller seems very unlikely.
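
For what it's worth, only a handful of the SMART numbers usually matter. A
minimal sketch, assuming smartmontools is installed and the disk is reachable
as a plain SATA device (here /dev/sda). The helper name `smart_key_attrs` is
made up for illustration, and the attribute names are the ones smartctl
commonly prints; they can differ between drive models:

```shell
#!/bin/sh
# smart_key_attrs: hypothetical helper. Reads `smartctl -A` output on stdin
# and prints "NAME RAW_VALUE" for a few attributes that commonly matter.
# Column 2 is the attribute name, column 10 the raw value.
smart_key_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count|Load_Cycle_Count)$/ { print $2, $10 }'
}

# Usage (as root, disk assumed to be /dev/sda):
#   smartctl -A /dev/sda | smart_key_attrs
```

Roughly speaking, non-zero reallocated or pending sector counts point at the
media, a growing UDMA_CRC_Error_Count points at the link or cabling rather
than the disk, and a very high Load_Cycle_Count hints at aggressive head
parking.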

>> I know, they aren't suited for this purpose.  Yet they have been working
>> fine on the P800, and that three disks should decide to go bad in a way
>> that blocks the controller (or whatever happens) every now and then
>> seems unlikely.
> No, it doesn't.

Why not?

These disks might never work with this controller.  That doesn't mean
that they have gone bad.

> Does the error occur after the server has been idle for a while? Or when the 
> disks are being stressed?

I haven't seen any relation between disk usage and crashes.  There seem
to have been different reasons for crashing, i.e. first it would crash
with "swiotlb is full", then with "arcconf seems to hang" and now with
"scsi bus hanging?".

I upgraded the kernel with one from Debian backports, then a couple days
later there was another kernel upgrade when I removed the status
checking.  So it crashed with "scsi bus hanging?", and I changed the PHY
setting of the controller again:

The controller has PHY settings, ranging from 0 to 5, which can be changed
for each disk individually.  They were all on 5 to begin with, and the
controller had trouble detecting the SATA disks on 5.  I changed them
all to 0 because it's the default, and the docs say that's supposed to
work best.  Since then, it hasn't had problems detecting the disks.

It still crashed with PHY on 1, and I'm on 2 now.  It hasn't crashed in
over a day yet [knocks on wood].  If it works now, I'll leave it at 2;
if it crashes again, I'll increase to 3 ...

Apparently this PHY setting is at the lowest level of the SATA protocol
and has something to do with how the link between the devices is
established.  So what happens when the link between a disk and the
controller suddenly goes down and cannot be re-established?

I'd expect the controller to handle that gracefully, especially since
it's hot-plug capable with SAS drives.  Perhaps it blocks, trying to
re-establish the link because the disk is still present, and is
unsuccessful until rebooted.

> If the former, then you need to figure out how to PREVENT the disks from
> entering powersaving mode. It takes time for the disks to spin up again
> afterwards. The raid controller is timing out on access to the disks.
> If the latter, then you might have issues on the drives themselves which the
> drives are trying to solve themselves.
> My guess is that it is the former (e.g. when the server has been idle for a
> while).

It's been idle over night and didn't crash.  Since the disks are
data-only, there isn't anything accessing them unless I do something
with the data.  If it was powersaving causing problems, chances are that
I'd have had problems with it before.

But then, it seems that an SATA link goes down or can go down when a
disk saves power.  So you might be right: disk goes to sleep, controller
cannot re-establish link because of PHY settings, and then things hang.

Is it even possible to disable the power management of WD20EARS in such
a way that the SATA link remains up at all times?  I never did anything
about power management with these disks.
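
A hedged sketch of the knobs that exist on Linux, assuming the disk is
attached to a plain SATA port (these commands probably won't get through the
RAID controller) and that hdparm and idle3-tools are installed. `run` and
`disable_disk_pm` are hypothetical helper names, and whether any of this
keeps the SATA link itself up is not something I can confirm:

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default here) prints the commands instead of
# running them; set DRY_RUN=0 to actually execute them as root.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# disable_disk_pm DEVICE
disable_disk_pm() {
    dev=$1
    # Disable the standby (spin-down) timeout.
    run hdparm -S 0 "$dev"
    # Turn APM off; drives without APM may just report it as unsupported.
    run hdparm -B 255 "$dev"
    # Disable the WD "idle3" head-parking timer (idle3ctl, from idle3-tools);
    # the drive has to be power-cycled before the change takes effect.
    run idle3ctl -d "$dev"
}

# Usage: disable_disk_pm /dev/sdX
```

The hdparm settings are normally lost on power-off and would have to be
reapplied at boot; the idle3 timer, as far as I know, is stored in the drive
itself.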

>> So I think it's more likely an incompatibility of these disks with the
>> ServeRaid controller than the disks being bad, and I'd have to replace
>> all of them.  Or this controller just sucks.
> Yep, incompatibility. Not necessarily with these disks, but with the
> powersaving settings in the disks' firmware. I believe there are tools
> available you could use to adjust those settings. But I have no experience
> with them, and you need to connect the disks directly to a standard sata
> port and use ms windows (as I think those are ms windows tools).

Hm, I don't have windoze.  And won't the settings be lost once the
computer/disk is turned off?

>> IBM has supposedly fixed such issues with firmware updates, and
>> I updated everything I could even before installing the disks.
> Check the settings on the raid card for powersaving/spindown/powerup 
> timeouts/....

IIRC, arcconf said power management for the disks is disabled, and I
think the controller might have spin-up settings to spin up the disks
one after the other when booting.  For now, I don't want to touch
anything and see if it crashes again.  If it does, I'll see what I can
find out about power management.  That pm causes problems seems to make
the most sense now.

> You could try changing the raid controller?

Maybe, over time, if I can get one that fits and which doesn't have the
2TB limit.  I'd have to connect it somehow to the drive enclosure.  The
P800 is a rather big card, and even if I can plug it into the server,
how would I connect it?

>> So there I'm stuck :(  The plan was to have my data on the server.
>> Perhaps I'll have to declare the experiment as failed and sell the
>> server.
> Not necessarily, but I would advise against using green drives in a server
> when using hardware raid cards.

I'd advise against that, too.  None of this was planned when I bought
the WD20EARS; they were bought to be used with software raid.  Suitable
disks would have cost 2.5 times as much.

>> I could probably run the disks as JBOD.  If they are incompatible with
>> the controller, that won't help.
> Try putting the disks through individually to the OS. Then use Linux software 
> raid (mdadm) to do the RAID. That should work better as the RAID-software on 
> the card won't end up with timeout issues after powersaving kicks in.

If it was merely timing issues, the controller should, at worst, fail
the disk, shouldn't it?  If it's issues with the SATA link going away
and not coming back, the problem would persist with JBOD.

I might try JBOD, though, because I'm tempted to switch to ZFS.  But
first the hardware needs to be stable.

>> Perhaps the controller is broken.  Or it's something that xen does.
> Xen has nothing to do with this.
> Most likely: raid-controller <-> disks incompatibility.

Well, "swiotlb is full" looks more like kernel/xen than anything
else. Or it's a symptom caused by an underlying problem like disk
incompatibility.  I'm still undecided about whether this is the kind of
problem that has multiple causes or not.

>> I wish it was a feature of xen --- that would make sense, but how would
>> xen know when a VM is fully up ...
> It can, actually.
> If you have client-utilities running inside the VM, those can easily check
> when the VM is fully booted. (put those to start last, for instance)
> Then those utilities use the xen-api to inform the host.
> Read up on xenfs, it is usable to communicate between the guest and the host.

Client utilities?  Xenfs?  Hmmm ...  Why don't ppl use that?
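
A rough sketch of how starting VMs in order could look with plain xenstore
(which, as far as I understand, is what the guest reaches through xenfs),
assuming the xl toolstack in dom0 and the xenstore command-line utilities in
both dom0 and the guests. The key name `data/boot-complete` and the helper
names are invented for this example, not an established convention:

```shell
#!/bin/sh
# Host side (dom0). KEY is a made-up xenstore key; each guest is assumed to run
#   xenstore-write data/boot-complete 1
# as the very last step of its boot sequence.
KEY=data/boot-complete

# wait_for_guest NAME [TIMEOUT_SECONDS]
# Polls the guest's xenstore subtree until the key reads "1".
wait_for_guest() {
    name=$1
    timeout=${2:-300}
    domid=$(xl domid "$name") || return 1
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # The key does not exist until the guest writes it; ignore that error.
        val=$(xenstore-read "/local/domain/$domid/$KEY" 2>/dev/null) || val=
        [ "$val" = 1 ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# start_in_order CFG...
# Assumes each domain's name equals its config file's basename minus ".cfg".
start_in_order() {
    for cfg in "$@"; do
        xl create "$cfg" || return 1
        wait_for_guest "$(basename "$cfg" .cfg)" || return 1
    done
}

# Usage: start_in_order /etc/xen/db.cfg /etc/xen/web.cfg
```

Polling once a second is crude but keeps the sketch short; it stops at the
first guest that fails to start or doesn't report in within the timeout.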

Knowledge is volatile and fluid.  Software is power.

Xen-users mailing list


