
Re: [Xen-users] how to start VMs in a particular order

  • To: xen-users@xxxxxxxxxxxxx
  • From: Joost Roeleveld <joost@xxxxxxxxxxxx>
  • Date: Mon, 30 Jun 2014 15:19:09 +0200
  • Delivery-date: Mon, 30 Jun 2014 13:19:36 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

On Monday 30 June 2014 13:43:16 lee wrote:

> Joost Roeleveld <joost@xxxxxxxxxxxx> writes:

> > On Sunday 29 June 2014 17:35:17 lee wrote:

> >> "J. Roeleveld" <joost@xxxxxxxxxxxx> writes:

> >> > Try to read the SMART-values of the disk.

> >>

> >> I'm not sure how to do that, and what would they tell me?

> >

> > Either connect the disks directly to a sata port on a mainboard (normal

> > desktop would suffice). Disabling the raid-functionality of the card might

> > also suffice.

> > Then use (assuming the disk is /dev/sda)

> > # smartctl --all /dev/sda


> IIRC, there is a way to somehow display the smart info, probably with

> arcconf. I'd rather not use that if it might cause problems.

> Connecting them to SATA ports would be going to lengths. In any case,

> I'd get some numbers that won't tell me anything, and that three disks

> would suddenly go bad only because they are connected to a different

> controller seems very unlikely.


Check the howtos for smartctl, they explain how to interpret the data.
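To give an idea of what interpreting the data involves, here is a minimal Python sketch (not taken from any howto) that parses the attribute table printed by `smartctl -A` and picks out three attributes whose raw values are commonly watched on failing disks. The helper names, the choice of attributes and the "non-zero raw value is suspicious" rule are assumptions for illustration only.

```python
import subprocess

# Attributes commonly watched for signs of a failing disk.
# This selection is an assumption for illustration, not an official list.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_attributes(text):
    """Map attribute name -> raw value from `smartctl -A` style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value is not a plain integer; ignore it here
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is non-zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


def read_smart(device="/dev/sda"):
    """Run smartctl (needs root and smartmontools) and parse its output."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_attributes(result.stdout)

# Example use on a machine where the disk is visible as /dev/sda:
#   print(suspicious(read_smart("/dev/sda")))
```

An empty result does not prove the disk is healthy; it only means none of the watched counters has started to climb.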

I'd recommend:



> >> I know, they aren't suited for this purpose. Yet they have been working

> >> fine on the P800, and that three disks should decide to go bad in a way

> >> that blocks the controller (or whatever happens) every now and then

> >> seems unlikely.

> >

> > No, it doesn't.


> Why not?


Because I've seen it happen.

WD makes good disks, but those 2TB green drives you are using gave me the largest number of failures I have ever experienced.

I don't even bother sending them back for warranty replacement anymore.


> These disks might never work with this controller. That doesn't mean

> that they have gone bad.


True, incompatibilities always exist.


> > Does the error occur after the server has been idle for a while? Or when

> > the disks are being stressed?


> I haven't seen any relation between disk usage and crashes. There seem

> to have been different reasons for crashing, i. e. first it would crash

> with "swiotlb is full",


That happens when the buffer is full. From a very quick read on the subject (someone with more knowledge, please correct me if I am mistaken), this can happen when the underlying I/O system is not able to keep up.


> then with "arcconf seems to hang" and now with

> "scsi bus hanging?".


These might be different ways of showing the same error, just being passed on to a different subsystem.


> I upgraded the kernel with one from Debian backports, then a couple days

> later there was another kernel upgrade when I removed the status

> checking. So it crashed with "scsi bus hanging?", and I changed the PHY

> setting of the controller again:


> The controller has PHY settings, ranging from 0--5, which can be changed

> for each disk individually. They were all on 5 to begin with, and the

> controller had trouble to detect the SATA disks on 5. I changed them

> all to 0 because it's the default, and the docs say that's supposed to

> work best. Since that, it doesn't have problems detecting the disks.


> It still crashed with PHY on 1, and I'm on 2 now. It hasn't crashed in

> over a day yet [knocks on wood]. If it works now, I'll leave it at 2;

> if it crashes again, I'll increase to 3 ...


Interesting. While googling for the PHY setting, I came across the following URL:



The following comes from there:


The reason your Sata drives are running at 1.5Gb/s vs 3.0Gb/s on your server is because there was a bug in the backplane that caused 30 second freezes under heavy workloads. I have an X3650 which I believe has the same backplane as yours. I'm also running Serveraid 8k.

They found out that the SAS expander that bridges to the Sata devices had a limitation that prevented it from operating at 3.0Gb speeds on SATA II devices. Here is the actual link to the article.




You might want to look into that, as it's the same server and raid-card as you are using.

Do note, the website for that IBM-link does not work at the moment.


> Apparently this PHY setting is at the lowest level of the SATA protocol

> and has something to do with how the link between the devices is

> established. So what happens when the link between a disk and the

> controller suddenly goes down and cannot be re-established?


> I'd expect the controller to handle that gracefully, especially since

> it's hot-plug capable with SAS drives. Perhaps it blocks, trying to

> re-establish the link because the disk is still present, and is

> unsuccessful until rebooted.


True, but SATA drives don't always work when used with port multipliers, which, judging from the above, I think you are actually using.


> > If the former, then you need to figure out how to AVOID the disks to enter

> > powersaving mode. It takes time for the disks to spin up again afterwards.

> > The raid controller is timing out on access to the disks.

> >

> > If the latter, then you might have issues on the drives themselves which

> > the drives are trying to solve themselves.

> >

> > My guess is that it is the former. (e.g. when the server has been idle for

> > a while)


> It's been idle over night and didn't crash. Since the disks are

> data-only, there isn't anything accessing them unless I do something

> with the data. If it was powersaving causing problems, chances are that

> I'd have had problems with it before.


It would depend on how fast data is pushed towards the disks when they need to come out of powersaving.


> But then, it seems that an SATA link goes down or can go down when a

> disk saves power. So you might be right: disk goes to sleep, controller

> cannot re-establish link because of PHY settings, and then things hang.


Yep, it all depends on what is happening. Without proper error logs and reproducible crashes, it will be difficult to determine exactly what is going on.


> Is it even possible to disable the power management of WD20EARS in such

> a way that the SATA link remains up at all times? I never did anything

> about power management with these disks.


Possibly, check the WD website for options, or even contact them directly?


> >> So I think it's more likely an incompatibility of these disks with the

> >> ServeRaid controller than the disks being bad, and I'd have to replace

> >> all of them. Or this controller just sucks.

> >

> > Yep, incompatibility. Not necessarily with these disks, but with the

> > powersaving settings in the disks firmware. I believe there are tools

> > available you could use to adjust those settings. But I have no

> > experience with them and you need to connect the disks directly to a

> > standard sata port and use ms windows. (As I think those are ms windows

> > tools)


> Hm, I don't have windoze. And won't the settings be lost once the

> computer/disk is turned off?


There are 2 reasons why I never did it:

1) The tool actually changes the firmware

2) I swapped all those disks for WD Red drives

The green ones are currently in my desktop machines and I notice performance drops and freezes occasionally.


> >> IBM has supposedly fixed such issues with firmware updates, and

> >> I updated everything I could even before installing the disks.

> >

> > Check the settings on the raid card for powersaving/spindown/powerup

> > timeouts/....


> IIRC, arcconf said power management for the disks is disabled, and I

> think the controller might have spin-up settings to spin up the disks

> one after the other when booting. For now, I don't want to touch

> anything and see if it crashes again. If it does, I'll see what I can

> find out about power management. That pm causes problems seems to make

> the most sense now.


The green disks have power management inside the firmware, which is not easily overridden by the controller or OS.

I tested this by telling the OS to disable all powersaving. The disk vibration stopped even though the OS claimed it was still on.


> > You could try changing the raid controller?


> Maybe, over time, if I can get one that fits and which doesn't have the

> 2TB limit. I'd have to connect it somehow to the drive enclosure. The

> P800 is a rather big card, and even if I can plug it into the server,

> how would I connect it?


What kind of cable is running between the Serveraid (I believe the actual port is on the mainboard, based on the pictures I saw online) and the enclosure?


> >> So there I'm stuck :( The plan was to have my data on the server.

> >> Perhaps I'll have to declare the experiment as failed and sell the

> >> server.

> >

> > Not necessarily, but I would advise against using green drives in a server

> > when using hardware raid cards.


> I'd advise against that, too. None of this was planned when I bought

> the WD20EARS; they were bought to be used with software raid. Suitable

> disks would have cost 2.5 times as much.


How much is your data worth?


> >> I could probably run the disks as JBOD. If they are incompatible with

> >> the controller, that won't help.

> >

> > Try putting the disks through individually to the OS. Then use Linux

> > software raid (mdadm) to do the RAID. That should work better as the

> > RAID-software on the card won't end up with timeout issues after

> > powersaving kicks in.

> If it was merely timing issues, the controller should, at worst, fail

> the disk, shouldn't it? If it's issues with the SATA link going away

> and not coming back, the problem would persist with JBOD.




> I might try JBOD, though, because I'm tempted to switch to ZFS. But

> first the hardware needs to be stable.


Just a quick question.

According to:



The ServeRAID 8K is a raid-controller about the size of a DIMM.

There is no room on it for a SAS port, and in the picture below I see a SAS port right next to the slot. What does the mainboard/BIOS see when you remove the controller?


> >> Perhaps the controller is broken. Or it's something that xen does.

> >

> > Xen has nothing to do with this.

> > Most likely: raid-controller <-> disks incompatibility.


> Well, "swiotlb is full" looks more like kernel/xen than anything

> else. Or it's a symptom caused by an underlying problem like disk

> incompatibility. I'm still undecided about whether this is the kind of

> problem that has multiple causes or not.


To me, it sounds like a buffer filling up because the part emptying the buffer can't keep up.
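A toy illustration of that reasoning, as a minimal Python sketch: a fixed-size buffer that is filled at one rate and emptied at another. The function name, the units and the numbers are made up for illustration; this models the general idea only, not how the kernel's swiotlb actually works.

```python
def steps_until_full(capacity, fill_rate, drain_rate, steps):
    """Return the first step at which the buffer overflows, or None."""
    level = 0
    for step in range(1, steps + 1):
        # Each step adds new I/O to the buffer and removes what was completed.
        level = max(level + fill_rate - drain_rate, 0)
        if level > capacity:
            return step
    return None


# Consumer keeps up: the buffer never fills.
healthy = steps_until_full(capacity=64, fill_rate=10, drain_rate=10, steps=1000)

# Consumer stalls completely (e.g. a disk that stops answering): it fills fast.
stalled = steps_until_full(capacity=64, fill_rate=10, drain_rate=0, steps=1000)
```

The point is that the error can show up in the buffer even though the real cause sits further down, in whatever is supposed to drain it.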


> >> I wish it was a feature of xen --- that would make sense, but how would

> >> xen know when a VM is fully up ...

> >

> > It can, actually.

> >

> > If you have client-utilities running inside the VM, those can check easily

> > when the VM is fully booted. (put those to start last, for instance)

> >

> > Then those utilities use the xen-api to inform the host.

> > Read up on xenfs, it is usable to communicate between the guest and the

> > host.

> Client utilities? Xenfs? Hmmm ... Why don't ppl use that?


Actually, they do.

See XCP or XenServer, the client utilities you install inside the guests there do exactly that.

The thing is, if you want to use native Xen, you need to write the tools yourself.
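As a rough idea of what such a home-grown tool could look like, here is a minimal Python sketch for the host side. It assumes each guest runs `xenstore-write data/booted 1` as the very last step of its boot sequence, and that the host can read that key under `/local/domain/<domid>/`. The key name `data/booted`, the helper names and the timeout values are all assumptions for illustration; this is not how XCP or XenServer implement it.

```python
import subprocess
import time


def run_cmd(args):
    """Run a command; return (exit code, stripped stdout)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def wait_until_booted(name, run=run_cmd, timeout=300, poll=5, sleep=time.sleep):
    """Poll xenstore until the guest has written its (assumed) 'booted' flag."""
    code, domid = run(["xl", "domid", name])
    if code != 0:
        return False
    path = "/local/domain/%s/data/booted" % domid
    waited = 0
    while waited <= timeout:
        code, value = run(["xenstore-read", path])
        if code == 0 and value == "1":
            return True
        sleep(poll)
        waited += poll
    return False


def start_in_order(guests, run=run_cmd, **wait_args):
    """Start each (name, config file) pair; stop at the first that fails."""
    started = []
    for name, cfg in guests:
        code, _ = run(["xl", "create", cfg])
        if code != 0 or not wait_until_booted(name, run=run, **wait_args):
            break
        started.append(name)
    return started

# Example (hypothetical guest names and config paths):
#   start_in_order([("db", "/etc/xen/db.cfg"), ("web", "/etc/xen/web.cfg")])
```

The `run` and `sleep` parameters exist only so the logic can be tried out without a Xen host; on a real dom0 the defaults call `xl` and `xenstore-read` directly.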


Similar methods are used by VMware and VirtualBox.



