
Re: [Xen-users] fresh xcp 1.6 & xenserver install fails on first boot (kernel not found). [SOLVED]



So I've found the issue... But I don't yet understand the issue. If anyone has 
any ideas feel free to share.

To keep a long story short (because it took days and I don't want to bore 
anyone), I ended up A/B testing to narrow down the conditions that reproduced 
the failed install. Near the end I found that it was only happening on 
equipment located in a particular cabinet. Then I narrowed it down to ONE 
machine, and that one machine could produce the problem with either RAID 
controller.

Here's the key - it would only happen when the unit was racked.

To all those who struggle after me, here is how I determined the modules in 
use (when I thought the issue might be a driver missing from the running 
kernel):

lspci -k 

lists each PCI device along with the kernel driver in use for it. Extracting 
the initrd then lets you look for any module that is missing.
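
For anyone repeating that check, here is a rough sketch of how it could be scripted. It is not what I ran at the time. It assumes the initrd is a gzip-compressed cpio archive that has already been unpacked into a directory (for example with something like `zcat /path/to/initrd.img | cpio -idm`), and the function names `drivers_in_use` and `missing_modules` are made up for this example.

```shell
#!/bin/sh
# Sketch only: compare the drivers reported by `lspci -k` against the
# module files found in an already-unpacked initrd directory.
# Assumption: module files are named <driver>.ko (optionally compressed).

# Read `lspci -k` output on stdin, print the unique driver names.
drivers_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p' | sort -u
}

# Read module names on stdin, print those with no .ko file under $1.
missing_modules() {
    tree=$1
    while IFS= read -r mod; do
        [ -n "$mod" ] || continue
        # Module file names may use '-' where the driver name uses '_'.
        alt=$(printf '%s' "$mod" | tr '_' '-')
        if ! find "$tree" \( -name "$mod.ko*" -o -name "$alt.ko*" \) | grep -q .; then
            printf '%s\n' "$mod"
        fi
    done
}
```

Usage would be along the lines of `lspci -k | drivers_in_use | missing_modules /tmp/initrd-unpacked`, where the directory is wherever you unpacked the initrd. Anything printed is only a candidate: a driver compiled into the kernel, or one whose module file is named differently, will also show up as "missing".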

In my case nothing was missing.

So I continued to work on isolating the issue - it turns out it was 
happening ONLY when a specific serial DB9-to-RJ45 adapter is plugged in. That 
same adapter (or at least a similar adapter) is in use on ALL our machines - 
however, if I disconnect or replace the one on this one machine, the problem 
goes away.

I suspect some sort of short in the adapter which is somehow affecting the 
server, but even then I'm stumped as to how something on a serial port is 
affecting a boot process.

I also don't understand why it doesn't affect Windows or VMware booting on the 
same machine (the machine was once a Windows machine, and until the attempted 
reinstall was a functioning ESXi host).

At any rate, I have the unit labelled - and have looked inside for a solder 
blob - can't see anything yet.

It is conclusively the issue though. I can crash the boot process by connecting 
it, and allow it to work properly by disconnecting it.

Thanks to everyone who helped me as I looked for what Ian correctly identified 
as a "red herring" - I have yet to receive any response on any of the other 
lists / forums - I guess the issue was overly unusual.

I'm very grateful that there are people on this list who can respond - even 
when it's outside the scope of the list.

Thanks guys!

Mitch


-----Original Message-----
From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxx] 
Sent: September 24, 2013 2:40 AM
To: Mitch (BitBlock)
Cc: 'eneal@xxxxxxxxxxxxxxxxx'; 'Xen-users@xxxxxxxxxxxxx'; 
'jaceksburghardt@xxxxxxxxx'
Subject: Re: [Xen-users] fresh xcp 1.6 & xenserver install fails on first boot 
(kernel not found).

On Tue, 2013-09-24 at 00:02 +0000, mitch@xxxxxxxxxxxx wrote:
> Someone mentioned I should verify the installed kernel had the proper 
> driver in it (which the iso obviously has as I can mount the file 
> system from it).

I think this is a red-herring.  From the description you have given I don't 
think you are getting anywhere near the point at which the kernel would want a 
driver for the hardware. You are failing at the bootloader stage to even load 
the kernel in the first place.

Are you able to get a full log of the boot, gibberish and all, perhaps using a 
serial console?

Have you validated the content of /boot/extlinux.conf?


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

