
Re: [Wg-test-framework] oseleta[01] apparent BIOS bug (or maybe hardware problem)



Ian, 

We brought the servers back to the office to test.  We rebooted each server
30+ times, first with Ctrl-Alt-Del and then with the [shutdown -r -time now]
command.  They rebooted successfully every time.  There does appear to be a
difference between Oseleta 1 and Oseleta 0, though: Oseleta 1 takes 1 minute
20 seconds longer to reach a login prompt than Oseleta 0.  Both are set up
the same way, and the BIOS revisions and BIOS settings are identical.

We will update the BIOS to the latest revision tomorrow and test again.

Paul

-----Original Message-----
From: Ian Jackson [mailto:Ian.Jackson@xxxxxxxxxxxxx] 
Sent: Wednesday, March 25, 2015 10:05 AM
To: Paul L. George
Cc: 'Don Koch'; 'Lars Kurth'; 'Ian Campbell';
wg-test-framework@xxxxxxxxxxxxxxxxxxxx
Subject: oseleta[01] apparent BIOS bug (or maybe hardware problem)

While doing commissioning tests of oseleta0 and oseleta1 I have found what
appears to be a BIOS bug, where warm reboots sometimes fail.

What happens is this:

 * osstest controller successfully autoinstalls Debian wheezy on test
   box (oseleta[01]).  (Box is configured to netboot; when boot from
   local disk is desired this is achieved by the osstest controller
   installing an appropriate pxelinux configuration on the tftp
   server.)
 * osstest controller installs Xen hypervisor in /boot and adjusts
   bootloader settings, and instructs the machine to reboot (with
   `ssh root@oseleta[01] init 6'.)
 * Test box goes down normally.
 * Serial logs show some BIOS messages relating to the boot.

 * Serial logs show BIOS messages stopping in the middle of the boot
   sequence.

   This is the problem I am reporting, and appears to be a BIOS bug.
   NB that some of the boot messages do not appear, so the problem
   occurs before the system starts to attempt netboot.  It is
   therefore not related to the bootloader or the pxelinux image.

 * osstest controller times out after 400 seconds, and collects
   a copy of the serial log file etc.
 * osstest controller reuses the machine for another test.  The
   next test involves power cycling the machine, after which it
   boots (into the provided network install image) just fine.
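The reboot-and-wait part of the sequence above can be approximated outside
osstest with a small script.  This is only a sketch, not how osstest itself
does it: the helper names (host_is_up, wait_for, warm_reboot_once), the use
of an ssh login as the "machine is up" probe, and the 5 second poll interval
are all assumptions made for illustration.  Only the `init 6' command and
the 400 second timeout come from the description above.

```python
import subprocess
import time


def host_is_up(host):
    """Assumed probe: the host counts as up if a trivial ssh command succeeds."""
    result = subprocess.run(
        ["ssh", "-o", "ConnectTimeout=5", "-o", "BatchMode=yes",
         "root@" + host, "true"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for(condition, timeout=400.0, poll=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true.

    Returns the elapsed time in seconds, or None if the timeout expires.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if condition():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(poll)


def warm_reboot_once(host, timeout=400.0):
    """Request a warm reboot and time it; None means the host never came back."""
    start = time.monotonic()
    # Same reboot request as in the report; the ssh session may be cut off
    # as the host goes down, so its exit status is ignored.
    subprocess.run(["ssh", "root@" + host, "init", "6"], check=False)
    # First wait for the host to stop answering, then for it to answer again.
    if wait_for(lambda: not host_is_up(host), timeout=timeout) is None:
        return None
    if wait_for(lambda: host_is_up(host), timeout=timeout) is None:
        return None
    return time.monotonic() - start
```

Calling warm_reboot_once("oseleta1") in a loop and counting the None results
would give a rough failure rate.  A None result only says the host did not
come back in time; the serial log is still needed to see where the BIOS
stopped.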

The failure seems to happen about one time in ten.  The rest of the time it
seems to work just fine (and then the complete reboot cycle takes ~135s).
Both machines in the pair are affected with roughly equal probability.
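
As a rough guide to how many warm reboots a reproduction attempt needs at
this failure rate, the following sketch computes the chance of seeing no
failure at all in a run of reboots.  It assumes each reboot fails
independently with the same probability, which the report does not
establish; the function names are illustrative.

```python
import math


def prob_no_failure(p_fail, n_reboots):
    """Chance that n_reboots independent reboots all succeed."""
    return (1.0 - p_fail) ** n_reboots


def reboots_needed(p_fail, confidence):
    """Smallest number of reboots giving at least `confidence` chance
    of seeing one or more failures."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


# With a failure rate of about one in ten:
print(round(prob_no_failure(0.1, 30), 3))   # about 0.042
print(reboots_needed(0.1, 0.95))            # 29
print(reboots_needed(0.1, 0.99))            # 44
```

Under these assumptions, a clean run of 30 reboots on one machine would
happen by chance only about 4% of the time if the one-in-ten rate applied
to that test setup.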

Examples of the failure include:

  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemuu-ovmf-amd64/info.html
  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemuu-ovmf-amd64/serial-oseleta1.log

  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemut-win7-amd64/info.html
  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemut-win7-amd64/serial-oseleta0.log

  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-amd64-xl-multivcpu/info.html
  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-amd64-xl-multivcpu/serial-oseleta1.log

  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemuu-winxpsp3/info.html
  http://logs.test-lab.xenproject.org/osstest/logs/50196/test-amd64-i386-xl-qemuu-winxpsp3/serial-oseleta0.log

(In each case, scroll to the bottom of the serial log.)

Paul, can you please raise this problem with your supplier and/or Dell?

Thanks,
Ian.


_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework


 

