George: The motherboards for rimava 0 & 1 have arrived; I am waiting on the AMD
chips. I sent the spec on the board to AMD.
Waiting for AMD chips to arrive
Paul:
I received the AMD chips this morning and will forward them on to the supplier, who will have them tomorrow. The motherboards were delivered yesterday. We should have the upgraded servers back next week.
ACTION: Lars to check whether these have been sent (e.g. get tracking number)
=== oseleta[01] apparent BIOS bug (or maybe hardware problem) ===
Ian: While doing commissioning tests of oseleta0 and oseleta1 I have found
what appears to be a BIOS bug, where warm reboots sometimes fail.
Ian: cannot cover this in the call
ACTION: Paul to raise issue with DELL - search for a BIOS update first and then report back
ACTION: (as a second step) Lars can check whether the Linux Foundation has technical contacts at DELL. Mentioning that we are a LF Collab project has helped speed up things with Lenovo
Paul: Had not seen the issue
Ian: Not surprised as it happens 1 out of 10 times
Paul: could also be a serial port interaction problem
Ian, Don: Not very likely
Ian: Issue appears on both machines, which is why I think a HW issue is not likely
Don: has AMT issues - can talk to one but can't connect to the other one
Don: Was able to do web connection to one machine, but
Ian: what we have to be able to do
* Access to BIOS settings, then don't need to use AMT
* Need a config of a BIOS that allows us BIOS access, but otherwise doesn't interfere with SERIAL access
Ian recommends going through the web access and checking that the settings are identical. If there are discrepancies, then that will give us a clue
Ian: does serial work most of the time?
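A minimal sketch of such a comparison, assuming the settings of each machine have been copied by hand from the web interface into dictionaries. The function name `diff_settings` and all setting names and values below are hypothetical placeholders, not actual values from the machines:

```python
def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key that is missing on one side is reported with None on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical placeholder values, for illustration only.
machine_a = {"serial_redirection": "enabled", "baud_rate": "115200", "amt": "enabled"}
machine_b = {"serial_redirection": "enabled", "baud_rate": "9600"}

for key, (va, vb) in diff_settings(machine_a, machine_b).items():
    print(f"{key}: {va!r} vs {vb!r}")
```

Only the settings that differ are printed, so any discrepancy between the two machines stands out immediately.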
ACTION: Don, George to follow up
George: Lenovo closed the case, but can re-open
Ian: Is there a problem with the serial connection?
Paul: Found a bad RJ45 connection (huxelrebe1), which is resolved BUT has not yet been verified
ACTION: Don to verify that the RJ45 connection on huxelrebe1 has been fixed and report back
Ian: Elbling[01] and Pinot[01] have the correct settings now
Ian: discovered some issues when running OSSTEST, but Ian is handling them. Will contact George / Don as needed as part of commissioning tests. Found a few failures that are hard to explain, but could be explained by temporary failures of the switch or freezes of the OSSTEST VM. Have not yet confirmed.
Ian: Don, have you noticed any unexplained network lag, freezes, etc.?
Don: has not seen/noticed anything
Ian: not tied to any particular time. Could have some Heisenbugs in tests / OSSTEST which have not surfaced earlier.
Ian: Can't completely rule out a knocked cable as Don was in the COLO on Tue. Not likely though.
Ian: Also switch logs do not contain anything specific. If it is an infrastructure issue, it is likely a networking problem.
Ian: no blocker for going live
ACTION: Ian to watch further, otherwise no action
=== ETA for going live ===
Ian: have discovered at least 2 or 3 pairs of machines that are affected by kernel bugs / Xen bugs
Ian: Have 10 machines that can go into production
Ian: Maybe going live with 10 machines in the next 2-3 days
Ian: Overall capacity and performance slightly better than the old one
ACTION: Lars to bring up at team meeting to plan PR
Ian: had issues with some ARM boards, but these turned out to be firmware problems
Ian: confirmed to be fine now
George: wants to move one power cable (which is in the way) to be able to tidy cabling
ACTION: George to confirm with Ian which cable and negotiate timing as machine is live and in use
Ian: has to book out machine from test pool, then take it back in after cable is moved
Don: asked which machines are live
Ian: all machines are live that were handed over to me