
Re: [Xen-devel] [PATCH V4 2/2] x86, amd_ucode: Skip microcode updates for final levels

On 8/12/2015 4:38 AM, Jan Beulich wrote:
On 11.08.15 at 21:11, <aravind.gopalakrishnan@xxxxxxx> wrote:
Some older (Fam10h) systems require that certain applied
microcode patch levels not be overwritten by
the microcode loader. Otherwise, system hangs are known to occur.

The 'final_levels' of patch ids have been obtained empirically.
Refer to bug https://bugzilla.suse.com/show_bug.cgi?id=913996
for details of the issue.

The short version is that people have predominantly noticed
system hang issues when trying to update microcode levels
beyond the patch IDs below.
[0x01000098, 0x0100009f, 0x010000af]

From internal discussions, we gathered that the OS/hypervisor
cannot reliably perform microcode updates beyond these levels
due to hardware issues. Therefore, we need to abort the microcode
update process if we hit any of these levels.

While the patch itself looks fine now, I'm still hesitant to take this
(even more so after having read through the bugzilla entry
linked to above): The list being established empirically - will it be
ever growing?

No. Although the list was established empirically, it is not going to keep growing
(see below).

  Did you internally gain understanding of what it
actually is that goes wrong (and hence can perhaps narrow
down the conditions for the hangs to occur)? Have there been
any checks whether indeed _all_ systems at the listed ucode
levels are affected?

Yeah, HW architects mentioned they are aware of the problem and that it's relevant only on Fam10h (which is why the list would not grow). And they verified that it affects all the systems at the listed microcode levels.

Also that's leaving aside the question of what unfixed CPU issues
people now being prevented from doing the ucode update are
going to run into.

Right. So, the problem is that the user may have a perfectly normal working system at the listed microcode levels, but the hang occurs *only* when you try to update the patch level. Hence the recommendation from the HW architects to hold the microcode at its current level from the OS/hypervisor point of view and not go ahead with the update process.

If the user has to update the microcode levels beyond these levels, then BIOS updates are an option.
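
To illustrate the check described above, here is a minimal sketch in C. The function names (is_final_level, may_update_microcode) are made up for this example and are not the identifiers used in the patch. Reading the current patch level from the CPU is left out; the level is simply passed in as a parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Patch IDs listed in this thread. Updating from any of these levels
 * is reported to hang Fam10h systems. */
static const uint32_t final_levels[] = {
    0x01000098,
    0x0100009f,
    0x010000af,
};

/* Hypothetical helper: returns true if the currently applied patch
 * level is one of the final levels and must not be overwritten. */
static bool is_final_level(uint32_t current_level)
{
    size_t n = sizeof(final_levels) / sizeof(final_levels[0]);

    for (size_t i = 0; i < n; i++) {
        if (final_levels[i] == current_level)
            return true;
    }
    return false;
}

/* Hypothetical gate for the loader: returns true if an update may be
 * attempted, false if it has to be skipped. Assumption for this sketch:
 * the caller has already read the current patch level from the CPU. */
static bool may_update_microcode(uint32_t current_level)
{
    return !is_final_level(current_level);
}
```

A loader built this way would call may_update_microcode() before applying a patch and bail out early when it returns false, leaving the system at its current level.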


