Xen project Mailing List

Re: [Xen-devel] [Patch v3 2/2] x86/microcode: Synchronize late microcode loading

To: Jan Beulich <JBeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Fri, 18 May 2018 15:21:14 +0800

Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Ashok Raj <ashok.raj@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxx, Jun Nakajima <jun.nakajima@xxxxxxxxx>, tglx@xxxxxxxxxxxxx, Borislav Petkov <bp@xxxxxxx>

Delivery-date: Fri, 18 May 2018 07:26:38 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, May 16, 2018 at 07:46:48AM -0600, Jan Beulich wrote:
>>>> On 16.05.18 at 15:25, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 16/05/18 14:10, Jan Beulich wrote:
>>>> +static int do_microcode_update(void *_info)
>>>> +{
>>>> +    struct microcode_info *info = _info;
>>>> +    unsigned int cpu = smp_processor_id();
>>>> +    int ret;
>>>> +
>>>> +    ret = wait_for_cpus(&info->cpu_in, MICROCODE_DEFAULT_TIMEOUT);
>>>> +    if ( ret )
>>>> +        return ret;
>>>> +
>>>> +    /*
>>>> +     * Logical threads which set the first bit in cpu_sibling_mask can do
>>>> +     * the update. Other sibling threads just await the completion of
>>>> +     * microcode update.
>>>> +     */
>>>> +    if ( !cpumask_test_and_set_cpu(
>>>> +              cpumask_first(per_cpu(cpu_sibling_mask, cpu)), &info->cpus) )
>>>> +        ret = microcode_update_cpu(info->buffer, info->buffer_size);
>>>> +    /*
>>>> +     * Increase the wait timeout to a safe value here since we're serializing
>>>> +     * the microcode update and that could take a while on a large number of
>>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>>> +     * the last CPU finished updating and thus cut short
>>>> +     */
>>>> +    if ( wait_for_cpus(&info->cpu_out, MICROCODE_DEFAULT_TIMEOUT *
>>>> +                                       nr_phys_cpus) )
>>> I remain unconvinced that this is a safe thing to do on a huge system with
>>> guests running (even Dom0 alone would seem risky enough). I continue to

I think there are other operations that may also endanger the security
and stability of the whole system. We offer them with caveats. The same
applies here: three different methods can be used to update microcode,
and the late update isn't perfect at this moment. At least we provide a
more reliable method to update microcode at runtime on systems with not
so many cores. For a huge system, admins can assess the risk and choose
the most suitable method. They can completely avoid doing live updates,
mandate a reboot, and do the update early, since that's the most
dependable method.

>>> hope for comments from others, in particular Andrew, here. At the very
>>> least I think you should taint the hypervisor when making it here.
>>
>> I see nothing in this patch which prevents a deadlock against the time
>> calibration rendezvous. It think its fine to pause the time calibration
>> rendezvous while performing this update.
>
>If there's a problem here, wouldn't that be a general one with
>stop_machine()?

I agree with Jan. It shouldn't be specific to the stop_machine() here.
Anyhow, I will look into the potential deadlock you mentioned.

>
>> Also, what is the purpose of serialising the updates while all pcpus are
>> in rendezvous?

microcode_mutex, which prevents doing the updates in parallel, is not
introduced by this patch. At present, we want to keep this patch and the
update process simple. Could we just make it work first and try to work
out some optimizations later?

>> Surely at that point the best option is to initiate an
>> update on all processors which don't have an online sibling thread with
>> a lower thread id.
>
>I've suggested that before.

I think Andrew's suggestion here is similar to the method which this
patch is using.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
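For readers following the sibling-thread discussion above, here is a minimal user-space sketch of the "one updater per core" election that the quoted comment describes. It is not the patch's code: `first_sibling()`, `claimed`, `claim_update()`, `count_updaters()` and the fixed two-threads-per-core topology are all hypothetical stand-ins, and C11 atomics are assumed in place of Xen's `cpumask_test_and_set_cpu()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS          8u  /* hypothetical: 4 cores x 2 threads */
#define THREADS_PER_CORE 2u  /* hypothetical fixed topology */

/*
 * Hypothetical stand-in for cpumask_first(per_cpu(cpu_sibling_mask, cpu)):
 * the lowest-numbered logical CPU on the same core.
 */
static unsigned int first_sibling(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu - (cpu % THREADS_PER_CORE);
}

/* Bitmap of cores already claimed; plays the role of info->cpus. */
static atomic_uint claimed;

/*
 * Atomically set the bit of this CPU's first sibling. Returns true only
 * for the first caller on each core, whichever thread arrives first;
 * later siblings see the bit already set and skip the update.
 */
static bool claim_update(unsigned int cpu)
{
    unsigned int bit = 1u << first_sibling(cpu);
    unsigned int old = atomic_fetch_or(&claimed, bit);

    return !(old & bit);
}

/* Count how many of the NR_CPUS callers would perform the update. */
static unsigned int count_updaters(void)
{
    unsigned int cpu, n = 0;

    atomic_store(&claimed, 0);
    for ( cpu = 0; cpu < NR_CPUS; cpu++ )
        if ( claim_update(cpu) )
            n++;

    return n;
}
```

Under these assumptions exactly one thread per core wins the claim, regardless of which sibling gets there first, which is why the approach resembles the "no online sibling with a lower thread id" rule raised in the thread without being identical to it.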
