[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 4/6] microcode: don't call apply_microcode() in cpu_request_microcode()
On Thu, Nov 29, 2018 at 10:46:05AM +0100, Roger Pau Monné wrote: >On Thu, Nov 29, 2018 at 12:28:46PM +0800, Chao Gao wrote: >> On Wed, Nov 28, 2018 at 04:02:25PM +0100, Roger Pau Monné wrote: >> >On Wed, Nov 28, 2018 at 01:34:14PM +0800, Chao Gao wrote: >> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c >> >> index 8350d22..cca7b2c 100644 >> >> --- a/xen/arch/x86/microcode.c >> >> +++ b/xen/arch/x86/microcode.c >> >> @@ -233,20 +233,12 @@ int microcode_resume_cpu(unsigned int cpu) >> >> return err; >> >> } >> >> >> >> -static int microcode_update_cpu(const void *buf, size_t size) >> >> +static int microcode_update_cpu(void) >> >> { >> >> int err; >> >> - unsigned int cpu = smp_processor_id(); >> >> - struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu); >> >> >> >> spin_lock(µcode_mutex); >> >> - >> >> - err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig); >> >> - if ( likely(!err) ) >> >> - err = microcode_ops->cpu_request_microcode(cpu, buf, size); >> >> - else >> >> - __microcode_fini_cpu(cpu); >> >> - >> >> + err = microcode_ops->apply_microcode(smp_processor_id()); >> >> spin_unlock(µcode_mutex); >> >> >> >> return err; >> >> @@ -259,7 +251,7 @@ static long do_microcode_update(void *_info) >> >> >> >> BUG_ON(info->cpu != smp_processor_id()); >> >> >> >> - error = microcode_update_cpu(info->buffer, info->buffer_size); >> >> + error = microcode_update_cpu(); >> > >> >Why don't you just set info->error = microcode_update_cpu()? >> > >> >AFAICT this is done to attempt to update the remaining CPUs if one >> >update failed? >> >> Yes. But this patch doesn't change the logic here. Actually, if HT is >> enabled and microcode is shared between the logical threads of the same >> core, so if one thread updates microcode successfully, its sibling would >> always fail in current logic. I am trying to explain why we cannot abort >> the update even though an error is met in current logic. It definitely >> can be solved by tweaking the logic slightly. >> >> > >> >Is there anyway to rollback to the previous state so all CPUs have the >> >same microcode? >> >> Seems it is not allowed to load a microcode with numeratically smaller >> revision according to 9.11.7.2. >> >> With patch 6, a panic() would be triggered if some cpus failed to do the >> update. I didn't try to change the logic here. >> >> >I assume nothing good will come out of running a >> >system with CPUs using different microcode versions, but maybe that's >> >not so bad? >> >> It is better that all CPUs have the same microcode revision. >> >> Linux kernel rejects late microcode update if finding some CPUs >> offlined. I may port this patch to Xen too in a separate patch. > >What happens with hotplug CPUs? > >Even if you disable late loading when there are offlined CPUs you >still need to handle hotplug CPUs, which IMO should share the same >path with offlined CPUs: the microcode update should be done ASAP >after bringing the CPU up. > In linux, CPU's being offline is just a logical offline. It may participate actions like MCE. It would lead to a situation that some cpus are using old microcode while some are using the new one, which introduces instability. CPU hotplug doesn't have such issue. >> > >> >> if ( error ) >> >> info->error = error; >> >> >> >> @@ -276,6 +268,8 @@ int >> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long >> >> len) >> >> { >> >> int ret; >> >> struct microcode_info *info; >> >> + unsigned int cpu = smp_processor_id(); >> >> + struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu); >> >> >> >> if ( len != (uint32_t)len ) >> >> return -E2BIG; >> >> @@ -294,10 +288,6 @@ int >> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long >> >> len) >> >> return ret; >> >> } >> >> >> >> - info->buffer_size = len; >> >> - info->error = 0; >> >> - info->cpu = cpumask_first(&cpu_online_map); >> >> - >> >> if ( microcode_ops->start_update ) >> >> { >> >> ret = microcode_ops->start_update(); >> >> @@ -308,6 +298,26 @@ int >> >> microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long >> >> len) >> >> } >> >> } >> >> >> >> + spin_lock(µcode_mutex); >> >> + >> >> + ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig); >> >> + if ( likely(!ret) ) >> >> + ret = microcode_ops->cpu_request_microcode(cpu, info->buffer, >> >> len); >> >> + else >> >> + __microcode_fini_cpu(cpu); >> >> + >> >> + spin_unlock(µcode_mutex); >> > >> >Why do you need to hold the lock here? >> > >> >microcode_update is already serialized and should only be executed on >> >a CPU at a time due to the usage of chained >> >continue_hypercall_on_cpu. >> >> microcode_resume_cpu() also uses the 'uci' and the global microcode cache. >> This lock is to prevent them happening simultaneously (someone is >> adding/replacing entries to a list and another is reading the list). >> All existing call sites of collec_cpu_info() and cpu_request_microcode() >> are protected with this lock. > >I mean, there should be some kind of protection to prevent the list >from changing at all while there's an update in progress, or else you >risk using different versions for different CPUs if there's a list >addition while a microcode update is in progress? Both adding to and using the microcode cache are done with microcode_mutex held. So I don't think it can happen. Thanks Chao _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |