
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading



On Mon, Apr 16, 2018 at 04:26:09AM -0600, Jan Beulich wrote:
>>>> On 16.04.18 at 08:20, <chao.gao@xxxxxxxxx> wrote:
>> On Fri, Apr 13, 2018 at 09:49:17AM -0600, Jan Beulich wrote:
>>>>>> On 30.03.18 at 08:59, <chao.gao@xxxxxxxxx> wrote:
>>>> +static int do_microcode_update(void *_info)
>>>> +{
>>>> +    struct microcode_info *info = _info;
>>>> +    int error, ret = 0;
>>>> +
>>>> +    error = __wait_for_cpus(&info->cpu_in, USEC_PER_SEC);
>>>
>>>Why this long a timeout here?
>> 
>> I just used the same timeout as the patch on the Linux kernel side. As we
>> know, if the timeout is too small, updating microcode is likely to
>> fail even if other CPUs only disable interrupts temporarily.
>> 
>> If you object to such a long timeout (Xen may need much less time to
>> rendezvous all CPUs than the Linux kernel does, because Xen doesn't
>> have device drivers which may malfunction), how about just using the
>> default timeout, 30ms, used by live patching? If that is also not
>> good enough, then we can make it an option which comes from the callers.
>
>Yes, 30ms is likely to be acceptable.
>
>>>> +    if ( error )
>>>> +    {
>>>> +        ret = -EBUSY;
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    error = microcode_update_cpu(info->buffer, info->buffer_size);
>>>> +    if ( error && !ret )
>>>> +        ret = error;
>>>> +    /*
>>>> +     * Increase the wait timeout to a safe value here since we're serializing
>>>> +     * the microcode update and that could take a while on a large number of
>>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>>> +     * the last CPU finished updating and thus cut short
>>>> +     */
>>>> +    error = __wait_for_cpus(&info->cpu_out, USEC_PER_SEC * num_online_cpus());
>>>
>>>And this one's even worse, in particular on huge systems. I'm afraid such a
>>>long period of time in stop-machine context is going to confuse most of the
>>>running domains (including Dom0). There's nothing inherently wrong with e.g.
>>>processing the updates on distinct cores (and even more so on distinct
>>>sockets) in parallel.
>> 
>> I cannot say for sure. But the original patch does want the microcode
>> update to be performed one-by-one.
>
>And they don't restrict the "when" aspect in any way? I.e. all sorts of
>applications (and perhaps even KVM guests) may be running?

There is no restriction on the timing, I think. Ashok, could you confirm this?
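
As for the timeout itself, I plan something along these lines (a rough
sketch only; __wait_for_cpus(), microcode_update_cpu() and struct
microcode_info are from the patch under discussion, while
MICROCODE_DEFAULT_TIMEOUT_US is a placeholder name):

#define MICROCODE_DEFAULT_TIMEOUT_US 30000 /* 30ms, matching live patching */

static int do_microcode_update(void *_info)
{
    struct microcode_info *info = _info;
    int ret;

    /* Rendezvous all online CPUs before touching the microcode. */
    if ( __wait_for_cpus(&info->cpu_in, MICROCODE_DEFAULT_TIMEOUT_US) )
        return -EBUSY;

    ret = microcode_update_cpu(info->buffer, info->buffer_size);

    /*
     * The exit rendezvous keeps a larger budget because the update itself
     * is serialized across CPUs; the actual wait is cut short by the last
     * CPU to finish.
     */
    if ( __wait_for_cpus(&info->cpu_out,
                         MICROCODE_DEFAULT_TIMEOUT_US * num_online_cpus()) &&
         !ret )
        ret = -EBUSY;

    return ret;
}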

>
>>>Therefore revising the locking in microcode_update_cpu() might be a necessary
>>>prereq step.
>> 
>> Do you mean changing it to a per-core or per-socket lock?
>
>Either changing to such, or introducing a second, more fine grained lock.

I will introduce a per-core lock. Then, as an optimization, only one
thread in a core will try to update the microcode. Since I am not sure
whether it is permitted to update microcode on distinct cores in
parallel, I am inclined to keep the original logic.
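
Roughly, what I have in mind is the following (a minimal sketch only:
cpu_sibling_mask is the existing per-CPU sibling map, each lock is
assumed to be initialized during CPU bringup, and
microcode_update_this_cpu() stands in for the existing vendor-specific
update path, which keeps its current serialization):

/* One lock per core, keyed by the first thread of each core. */
static DEFINE_PER_CPU(spinlock_t, microcode_core_lock);

static int microcode_update_cpu(const void *buf, size_t size)
{
    unsigned int cpu = smp_processor_id();
    unsigned int first = cpumask_first(per_cpu(cpu_sibling_mask, cpu));
    spinlock_t *lock = &per_cpu(microcode_core_lock, first);
    int err;

    /*
     * HT siblings of a core serialize here, so only the first of them
     * ends up doing real work; the later ones find the revision already
     * current and return early.
     */
    spin_lock(lock);
    err = microcode_update_this_cpu(buf, size);
    spin_unlock(lock);

    return err;
}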

Thanks
Chao

>
>>>Or alternatively you may need to demand that no other running
>>>domains exist besides Dom0 (and hope the best for Dom0 itself).
>>>
>>>I also don't think there's any point invoking the operation on all HT 
>>>threads on a
>>>core, but I realize stop_machine_run() isn't flexible enough to allow such.
>> 
>> Only one thread in a core will do the actual update. The other threads
>> only check the microcode version, find that it is already up-to-date,
>> and then exit the critical region. Considering that the check shouldn't
>> take much time, I wonder whether skipping the operation on the other HT
>> threads of a core would significantly improve the overall time?
>
>With better parallelism this would be less of an issue. Plus, as said - the
>infrastructure we have at present wouldn't allow anyway what I've
>described above, and hand-rolling another "flavor" of the stop-machine
>logic is quite certainly out of the question. To answer your question though:
>Taking as the example a 4-threads-per-core system with sufficiently
>many cores, I'm afraid the overall time spent in handling the
>"uninteresting" threads may become measurable. But of course, if actual
>numbers said otherwise, I'd be fine.
>
>
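
For completeness, the sibling-thread check mentioned above could look
roughly like this (Intel-only, illustrative sketch; the MSR sequence
follows the SDM, and new_rev is assumed to come from the parsed update
blob):

/* Latch and read the currently loaded microcode revision (Intel). */
static uint32_t current_ucode_rev(void)
{
    uint64_t rev;

    wrmsrl(MSR_IA32_UCODE_REV, 0);
    cpuid_eax(1);                     /* serializing; latches the revision */
    rdmsrl(MSR_IA32_UCODE_REV, rev);

    return rev >> 32;
}

/* Early in the per-CPU update path: */
static bool core_already_updated(uint32_t new_rev)
{
    /* A sibling that updated first leaves the core's revision current. */
    return current_ucode_rev() >= new_rev;
}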
