
Re: [Xen-devel] [PATCH 1/2] x86/microcode: Synchronize late microcode loading



On Mon, Apr 16, 2018 at 04:26:09AM -0600, Jan Beulich wrote:
>>>> On 16.04.18 at 08:20, <chao.gao@xxxxxxxxx> wrote:
>> On Fri, Apr 13, 2018 at 09:49:17AM -0600, Jan Beulich wrote:
>>>>>> On 30.03.18 at 08:59, <chao.gao@xxxxxxxxx> wrote:
>>>> +static int do_microcode_update(void *_info)
>>>> +{
>>>> +    struct microcode_info *info = _info;
>>>> +    int error, ret = 0;
>>>> +
>>>> +    error = __wait_for_cpus(&info->cpu_in, USEC_PER_SEC);
>>>
>>>Why this long a timeout here?
>> 
>> I just used the same timeout as the patch on the Linux kernel side. As we
>> know, if the timeout is too small, updating microcode is likely to
>> fail even if other CPUs only disable interrupts temporarily.
>> 
>> If you object to such a long timeout (Xen may need much less time to
>> rendezvous all CPUs than the Linux kernel does, because Xen doesn't
>> have device drivers which may malfunction), how about just using the
>> default timeout, 30ms, used by live patching? If that is also not
>> good enough, then we can make it an option which comes from the callers.
>
>Yes, 30ms is likely to be acceptable.
>
>>>> +    if ( error )
>>>> +    {
>>>> +        ret = -EBUSY;
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    error = microcode_update_cpu(info->buffer, info->buffer_size);
>>>> +    if ( error && !ret )
>>>> +        ret = error;
>>>> +    /*
>>>> +     * Increase the wait timeout to a safe value here since we're serializing
>>>> +     * the microcode update and that could take a while on a large number of
>>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>>> +     * the last CPU finished updating and thus cut short
>>>> +     */
>>>> +    error = __wait_for_cpus(&info->cpu_out, USEC_PER_SEC * num_online_cpus());
>>>
>>>And this one's even worse, in particular on huge systems. I'm afraid such a
>>>long period of time in stop-machine context is going to confuse most of the
>>>running domains (including Dom0). There's nothing inherently wrong with e.g.
>>>processing the updates on distinct cores (and even more so on distinct
>>>sockets) in parallel.
>> 
>> I cannot say for sure. But the original patch does want the microcode
>> update to be performed one-by-one.
>
>And they don't restrict the "when" aspect in any way? I.e. all sorts of
>applications (and perhaps even KVM guests) may be running?

There is no restriction on the timing, I think. Ashok, could you confirm this?
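
As for the timeout itself, I plan something along these lines (a rough
sketch only; __wait_for_cpus(), microcode_update_cpu() and struct
microcode_info are from the patch under discussion, while
MICROCODE_DEFAULT_TIMEOUT_US is a placeholder name):

#define MICROCODE_DEFAULT_TIMEOUT_US 30000 /* 30ms, matching live patching */

static int do_microcode_update(void *_info)
{
    struct microcode_info *info = _info;
    int ret;

    /* Rendezvous all online CPUs before touching the microcode. */
    if ( __wait_for_cpus(&info->cpu_in, MICROCODE_DEFAULT_TIMEOUT_US) )
        return -EBUSY;

    ret = microcode_update_cpu(info->buffer, info->buffer_size);

    /*
     * The exit rendezvous keeps a larger budget because the update itself
     * is serialized across CPUs; the actual wait is cut short by the last
     * CPU to finish.
     */
    if ( __wait_for_cpus(&info->cpu_out,
                         MICROCODE_DEFAULT_TIMEOUT_US * num_online_cpus()) &&
         !ret )
        ret = -EBUSY;

    return ret;
}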

>
>>>Therefore revising the locking in microcode_update_cpu() might be a necessary
>>>prereq step.
>> 
>> Do you mean changing it to a per-core or per-socket lock?
>
>Either changing to such, or introducing a second, more fine grained lock.

I will introduce a per-core lock. Then, as an optimization, only one
thread in a core will try to update the microcode. Since I am not sure
whether it is permitted to update microcode on distinct cores in
parallel, I am inclined to keep the original logic.
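
Roughly, what I have in mind is the following (a minimal sketch only:
cpu_sibling_mask is the existing per-CPU sibling map, each lock is
assumed to be initialized during CPU bringup, and
microcode_update_this_cpu() stands in for the existing vendor-specific
update path, which keeps its current serialization):

/* One lock per core, keyed by the first thread of each core. */
static DEFINE_PER_CPU(spinlock_t, microcode_core_lock);

static int microcode_update_cpu(const void *buf, size_t size)
{
    unsigned int cpu = smp_processor_id();
    unsigned int first = cpumask_first(per_cpu(cpu_sibling_mask, cpu));
    spinlock_t *lock = &per_cpu(microcode_core_lock, first);
    int err;

    /*
     * HT siblings of a core serialize here, so only the first of them
     * ends up doing real work; the later ones find the revision already
     * current and return early.
     */
    spin_lock(lock);
    err = microcode_update_this_cpu(buf, size);
    spin_unlock(lock);

    return err;
}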

Thanks
Chao

>
>>>Or alternatively you may need to demand that no other running
>>>domains exist besides Dom0 (and hope the best for Dom0 itself).
>>>
>>>I also don't think there's any point invoking the operation on all HT 
>>>threads on a
>>>core, but I realize stop_machine_run() isn't flexible enough to allow such.
>> 
>> Only one thread in a core will do the actual update. The other threads
>> only check the microcode version, find that it is already up-to-date,
>> and then exit the critical region. Considering that the check shouldn't
>> take much time, I wonder whether skipping the operation on the other HT
>> threads of a core would significantly improve the overall time?
>
>With better parallelism this would be less of an issue. Plus, as said - the
>infrastructure we have at present wouldn't allow anyway what I've
>described above, and hand-rolling another "flavor" of the stop-machine
>logic is quite certainly out of the question. To answer your question though:
>Taking as the example a 4-threads-per-core system with sufficiently
>many cores, I'm afraid the overall time spent in handling the
>"uninteresting" threads may become measurable. But of course, if actual
>numbers said otherwise, I'd be fine.
>
>
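
For completeness, the sibling-thread check mentioned above could look
roughly like this (Intel-only, illustrative sketch; the MSR sequence
follows the SDM, and new_rev is assumed to come from the parsed update
blob):

/* Latch and read the currently loaded microcode revision (Intel). */
static uint32_t current_ucode_rev(void)
{
    uint64_t rev;

    wrmsrl(MSR_IA32_UCODE_REV, 0);
    cpuid_eax(1);                     /* serializing; latches the revision */
    rdmsrl(MSR_IA32_UCODE_REV, rev);

    return rev >> 32;
}

/* Early in the per-CPU update path: */
static bool core_already_updated(uint32_t new_rev)
{
    /* A sibling that updated first leaves the core's revision current. */
    return current_ucode_rev() >= new_rev;
}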
