On 06/05/2019 15:14, Jan Beulich wrote:
>>>> On 06.05.19 at 14:23, <jgross@xxxxxxxx> wrote:
>> On 06/05/2019 13:58, Jan Beulich wrote:
>>>>>> On 06.05.19 at 12:20, <jgross@xxxxxxxx> wrote:
>>>> On 06/05/2019 12:01, Jan Beulich wrote:
>>>>>>>> On 06.05.19 at 11:23, <jgross@xxxxxxxx> wrote:
>>>>>> On 06/05/2019 10:57, Jan Beulich wrote:
>>>>>>> . Yet then I'm a little puzzled by its use here in the first place.
>>>>>>> Generally I think for_each_cpu() uses in __init functions are
>>>>>>> problematic, as they then require further code elsewhere to
>>>>>>> deal with hot-onlining. A pre-SMP-initcall plus use of CPU
>>>>>>> notifiers is typically more appropriate.
>>>>>> And that was mentioned in the cover letter: cpu hotplug is not yet
>>>>>> handled (hence the RFC status of the series).
>>>>>> When cpu hotplug is being added it might be appropriate to switch the
>>>>>> scheme as you suggested. Right now the current solution is much more
>>>>>> simple.
>>>>> I see (I did notice the cover letter remark, but managed to not
>>>>> honor it when writing the reply), but I'm unconvinced if incurring
>>>>> more code churn by not dealing with things the "dynamic" way
>>>>> right away is indeed the "more simple" (overall) solution.
>>>> Especially with hotplug things are becoming more complicated: I'd like
>>>> to have the final version fall back to smaller granularities in case
>>>> e.g. the user has selected socket scheduling and two sockets have
>>>> different numbers of cores. With hotplug such a situation might be
>>>> discovered only with some domUs already running, so how should we
>>>> react in that case? Doing panic() is no option, so either we reject
>>>> onlining the additional socket, or we adapt by dynamically modifying the
>>>> scheduling granularity. Without that being discussed I don't think it
>>>> makes sense to put a lot of effort into a solution which is going to be
>>>> rejected in the end.
>>> Hmm, where's the symmetry requirement coming from? Socket
>>> scheduling should mean as many vCPU-s on one socket as there
>>> are cores * threads; similarly core scheduling (number of threads).
>>> Statically partitioning domains would seem an intermediate step
>>> at best only anyway, as that requires (on average) leaving more
>>> resources (cores/threads) idle than with a dynamic partitioning
>>> model.
>> And that is exactly the purpose of core/socket scheduling. How else
>> would it be possible (in future) to pass through the topology below
>> the scheduling granularity to the guest?
> True. Albeit nevertheless an (at least) unfortunate limitation.
>> And how should it be of any
>> use for fighting security issues due to side channel attacks?
> From Xen's pov all is still fine afaict. It's the lack of (correct)
> topology exposure (as per above) which would make guest
> side mitigation impossible.
>>> As to your specific question how to react: Since bringing online
>>> a full new socket implies bringing online all its cores / threads one
>>> by one anyway, a "too small" socket in your scheme would
>>> simply result in the socket remaining unused until "enough"
>>> cores/threads have appeared. Similarly the socket would go
>>> out of use as soon as one of its cores/threads gets offlined.
>> Yes, this is a possible way to do it. It should be spelled out,
>> though.
>>> Obviously this ends up problematic for the last usable socket.
>> Yes, like today for the last cpu/thread.
> Well, only kind of. It's quite expected that the last thread
> can't be offlined. I'd call it rather unexpected that a random
> thread on the last socket can't be offlined just because each
> other socket also has a single offline thread: There might
> still be hundreds of online threads in this case, after all.

You'd need to offline the related thread in all active guests. Otherwise
(from the guest's point of view) a cpu suddenly disappears.
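
To spell that condition out, here is a minimal sketch of such a check. It
is only a model under stated assumptions, not Xen code: struct guest, its
fields and slot_unused_by_all_guests() are hypothetical names, and I assume
that vCPU i of a guest occupies thread slot (i % gran) of its scheduling
unit, with gran being the number of vCPUs per unit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not Xen code: a guest with per-vCPU online flags. */
struct guest {
    const bool *vcpu_online;   /* vcpu_online[i] is true if vCPU i is online */
    size_t nr_vcpus;
};

/*
 * Return true if thread slot 'slot' could be taken away without any guest
 * seeing a cpu disappear, i.e. no guest has an online vCPU in that slot of
 * any of its scheduling units.
 *
 * Assumption: vCPU i sits in slot (i % gran) of its scheduling unit, where
 * 'gran' is the number of vCPUs per unit (e.g. 2 for core scheduling with
 * two threads per core).
 */
static bool slot_unused_by_all_guests(const struct guest *guests,
                                      size_t nr_guests,
                                      size_t gran, size_t slot)
{
    if (gran == 0 || slot >= gran)
        return false;   /* invalid request: refuse rather than guess */

    for (size_t g = 0; g < nr_guests; g++) {
        /* Visit only the vCPUs that map to the requested slot. */
        for (size_t v = slot; v < guests[g].nr_vcpus; v += gran) {
            if (guests[g].vcpu_online[v])
                return false;   /* this guest would lose an online cpu */
        }
    }

    return true;
}
```

In this model a single guest with the matching vCPU still online is enough
to block the operation.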

>>> But with the static partitioning you describe I also can't really
>>> see how "xen-hptool smt-disable" is going to work.
>> It won't work. It just makes no sense to use it with core scheduling
>> active.
> Why not? Disabling HT may be for purposes other than mitigating
> vulnerabilities like L1TF. And the system is in a symmetric state
> at the beginning and end of the entire operation; it's merely
> intermediate state which doesn't fit the expectations you set forth.

It is like bare metal: You can't physically unplug a single thread. This
is possible only for complete sockets.

It would theoretically be possible to have a test whether all guests
have the related cpus offlined in order to offline them in Xen. IMHO
this would be overkill: as an admin you have to decide whether you want
to use core scheduling or you want the ability to switch off SMT on the

You can still boot e.g. with sched-gran=socket and smt=off.

Another possibility would be to make sched-gran and SMT per cpupool.
In that case I'd like to make those attributes static at creation time of
the cpupool, though.


