
Re: [Xen-devel] [PATCH RFC V2 45/45] xen/sched: add scheduling granularity enum

  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Mon, 6 May 2019 15:29:24 +0200
  • Cc: Tim Deegan <tim@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 06 May 2019 13:29:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 06/05/2019 15:14, Jan Beulich wrote:
>>>> On 06.05.19 at 14:23, <jgross@xxxxxxxx> wrote:
>> On 06/05/2019 13:58, Jan Beulich wrote:
>>>>>> On 06.05.19 at 12:20, <jgross@xxxxxxxx> wrote:
>>>> On 06/05/2019 12:01, Jan Beulich wrote:
>>>>>>>> On 06.05.19 at 11:23, <jgross@xxxxxxxx> wrote:
>>>>>> On 06/05/2019 10:57, Jan Beulich wrote:
>>>>>>> . Yet then I'm a little puzzled by its use here in the first place.
>>>>>>> Generally I think for_each_cpu() uses in __init functions are
>>>>>>> problematic, as they then require further code elsewhere to
>>>>>>> deal with hot-onlining. A pre-SMP-initcall plus use of CPU
>>>>>>> notifiers is typically more appropriate.
>>>>>> And that was mentioned in the cover letter: cpu hotplug is not yet
>>>>>> handled (hence the RFC status of the series).
>>>>>> When cpu hotplug is being added it might be appropriate to switch the
>>>>>> scheme as you suggested. Right now the current solution is much more
>>>>>> simple.
>>>>> I see (I did notice the cover letter remark, but managed to not
>>>>> honor it when writing the reply), but I'm unconvinced if incurring
>>>>> more code churn by not dealing with things the "dynamic" way
>>>>> right away is indeed the "more simple" (overall) solution.
>>>> Especially with hotplug things are becoming more complicated: I'd like
>>>> to have the final version fall back to smaller granularities in case
>>>> e.g. the user has selected socket scheduling and two sockets have
>>>> different numbers of cores. With hotplug such a situation might be
>>>> discovered only with some domUs already running, so how should we
>>>> react in that case? Doing panic() is no option, so either we reject
>>>> onlining the additional socket, or we adapt by dynamically modifying the
>>>> scheduling granularity. Without that being discussed I don't think it
>>>> makes sense to put a lot of effort into a solution which is going to be
>>>> rejected in the end.
>>> Hmm, where's the symmetry requirement coming from? Socket
>>> scheduling should mean as many vCPU-s on one socket as there
>>> are cores * threads; similarly core scheduling (number of threads).
>>> Statically partitioning domains would seem an intermediate step
>>> at best only anyway, as that requires (on average) leaving more
>>> resources (cores/threads) idle than with a dynamic partitioning
>>> model.
>> And that is exactly the purpose of core/socket scheduling. How else
>> would it be possible (in future) to pass through the topology below
>> the scheduling granularity to the guest?
> True. Albeit nevertheless an (at least) unfortunate limitation.
>> And how should it be of any
>> use for fighting security issues due to side channel attacks?
> From Xen's pov all is still fine afaict. It's the lack of (correct)
> topology exposure (as per above) which would make guest
> side mitigation impossible.
>>> As to your specific question how to react: Since bringing online
>>> a full new socket implies bringing online all its cores / threads one
>>> by one anyway, a "too small" socket in your scheme would
>>> simply result in the socket remaining unused until "enough"
>>> cores/threads have appeared. Similarly the socket would go
>>> out of use as soon as one of its cores/threads gets offlined.
>> Yes, this is a possible way to do it. It should be spelled out,
>> though.
>>> Obviously this ends up problematic for the last usable socket.
>> Yes, like today for the last cpu/thread.
> Well, only kind of. It's quite expected that the last thread
> can't be offlined. I'd call it rather unexpected that a random
> thread on the last socket can't be offlined just because each
> other socket also has a single offline thread: There might
> still be hundreds of online threads in this case, after all.

You'd need to offline the related thread in all active guests. Otherwise
(from the guest's point of view) a cpu suddenly disappears.

>>> But with the static partitioning you describe I also can't really
>>> see how "xen-hptool smt-disable" is going to work.
>> It won't work. It just makes no sense to use it with core scheduling
>> active.
> Why not? Disabling HT may be for purposes other than mitigating
> vulnerabilities like L1TF. And the system is in a symmetric state
> at the beginning and end of the entire operation; it's merely
> intermediate state which doesn't fit the expectations you set forth.

It is like bare metal: You can't physically unplug a single thread. This
is possible only for complete sockets.

It would theoretically be possible to have a test whether all guests
have the related cpus offlined in order to offline them in Xen. IMHO
this would be overkill: as an admin you have to decide whether you want
to use core scheduling or you want the ability to switch off SMT on the fly.

You can still boot e.g. with sched-gran=socket and smt=off.
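
As a sketch only: on a system with Debian-style GRUB packaging (an
assumption; the file and variable name may differ on other distributions),
these two options could be passed to the hypervisor as shown here.
sched-gran is the option proposed in this series.

```
# /etc/default/grub (Debian-style; regenerate grub.cfg afterwards)
GRUB_CMDLINE_XEN_DEFAULT="sched-gran=socket smt=off"
```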

Another possibility would be to make sched-gran and SMT per cpupool.
In that case I'd like to make those attributes static at creation time of
the cpupool, though.
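
To make the hotplug rule quoted above more concrete (a socket stays unused
until "enough" cores/threads have appeared, and goes out of use as soon as
one of them is offlined), here is a minimal sketch. It is not Xen code:
struct socket_state and the three helpers are hypothetical names invented
for illustration, and the real series would have to tie this into the CPU
notifiers and the scheduler's data structures.

```c
#include <stdbool.h>

/* Hypothetical per-socket bookkeeping; not an existing Xen structure. */
struct socket_state {
    unsigned int expected_threads; /* cores * threads needed for socket granularity */
    unsigned int online_threads;   /* threads of this socket currently online */
};

/* A socket may be used for scheduling only while it is fully populated. */
static bool socket_usable(const struct socket_state *s)
{
    return s->expected_threads != 0 &&
           s->online_threads == s->expected_threads;
}

/* Account for one thread coming online; returns whether the socket is usable now. */
static bool thread_online(struct socket_state *s)
{
    if (s->online_threads < s->expected_threads)
        s->online_threads++;
    return socket_usable(s);
}

/* Account for one thread going offline; the socket drops out of use at once. */
static bool thread_offline(struct socket_state *s)
{
    if (s->online_threads > 0)
        s->online_threads--;
    return socket_usable(s);
}
```

With expected_threads set to 4, the first three thread_online() calls
return false and the fourth returns true; a single thread_offline() call
then makes the socket unusable again. This also shows the problem case
discussed above: if every socket has one thread offline, no socket is
usable, regardless of how many threads are still online in total.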

