Re: [Xen-devel] L1TF, and future work

On Wed, 2018-08-15 at 14:17 +0100, Andrew Cooper wrote:
> Hello,
> Now that the embargo on XSA-273 is up, we can start publicly discussing
> the remaining work to do, because there is plenty to do.  In no
> particular order...
> [...]
> 5) Core-aware scheduling.  At the moment, Xen will schedule arbitrary
> guest vcpus on arbitrary hyperthreads.  This is bad and wants fixing.
> I'll defer to Dario for further details.
Yes. So, basically, making sure that, if we have hyperthreading, only
vCPUs from one domain are, at any given time, concurrently running on
the threads of a core, acts as a form of mitigation.

As a reference, check how this is mentioned in L1TF writeups coming
from other hypervisors that have (or are introducing) support for this



(MS' Hyper-V's core-scheduler is also mentioned in one of the Intel's

It's not a *complete* mitigation, and, e.g., the other measures (like
the L1D flushing on VMEnter) are still required, but it helps
prevent a VM from being able to read/steal data from another VM.

As an example, if we have VM 1 and VM 2, with four vCPUs each, and a
two core system with hyperthreading, i.e., cpu 0 and cpu 1 are threads
of core 0, while cpu 2 and cpu 3 are threads of core 1, we want to
schedule the vCPUs, for instance, like this:

cpu0 <-- d2v3
cpu1 <-- d2v1
cpu2 <-- d1v2
cpu3 <-- d1v0

and not like this:

cpu0 <-- d1v2
cpu1 <-- d2v3
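
To make the rule concrete, here is a small Python sketch of the check
that separates the two layouts above. It is only an illustration, not
Xen code: the CORE_OF table, the dNvM name parsing and the function
names are all made up for this example.

```python
# Illustrative only: not Xen code. Topology and names are assumptions.
# Map each pCPU (hyperthread) to the core it belongs to.
CORE_OF = {0: 0, 1: 0, 2: 1, 3: 1}

def domain_of(vcpu):
    """Extract the domain id from a name like 'd2v3' -> 2."""
    return int(vcpu[1:vcpu.index("v")])

def is_core_safe(assignment, core_of=CORE_OF):
    """Return True if no core runs vCPUs of more than one domain.

    `assignment` maps pCPU -> vCPU name, or None for an idle pCPU.
    """
    domains_on_core = {}
    for cpu, vcpu in assignment.items():
        if vcpu is None:
            continue  # an idle thread never conflicts with its sibling
        domains_on_core.setdefault(core_of[cpu], set()).add(domain_of(vcpu))
    return all(len(doms) <= 1 for doms in domains_on_core.values())

good = {0: "d2v3", 1: "d2v1", 2: "d1v2", 3: "d1v0"}  # one domain per core
bad = {0: "d1v2", 1: "d2v3"}                         # core 0 is shared
```

With these definitions, is_core_safe(good) holds and is_core_safe(bad)
does not, because in the second layout the two threads of core 0 run
vCPUs of different domains.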

Of course, this means that, if only d1v2, from VM 1, is active and
wants to run, while all four vCPUs of VM 2 are active and want to
run too, we can end up in this situation:

cpu0 <-- d1v2
cpu1 <-- _idle_
cpu2 <-- d2v1
cpu3 <-- d2v3

wanting_to_run: d2v0, d2v2

I.e., there are ready to run vCPUs, there is an idle pCPU, but we can't
run them there. This is not ideal, but is, at least in theory, better
than disabling hyperthreading entirely. (Again, these are all just

Of course, this makes the scheduling much more complicated, especially
when it comes to fairness considerations and to avoiding starvation.
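
As a rough illustration of the trade-off, here is a toy Python sketch
of a greedy, first-come placement that keeps one domain per core. It is
not how Xen (or the RFC series) does it: the CORES table, the place()
function and the dNvM name parsing are assumptions made for this
example, and it deliberately ignores fairness and starvation.

```python
# Illustrative only: not Xen code. Topology and names are assumptions.
CORES = {0: [0, 1], 1: [2, 3]}  # core -> its hyperthreads (pCPUs)

def domain_of(vcpu):
    """Extract the domain id from a name like 'd2v3' -> 2."""
    return int(vcpu[1:vcpu.index("v")])

def place(runnable, cores=CORES):
    """Greedily place runnable vCPUs, allowing one domain per core.

    Returns (assignment, waiting): assignment maps pCPU -> vCPU name
    (None if idle); waiting lists vCPUs that could not be placed.
    """
    assignment = {cpu: None for threads in cores.values() for cpu in threads}
    core_domain = {}  # core -> domain currently running on it
    waiting = []
    for vcpu in runnable:
        dom = domain_of(vcpu)
        for core, threads in cores.items():
            if core_domain.get(core, dom) != dom:
                continue  # a sibling thread runs another domain
            free = [c for c in threads if assignment[c] is None]
            if free:
                assignment[free[0]] = vcpu
                core_domain[core] = dom
                break
        else:
            waiting.append(vcpu)  # no compatible thread is free
    return assignment, waiting

assignment, waiting = place(["d1v2", "d2v1", "d2v3", "d2v0", "d2v2"])
```

With that input, cpu1 stays idle next to d1v2 while d2v0 and d2v2 wait,
which is the situation shown above. A real scheduler would also have to
decide when to take a core away from one domain and give it to another,
and that is where the fairness and starvation problems come in.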

I do have an RFC level patch series that starts implementing this
"core-scheduling", which I have shared with someone during the
embargo, and which I will post here on xen-devel later.

Note that I'll be off for ~2 weeks, effective next Monday, so feel free
to comment, reply, etc, but expect me to reply back only in September.

