[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC Design Doc] Intel L2 Cache Allocation Technology (L2 CAT) Feature enabling



On Fri, May 13, 2016 at 06:17:53PM +0200, Dario Faggioli wrote:
> On Fri, 2016-05-13 at 10:23 +0100, Andrew Cooper wrote:
> > On 13/05/16 09:55, Jan Beulich wrote:
> > > 
> > > But anyway, L2 or L3 - I can't see how this context switching would
> > > DTRT when there are vCPU-s of different domains on the same
> > > socket (or core, if L2s and MSRs were per-core): The one getting
> > > scheduled later onto a socket (core) would simply overwrite what
> > > got written for the one which had been scheduled earlier.
> > PSR_ASSOC is a per-thread MSR which selects the CLOS to use.  CLOS is
> > currently managed per-domain in Xen, and context switched with vcpu.
> > 
> Yep, exactly. I did look a bit into this for CMT (so, not L3 CAT, but
> it's not that different).
> 
> Doing things on a per-vcpu basis could be interesting, and that's even
> more the case if we get to do L2 stuff, but there are two few RMIDs
> available for such a configuration to be really useful.
> 
> > Xen programs different capacity bitmaps into IA32_L2_QOS_MASK_0 ...
> > IA32_L2_QOS_MASK_n, and the CLOS selects which bitmap is enforced.
> > 
> So, basically, just to figure out if I understood (i.e., this is for He
> Chen).
> 
> If we have a 2 sockets, with socket 0 containing cores 0,1,2,3 and
> socket 1 containing cores 4,5,6,7, it will be possible to specify two
> different "L2 reservation values" (in the form of CBMs, of course), for
> a domain:
>  - one would be how much L2 cache the domain will be able to use (say 
>    X) when running on socket 1, which means on cores 0,1,2 or 3
>  - another would be how much L2 cache the domain will be able to (say, 
>    Y use when running on socket 2, which means on cores 4,5,6, or 7
> 
> Which in turn means, in case L2 is per core, the domain will get X of
> core 0's L2, X of core 1's L2, X of core 2's L2 and X of core 3's L2.
> On socket 1, it will get Y of core 4' L2, Y of core 5's L2 cache etc.
> etc.
> 
> And so, in summary what we would not be able to specify is a different
> value for the L2 reservations of, for instance, core 1 and core 3
> (i.e., of cores that are part of the same socket).
> 
> Does this summary make sense?

Yes, great instance and that is exactly how L3 CAT work now.
Let's see the source to make it clear:
```
void psr_ctxt_switch_to(struct domain *d)
{
    ...
    if ( psra->cos_mask )
        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
                      d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
                      0, psra->cos_mask);
    ...
}
```
`psr_cos_ids` is indexed by socket_id which leads to the per-socket
cache enforcement.

As Andrew said, CLOS is currently managed per-domain in Xen and it works
well so far. So in initial design, I am inclined to continue this
behavior (per-socket) to L2 CAT to keep the consistency between L2 and
L3 CAT. Any thoughts?

Thanks,
-He


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.