
Re: [Xen-devel] [PATCH v10 1/6] x86: detect and initialize Cache QoS Monitoring feature



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Monday, March 31, 2014 10:22 PM
> To: Xu, Dongxiao
> Cc: andrew.cooper3@xxxxxxxxxx; Ian.Campbell@xxxxxxxxxx;
> Ian.Jackson@xxxxxxxxxxxxx; stefano.stabellini@xxxxxxxxxxxxx;
> xen-devel@xxxxxxxxxxxxx; konrad.wilk@xxxxxxxxxx; dgdegra@xxxxxxxxxxxxx;
> keir@xxxxxxx
> Subject: Re: [PATCH v10 1/6] x86: detect and initialize Cache QoS Monitoring feature
> 
> >>> On 26.03.14 at 07:35, <dongxiao.xu@xxxxxxxxx> wrote:
> > Detect platform QoS feature status and enumerate the resource types,
> > one of which is to monitor the L3 cache occupancy.
> >
> > Also introduce a Xen grub command line parameter to control the
> > QoS feature status.
> >
> > Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> 
> Is that really correct considering the re-work?

Will remove this.

> 
> > +static int cqm_add_socket(int socket)
> 
> unsigned int.

This function returns negative values in the error case, hence the signed return type.
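If the comment was about the socket parameter rather than the return type, I could change just the parameter, e.g. (sketch only):

    /* Sketch: unsigned parameter; signed return kept for -ENOMEM etc. */
    static int cqm_add_socket(unsigned int socket)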

> 
> > +{
> > +    unsigned int i, order;
> > +
> > +    if ( cqm->l3c[socket] )
> > +        return 0;
> > +
> > +    /* Allocate per-socket CQM LLC buffer */
> > +    order = get_order_from_bytes((cqm->rmid_max + 1) * sizeof(unsigned long));
> > +    cqm->l3c[socket] = alloc_xenheap_pages(order, 0);
> 
> This is not in an __init function, hence you should try hard to avoid
> order-non-0 allocations (I think I said this in reply to an earlier revision
> already). Nor is it really clear to me why you need Xen heap pages
> here.

In the v9 discussion, we (Andrew, you and me) agreed on this approach for 
allocating the per-socket CQM LLC buffer.
Here we need to allocate the CQM LLC buffer when a newly added processor 
populates a previously empty socket.
Yes, I agree that non-zero-order allocations should be avoided wherever 
possible, but the path above only triggers when a physical processor is hot 
added, which I think is rare. Besides, even if this allocation fails, nothing 
else is impacted (the cpu_callback() function always returns NOTIFY_DONE).
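If the order-non-0 allocation is still a concern even for the hot-add path, 
one hypothetical fallback would be to allocate page by page (a sketch only; 
the l3c_pg[] per-page pointer array is invented for illustration):

    unsigned int i, nr = DIV_ROUND_UP((cqm->rmid_max + 1) *
                                      sizeof(unsigned long), PAGE_SIZE);

    for ( i = 0; i < nr; i++ )
    {
        void *p = alloc_xenheap_page();

        if ( !p )
            goto out_free;          /* unwind pages already allocated */
        clear_page(p);
        cqm->l3c_pg[socket][i] = p; /* hypothetical per-page pointer array */
    }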

> 
> > +static int cpu_callback(
> > +    struct notifier_block *nfb, unsigned long action, void *hcpu)
> > +{
> > +    unsigned int cpu = (unsigned long)hcpu;
> > +    int socket = cpu_to_socket(cpu);
> > +
> > +    if ( socket < 0 )
> > +        return NOTIFY_DONE;
> > +
> > +    switch ( action )
> > +    {
> > +    case CPU_ONLINE:
> > +        cqm_add_socket(socket);
> > +        break;
> > +    case CPU_DEAD:
> > +        if ( !cpumask_weight(per_cpu(cpu_core_mask, cpu)) )
> > +            cqm_remove_socket(socket);
> > +        break;
> > +    default:
> > +        break;
> 
> Pointless default case.

I will remove the default case from my code.
It is worth mentioning, though, that most cpu_callback() functions in the 
current Xen code base contain such a default case.
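For reference, the switch without the default case would read:

    switch ( action )
    {
    case CPU_ONLINE:
        cqm_add_socket(socket);
        break;
    case CPU_DEAD:
        if ( !cpumask_weight(per_cpu(cpu_core_mask, cpu)) )
            cqm_remove_socket(socket);
        break;
    }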

> 
> > +void __init init_cqm(unsigned int rmid_max, unsigned long rmid_mask)
> > +{
> > +    unsigned int rmid, cpu, socket;
> > +    unsigned int eax, edx;
> > +    unsigned int i, order = 0;
> > +
> > +    if ( !rmid_max )
> > +        return;
> > +
> > +    cqm = xzalloc(struct pqos_cqm);
> > +    if ( !cqm )
> > +        return;
> > +
> > +    cpuid_count(0xf, 1, &eax, &cqm->upscaling_factor, &cqm->rmid_max, &edx);
> > +    if ( !(edx & QOS_MONITOR_EVTID_L3) )
> > +        goto out;
> > +
> > +    cqm->rmid_mask = rmid_mask;
> > +    cqm->rmid_inuse = 0;
> > +    cqm->rmid_min = 1;
> > +    cqm->rmid_max = min(rmid_max, cqm->rmid_max);
> > +
> > +    spin_lock_init(&cqm->cqm_lock);
> > +
> > +    /* According to Intel SDM, the possible maximum rmid number is 2^10 = 1024,
> > +     * thus one page is enough to hold cqm->rmid_to_dom structure */
> > +    cqm->rmid_to_dom = alloc_xenheap_page();
> 
> But please validate that this condition is being met (not necessarily here,
> perhaps rather where rmid_max gets determined).

Okay, I will add an ASSERT() to validate this condition.
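Something along these lines, relying on the SDM's 2^10 RMID bound:

    /* One page must suffice for the whole rmid_to_dom[] mapping. */
    ASSERT((cqm->rmid_max + 1) * sizeof(domid_t) <= PAGE_SIZE);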

> 
> > +    if ( !cqm->rmid_to_dom )
> > +        goto out;
> > +    /* Reserve RMID 0 for all domains not being monitored */
> > +    cqm->rmid_to_dom[0] = DOMID_XEN;
> > +    for ( rmid = 1; rmid < PAGE_SIZE/sizeof(domid_t); rmid++ )
> > +        cqm->rmid_to_dom[rmid] = DOMID_INVALID;
> > +    /* Dom0 tool stack needs to know the RMID to DOMID mapping */
> > +    share_xen_page_with_privileged_guests(
> > +        virt_to_page((void *)cqm->rmid_to_dom), XENSHARE_readonly);
> 
> Do you really need this bogus cast?

Will remove it.
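I.e. the call simply becomes:

    share_xen_page_with_privileged_guests(
        virt_to_page(cqm->rmid_to_dom), XENSHARE_readonly);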

> 
> > +
> > +    /* Allocate the buffer that holds MFNs of per-socket CQM LLC */
> > +    order = get_order_from_bytes(sizeof(unsigned long) * NR_CPUS);
> > +    cqm->socket_l3c_mfn = alloc_xenheap_pages(order, 0);
> 
> You index this buffer by socket ID, yet socket ID values aren't
> necessarily linear/contiguous, and hence I don't think there are
> really any guarantees that the ID would be less than NR_CPUS. For
> the n-th time: I don't think you'll get around doing this properly, and
> I still think sufficient data should be retrievable from ACPI to
> determine a proper upper bound here.

I think we need to assume that the socket ID is less than NR_CPUS; otherwise 
a lot of existing Xen code is wrong too, e.g. the credit2 scheduler:
xen/common/sched_credit2.c: init_pcpu()
        rqi = cpu_to_socket(cpu);
        rqd = prv->rqd + rqi;
Here prv->rqd is defined as:
        struct csched_runqueue_data rqd[NR_CPUS];

As for getting socket information from ACPI, do you mean the MADT table?
It enumerates the existing APIC structures, which presupposes that the CPU is 
already plugged into its socket.
As far as I understand, it cannot enumerate empty CPU sockets.

Can you elaborate on how to obtain the total number of sockets from ACPI?
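Until then, would a defensive check like the following (sketch) be acceptable 
to guard the array access?

    /* Reject out-of-range socket IDs until a proper ACPI-derived
     * upper bound is available. */
    if ( socket >= NR_CPUS )
        return -ERANGE;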

> 
> > +    if ( !cqm->socket_l3c_mfn )
> > +        goto out;
> > +    memset(cqm->socket_l3c_mfn, 0, PAGE_SIZE * (1 << order));
> 
> PAGE_SIZE << order

Okay.
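I.e.:

    memset(cqm->socket_l3c_mfn, 0, PAGE_SIZE << order);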

Thanks,
Dongxiao

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel