[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 5/5] tools, docs: add total/local memory bandwith monitoring



On Wed, 2015-01-28 at 16:04 +0800, Chao Peng wrote:
> @@ -1476,6 +1483,8 @@ detach: Detach the platform shared resource monitoring 
> service from a domain.
>  Show monitoring data for a certain domain or all domains. Current supported
>  monitor types are:
>   - "cache-occupancy": showing the L3 cache occupancy.
> + - "total-mem-bandwidth": showing the total memory bandwidth.
> + - "local-mem-bandwidth": showing the local memory bandwidth.

Some mention of the units might be useful here (and I suspect elsewhere
in the interface too).

> @@ -156,11 +158,12 @@ int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, 
> uint32_t cpu,
>  }
>  
>  int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
> -                        xc_psr_cmt_type type, uint64_t *monitor_data)
> +                        xc_psr_cmt_type type, uint64_t *monitor_data,
> +                        uint64_t *tsc)
>  {
>      xc_resource_op_t op;
> -    xc_resource_entry_t entries[2];
> -    uint32_t evtid;
> +    xc_resource_entry_t entries[3];
> +    uint32_t evtid, nr_entries;
>      int rc;
>  
>      switch ( type )
> @@ -168,6 +171,12 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t 
> rmid, uint32_t cpu,
>      case XC_PSR_CMT_L3_OCCUPANCY:
>          evtid = EVTID_L3_OCCUPANCY;
>          break;
> +    case XC_PSR_CMT_TOTAL_MEM_BANDWIDTH:
> +        evtid = EVTID_TOTAL_MEM_BANDWIDTH;
> +        break;
> +    case XC_PSR_CMT_LOCAL_MEM_BANDWIDTH:
> +        evtid = EVTID_LOCAL_MEM_BANDWIDTH;
> +        break;
>      default:
>          return -1;
>      }
> @@ -182,19 +191,35 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t 
> rmid, uint32_t cpu,
>      entries[1].val = 0;
>      entries[1].rsvd = 0;
>  
> +    if ( type == XC_PSR_CMT_L3_OCCUPANCY )

Please use a switch.

Is there no reason to want the TSC for L3_OCCUPANCY? Perhaps a simpler
interface would be to always request the TSC if the tsc argument is
non-null?

> +        nr_entries = 2;
> +    else
> +    {
> +        entries[2].u.cmd = XEN_RESOURCE_OP_MSR_READ;
> +        entries[2].idx = MSR_IA32_TSC;
> +        entries[2].val = 0;
> +        entries[2].rsvd = 0;
> +
> +        nr_entries = 3;
> +    }
> +
>      op.cpu = cpu;
> -    op.nr_entries = 2;
> +    op.nr_entries = nr_entries;
>      op.entries = entries;
>  
>      rc = xc_resource_op(xch, 1, &op);
>      if ( rc < 0 )
>          return rc;
>  
> -    if ( op.result !=2 || entries[1].val & IA32_CMT_CTR_ERROR_MASK )
> +    if ( op.result != nr_entries || entries[1].val & IA32_CMT_CTR_ERROR_MASK 
> )
>          return -1;
>  
>      *monitor_data = entries[1].val;
>  
> +    if ( type == XC_PSR_CMT_TOTAL_MEM_BANDWIDTH ||
> +         type == XC_PSR_CMT_LOCAL_MEM_BANDWIDTH )
> +        *tsc = entries[2].val;

Are there going to be many more of these hardcoded array indexes as new
features are added?

Perhaps it should be 
        entries[nr].foo = whatever;
        entries[nr].bar = another;
        nr++

And for the TSC one, tsc_entry = &entries[nr], with tsc_entry being NULL
otherwise so you can check it.



> +
>      return 0;
>  }
>  
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 596d2a0..347ef52 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1462,6 +1462,14 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
>                                        uint32_t domid,
>                                        uint32_t socketid,
>                                        uint32_t *l3_cache_occupancy);
> +int libxl_psr_cmt_get_total_mem_bandwidth(libxl_ctx *ctx,
> +                                          uint32_t domid,
> +                                          uint32_t socketid,
> +                                          uint32_t *bandwidth);
> +int libxl_psr_cmt_get_local_mem_bandwidth(libxl_ctx *ctx,
> +                                          uint32_t domid,
> +                                          uint32_t socketid,
> +                                          uint32_t *bandwidth);

Needs #define LIBXL_HAVE_FOO (one for both sets of bandwidth should be
OK).

What are the units of bandwidth? Is a 32-bit number sufficient to cover
future improvements to the hardware?
>  
> +static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc,
> +                                            uint32_t domid,
> +                                            xc_psr_cmt_type type,
> +                                            uint32_t socketid,
> +                                            uint32_t *bandwidth)
> +{
> +    uint64_t sample1, sample2;
> +    uint64_t tsc1, tsc2;
> +    uint32_t upscaling_factor;
> +    int retry_attempts = 0;
> +    int rc;
> +
> +    while (1) {
> +        rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid, type, socketid,
> +                                                   &sample1, &tsc1);
> +        if (rc < 0) {
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +
> +        usleep(10000);

I'm afraid that sleeping in libxl in this way isn't acceptable,
especially for such a long duration.

Any function which does this certainly meets the definition of "slow" in
libxl_internal.h and therefore really ought to be using the ao
machinery.

If that isn't possible/desirable for some reason then it needs to be
made very clear to the caller, through at a minimum some very extensive
comments in the interface header what the constraints and implications
of using this function are.

Another alternative would be to expose the instantaneous values to the
libxl user and let it sleep as it wishes and calculate the bandwidth
itself.

> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 1214d2e..46d160e 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -694,4 +694,6 @@ libxl_event = Struct("event",[
>  
>  libxl_psr_cmt_type = Enumeration("psr_cmt_type", [
>      (1, "CACHE_OCCUPANCY"),
> +    (2, "TOTAL_MEM_BANDWIDTH"),
> +    (3, "LOCAL_MEM_BANDWIDTH"),

I'm really starting to wonder if wedging all of these logically
unrelated statistics into an umbrella psr interface at the libxl level
makes sense.

But now I notice that this type isn't even used by libxl, it seems to be
xl internal. If so then please can you arrange to move it to a new
xl_types.idl and add these new ones there.

I think we can get away with deprecating the libxl variant pretty
aggressively since it's not currently all that useful.

>      ])
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 1827c63..6d5c7af 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -7829,6 +7829,16 @@ static void psr_cmt_print_domain_l3_info(libxl_dominfo 
> *dominfo,

I wondered this during the refactoring in an earlier patch, but why is
"l3 info" a suitable name for this function, seems that only one of the
three things is related to l3 caches.

Does it really make sense to wedge these things (cache use and
bandwidth) into a common function?

What are the upcoming future PSR events and how will they fit in? (this
is less critical at the xl level, but it feeds into my thinking about
libxl interfaces too)

>                                                     socketid, &data))
>                  printf("%13u KB", data);
>              break;
> +        case LIBXL_PSR_CMT_TYPE_TOTAL_MEM_BANDWIDTH:
> +            if (!libxl_psr_cmt_get_total_mem_bandwidth(ctx, dominfo->domid,
> +                                                       socketid, &data))
> +                printf("%11u KB/s", data);
> +            break;
> +        case LIBXL_PSR_CMT_TYPE_LOCAL_MEM_BANDWIDTH:
> +            if (!libxl_psr_cmt_get_local_mem_bandwidth(ctx, dominfo->domid,
> +                                                       socketid, &data))
> +                printf("%11u KB/s", data);
> +            break;
>          default:
>              return;
>          }
> @@ -7840,7 +7850,7 @@ static void psr_cmt_print_domain_l3_info(libxl_dominfo 
> *dominfo,
>  static int psr_cmt_show_l3_info(libxl_psr_cmt_type type, uint32_t domid)
>  {
>      uint32_t i, socketid, nr_sockets, total_rmid;
> -    uint32_t l3_cache_size;
> +    uint32_t l3_cache_size, l3_event_mask;
>      libxl_physinfo info;
>      int rc, nr_domains;
>  
> @@ -7849,6 +7859,13 @@ static int psr_cmt_show_l3_info(libxl_psr_cmt_type 
> type, uint32_t domid)
>          return -1;
>      }
>  
> +    rc = libxl_psr_cmt_get_l3_event_mask(ctx, &l3_event_mask);
> +    if (rc < 0 || !(l3_event_mask & (1 << (type - 1)))) {

I'm not mad keen on this. I'd prefer either 3 predicate functions, one
for each mask type or for a single one which takes a LIBXL_PSR_CMT_TYPE
option.

Is this concept of the PSR l3 even mask already exposed to users of
libxl anywhere before this series?

> diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
> index 4b30d3d..2d8f272 100644
> --- a/tools/libxl/xl_cmdtable.c
> +++ b/tools/libxl/xl_cmdtable.c
> @@ -538,7 +538,9 @@ struct cmd_spec cmd_table[] = {
>        "Show Cache Monitoring Technology information",
>        "<PSR-CMT-Type> <Domain>",
>        "Available monitor types:\n"
> -      "\"cache_occupancy\":         Show L3 cache occupancy\n",
> +      "\"cache_occupancy\":         Show L3 cache occupancy\n"
> +      "\"total_mem_bandwidth\":     Show total memory bandwidth\n"
> +      "\"local_mem_bandwidth\":     Show local memory bandwidth\n",
>      },
>  #endif
>  };



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.