[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/5] public/sysctl: Clarify usage of pm_{px,cx}_stat



On Tue, May 13, 2025 at 3:43 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 12.05.2025 16:46, Ross Lagerwall wrote:
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -215,23 +215,51 @@ typedef struct pm_px_val pm_px_val_t;
> >  DEFINE_XEN_GUEST_HANDLE(pm_px_val_t);
> >
> >  struct pm_px_stat {
> > -    uint8_t total;        /* total Px states */
> > -    uint8_t usable;       /* usable Px states */
> > -    uint8_t last;         /* last Px state */
> > -    uint8_t cur;          /* current Px state */
> > -    XEN_GUEST_HANDLE_64(uint64) trans_pt;   /* Px transition table */
> > +    /*
> > +     * IN: Number of elements in pt, number of rows/columns in trans_pt
> > +     *     (PMSTAT_get_pxstat)
> > +     * OUT: total Px states (PMSTAT_get_max_px, PMSTAT_get_pxstat)
> > +     */
> > +    uint8_t total;
>
> The part for this field ought to go in patch 1, as already indicated there.
>
> > +    uint8_t usable;       /* OUT: usable Px states (PMSTAT_get_pxstat) */
> > +    uint8_t last;         /* OUT: last Px state (PMSTAT_get_pxstat) */
> > +    uint8_t cur;          /* OUT: current Px state (PMSTAT_get_pxstat) */
> > +    /*
> > +     * OUT: Px transition table. This should have total * total elements.
> > +     *      This will not be copied if it is smaller than the hypervisor's
> > +     *      Px transition table. (PMSTAT_get_pxstat)
> > +     */
> > +    XEN_GUEST_HANDLE_64(uint64) trans_pt;
> > +    /* OUT: This should have total elements (PMSTAT_get_pxstat) */
> >      XEN_GUEST_HANDLE_64(pm_px_val_t) pt;
>
> As also indicated there, the same constraint as for trans_pt applies to this
> output buffer, just that it's having only one dimension.
>
> >  };
> >
> >  struct pm_cx_stat {
> > -    uint32_t nr;    /* entry nr in triggers & residencies, including C0 */
> > -    uint32_t last;  /* last Cx state */
> > -    uint64_aligned_t idle_time;                 /* idle time from boot */
> > -    XEN_GUEST_HANDLE_64(uint64) triggers;    /* Cx trigger counts */
> > -    XEN_GUEST_HANDLE_64(uint64) residencies; /* Cx residencies */
> > -    uint32_t nr_pc;                          /* entry nr in pc[] */
> > -    uint32_t nr_cc;                          /* entry nr in cc[] */
> >      /*
> > +     * IN:  Number of elements in triggers, residencies (PMSTAT_get_cxstat)
> > +     * OUT: entry nr in triggers & residencies, including C0
> > +     *      (PMSTAT_get_cxstat, PMSTAT_get_max_cx)
> > +     */
> > +    uint32_t nr;
> > +    uint32_t last;  /* OUT: last Cx state (PMSTAT_get_cxstat) */
> > +    /* OUT: idle time from boot (PMSTAT_get_cxstat)*/
> > +    uint64_aligned_t idle_time;
> > +    /* OUT: Cx trigger counts, nr elements (PMSTAT_get_cxstat) */
> > +    XEN_GUEST_HANDLE_64(uint64) triggers;
> > +    /* OUT: Cx residencies, nr elements (PMSTAT_get_cxstat) */
> > +    XEN_GUEST_HANDLE_64(uint64) residencies;
> > +    /*
> > +     * IN: entry nr in pc[] (PMSTAT_get_cxstat)
> > +     * OUT: Index of highest non-zero entry set in pc[] (PMSTAT_get_cxstat)
> > +     */
> > +    uint32_t nr_pc;
> > +    /*
> > +     * IN: entry nr in cc[] (PMSTAT_get_cxstat)
> > +     * OUT: Index of highest non-zero entry set in cc[] (PMSTAT_get_cxstat)
> > +     */
>
> For both of these, it's not "highest non-zero" but, according to ...
>
> > +    uint32_t nr_cc;
> > +    /*
> > +     * OUT: (PMSTAT_get_cxstat)
> >       * These two arrays may (and generally will) have unused slots; slots 
> > not
> >       * having a corresponding hardware register will not be written by the
> >       * hypervisor. It is therefore up to the caller to put a suitable 
> > sentinel
>
> ... this comment, "highest written by hypervisor". They're also not "index 
> of",
> but "one higher than the index of" (i.e. counts, not indexes).
>

Looking at this again, I don't think that matches what Xen does (nor
does my previous attempt). The code in question:

#define PUT_xC(what, n) do { \
        if ( stat->nr_##what >= n && \
             copy_to_guest_offset(stat->what, n - 1, &hw_res.what##n, 1) ) \
            return -EFAULT; \
        if ( hw_res.what##n ) \
            nr_##what = n; \
    } while ( 0 )

Xen will copy all the hardware registers that it knows about (regardless
of whether the hardware actually has them) and will return in nr_pc /
nr_cc the index + 1 of the highest non-zero entry it _would_ have
written if there is sufficient space.

I could describe it simply as "OUT: Required size of cc[]" ?

Thanks,
Ross



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.