Re: [RFC PATCH v2 1/2] xen/memory : Add a stats_table resource type
On 07.10.2022 14:39, Matias Ezequiel Vara Larsen wrote:
> This commit proposes a new mechanism to query the RUNSTATE_running counter
> for a given vcpu from a dom0 userspace application. This commit proposes
> to expose that counter by using the acquire_resource interface. The
> current mechanism relies on XEN_DOMCTL_getvcpuinfo, holds a single global
> domctl_lock for the entire hypercall, and iterates over every vcpu in the
> system for every update, thus impacting operations that share that lock.
>
> This commit proposes to expose vcpu RUNSTATE_running via the
> xenforeignmemory interface, thus avoiding the hypercall and the lock. For
> that purpose, a new resource type named stats_table is added. The first
> frame of this resource stores per-vcpu counters. The frame has one entry
> of type struct vcpu_stats per vcpu. The allocation of this frame only
> happens if the resource is requested. The frame is released after the
> domain is destroyed.
>
> Note that the updating of this counter is in a hot path; thus, in this
> commit, copying only happens if it is specifically required.
>
> Note that the exposed structure is extensible in two ways. First, the
> structure vcpu_stats can be extended with new per-vcpu counters while it
> fits in a frame.

I'm afraid I don't see how this is "extensible". I would recommend that you
outline for yourself what a change would look like to actually add such a
2nd counter. While doing that, keep in mind that whatever changes you make
may not break existing consumers.

It's also not clear what you mean with "fits in a frame": struct
shared_vcpustatspage is a container for an array with a single element. I
may guess (looking at just the public interface) that this really is meant
to be a flexible array (and hence should be marked as such - see other uses
of XEN_FLEX_ARRAY_DIM in the public headers). Yet if that's the case, then
a single page already won't suffice for a domain with sufficiently many
vCPU-s.
> Second, new frames can be added in case new counters are required.

Are you talking about "new counters" here which aren't "new per-vcpu
counters"? Or else what's the difference from the 1st way?

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -741,6 +741,8 @@ void hvm_domain_relinquish_resources(struct domain *d)
>
>      ioreq_server_destroy_all(d);
>
> +    stats_free_vcpu_mfn(d);

How come this lives here? Surely this new feature should be not only
guest-type independent, but also arch-agnostic? Clearly, your putting the
new data in struct domain (and not in struct arch_domain or yet deeper in
the hierarchy) indicates you may have meant to make it so.

> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1078,6 +1078,12 @@ unsigned int ioreq_server_max_frames(const struct domain *d)
>      return nr;
>  }
>
> +unsigned int stats_table_max_frames(const struct domain *d)
> +{
> +    /* One frame per 512 vcpus. */
> +    return 1;
> +}

As alluded to earlier, 1 isn't going to be suitable for arbitrarily sized
domains. (Yes, HVM domains are presently limited to 128 vCPU-s, but as per
above this shouldn't be a HVM-only feature.)

> @@ -1162,6 +1171,88 @@ static int acquire_vmtrace_buf(
>      return nr_frames;
>  }
>
> +void stats_free_vcpu_mfn(struct domain * d)
> +{
> +    struct page_info *pg = d->vcpustats_page.pg;
> +
> +    if ( !pg )
> +        return;
> +
> +    d->vcpustats_page.pg = NULL;
> +
> +    if ( d->vcpustats_page.va )
> +        unmap_domain_page_global(d->vcpustats_page.va);
> +
> +    d->vcpustats_page.va = NULL;

We ought to gain UNMAP_DOMAIN_PAGE_GLOBAL() for purposes like this one,
paralleling UNMAP_DOMAIN_PAGE().

> +    put_page_alloc_ref(pg);
> +    put_page_and_type(pg);
> +}
> +
> +static int stats_vcpu_alloc_mfn(struct domain *d)
> +{
> +    struct page_info *pg;
> +
> +    pg = alloc_domheap_page(d, MEMF_no_refcount);
> +
> +    if ( !pg )
> +        return -ENOMEM;
> +
> +    if ( !get_page_and_type(pg, d, PGT_writable_page) ) {

Style: Brace placement (more elsewhere).
> +        put_page_alloc_ref(pg);

This is not allowed when what you may put is the last reference. See other
examples we have in the tree.

> +        return -ENODATA;
> +    }
> +
> +    d->vcpustats_page.va = __map_domain_page_global(pg);
> +    if ( !d->vcpustats_page.va )
> +        goto fail;
> +
> +    d->vcpustats_page.pg = pg;
> +    clear_page(d->vcpustats_page.va);

I guess this should happen before you globally announce the address.

> +    return 1;

Functions returning -errno on error want to return 0 on success, unless
e.g. multiple success indicators are needed.

> +fail:

Style: Label indentation.

> +    put_page_alloc_ref(pg);
> +    put_page_and_type(pg);
> +
> +    return -ENOMEM;
> +}
> +
> +static int acquire_stats_table(struct domain *d,
> +                                  unsigned int id,
> +                                  unsigned int frame,
> +                                  unsigned int nr_frames,
> +                                  xen_pfn_t mfn_list[])

Style: Indentation.

> +{
> +    mfn_t mfn;
> +    int rc;
> +    unsigned int i;
> +
> +    if ( !d )
> +        return -ENOENT;
> +
> +    for ( i = 0; i < nr_frames; i++ )
> +    {
> +        switch ( i )
> +        {
> +        case XENMEM_resource_stats_frame_vcpustats:

Isn't this supposed to be indexed by "id" (which presently you ignore
altogether, which can't be right)?

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -264,6 +264,8 @@ static inline void vcpu_runstate_change(
>  {
>      s_time_t delta;
>      struct sched_unit *unit = v->sched_unit;
> +    shared_vcpustatspage_t * vcpustats_va;

Style: Stray blank (more elsewhere).

> @@ -287,6 +289,20 @@ static inline void vcpu_runstate_change(
>      }
>
>      v->runstate.state = new_state;
> +
> +    vcpustats_va = (shared_vcpustatspage_t*)d->vcpustats_page.va;

There should be no need for a cast here.

> +    if ( vcpustats_va )
> +    {
> +        vcpustats_va->vcpu_info[v->vcpu_id].version =

Style: Hard tab.
> +            version_update_begin(vcpustats_va->vcpu_info[v->vcpu_id].version);
> +        smp_wmb();
> +        memcpy(&vcpustats_va->vcpu_info[v->vcpu_id].runstate_running_time,
> +               &v->runstate.time[RUNSTATE_running],
> +               sizeof(v->runstate.time[RUNSTATE_running]));

Why memcpy() and not a plain assignment?

> +        smp_wmb();
> +        vcpustats_va->vcpu_info[v->vcpu_id].version =
> +            version_update_end(vcpustats_va->vcpu_info[v->vcpu_id].version);
> +    }

Overall, latching &vcpustats_va->vcpu_info[v->vcpu_id] into a helper
variable would likely help readability quite a bit.

> --- a/xen/include/public/vcpu.h
> +++ b/xen/include/public/vcpu.h
> @@ -235,6 +235,22 @@ struct vcpu_register_time_memory_area {
>  typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
>
> +struct vcpu_stats{

Style: Missing blank.

> +    /* If the least-significant bit of the version number is set then an update

Style: Comment layout.

Jan