|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 1/2] tools/xg: Streamline cpu policy serialise/deserialise calls
On 31/05/2024 00:59, Andrew Cooper wrote:
> On 29/05/2024 3:30 pm, Alejandro Vallejo wrote:
>> diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
>> index e01f494b772a..85d56f26537b 100644
>> --- a/tools/include/xenguest.h
>> +++ b/tools/include/xenguest.h
>> @@ -799,15 +799,23 @@ int xc_cpu_policy_set_domain(xc_interface *xch,
>> uint32_t domid,
>> xc_cpu_policy_t *policy);
>>
>> /* Manipulate a policy via architectural representations. */
>> -int xc_cpu_policy_serialise(xc_interface *xch, const xc_cpu_policy_t
>> *policy,
>> - xen_cpuid_leaf_t *leaves, uint32_t *nr_leaves,
>> - xen_msr_entry_t *msrs, uint32_t *nr_msrs);
>> +int xc_cpu_policy_serialise(xc_interface *xch, xc_cpu_policy_t *policy);
>> int xc_cpu_policy_update_cpuid(xc_interface *xch, xc_cpu_policy_t *policy,
>> const xen_cpuid_leaf_t *leaves,
>> uint32_t nr);
>> int xc_cpu_policy_update_msrs(xc_interface *xch, xc_cpu_policy_t *policy,
>> const xen_msr_entry_t *msrs, uint32_t nr);
>>
>> +/*
>> + * Accessors for the serialised forms of the policy. The outputs are
>> pointers
>> + * into the policy object and not fresh allocations, so their lifetimes are
>> tied
>> + * to the policy object itself.
>
> This is far more complicated. See below.
>
>> + */
>> +int xc_cpu_policy_get_leaves(xc_interface *xch, const xc_cpu_policy_t
>> *policy,
>> + const xen_cpuid_leaf_t **leaves, uint32_t *nr);
>> +int xc_cpu_policy_get_msrs(xc_interface *xch, const xc_cpu_policy_t *policy,
>> + const xen_msr_entry_t **msrs, uint32_t *nr);
>> +
>> /* Compatibility calculations. */
>> bool xc_cpu_policy_is_compatible(xc_interface *xch, xc_cpu_policy_t *host,
>> xc_cpu_policy_t *guest);
>> diff --git a/tools/libs/guest/xg_cpuid_x86.c
>> b/tools/libs/guest/xg_cpuid_x86.c
>> index 4453178100ad..6cab5c60bb41 100644
>> --- a/tools/libs/guest/xg_cpuid_x86.c
>> +++ b/tools/libs/guest/xg_cpuid_x86.c
>> @@ -834,14 +834,13 @@ void xc_cpu_policy_destroy(xc_cpu_policy_t *policy)
>> }
>> }
>>
>> -static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy,
>> - unsigned int nr_leaves, unsigned int
>> nr_entries)
>> +static int deserialize_policy(xc_interface *xch, xc_cpu_policy_t *policy)
>> {
>> uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1;
>> int rc;
>>
>> rc = x86_cpuid_copy_from_buffer(&policy->policy, policy->leaves,
>> - nr_leaves, &err_leaf, &err_subleaf);
>> + policy->nr_leaves, &err_leaf,
>> &err_subleaf);
>> if ( rc )
>> {
>> if ( err_leaf != -1 )
>
> Urgh - this is a mess.
That's a fair assessment, yes :)
> (Not your fault, but we really need to think
> twice before continuing.)>
> xc_cpu_policy_serialise() is an exported function and, prior to this
> series, used by external entities to get at the content inside the
> opaque object.>
> deserialize_policy() (Clearly not written by me - Roger?) is a local
> helper. Also it looks wonky in the next patch, although I think that's
> just code movement to avoid a forward declaration?
Correct.
>
> By the end of the series, xc_cpu_policy_serialise() isn't used
> externally, but it's still exported.
Because it's external and I didn't want to break the ABI should it be
used somewhere downstream. Happy to change that though.
>
> But, besides the visibility, there's a second difference...
>
>
>> @@ -851,7 +850,7 @@ static int deserialize_policy(xc_interface *xch,
>> xc_cpu_policy_t *policy,
>> }
>>
>> rc = x86_msr_copy_from_buffer(&policy->policy, policy->msrs,
>> - nr_entries, &err_msr);
>> + policy->nr_msrs, &err_msr);
>> if ( rc )
>> {
>> if ( err_msr != -1 )
>> @@ -878,7 +877,10 @@ int xc_cpu_policy_get_system(xc_interface *xch,
>> unsigned int policy_idx,
>> return rc;
>> }
>>
>> - rc = deserialize_policy(xch, policy, nr_leaves, nr_msrs);
>> + policy->nr_leaves = nr_leaves;
>> + policy->nr_msrs = nr_msrs;
>> +
>> + rc = deserialize_policy(xch, policy);
>
> ... they're asymmetric as to whether the caller or the callee preloads
> policy->nr_*.>
> Both of these need rationalising, one way or another.
>
I'm not sure I follow. Neither the why nor the how.
nr_* assignments are not arbitrary. Deserialising doesn't involve
modifying the number of leaves/MSRs. The cases in which they are set are:
1. New policy is loaded from the hypervisor
* Must be reset to maximum values before hypercall and set to the
outputs of the hypercall after.
2. Policy is serialised
* Number of leaves could increase/decrease
Case (1) must be handled in the hypercall wrappers themselves because
the policy is effectively an output object as a whole then. On case (2)
they must be set by the serialiser itself, as that's the one that knows
the end result.
Could you elaborate in what you mean?
>
> But, there's a related problem.
>
> Previously there was only one canonical form (the deserialised form),
> and anything operating on state was responsible for getting it back to
> being the deserialised form.
That's true. But this tension exists regardless of this patch. Some
parts of the code want to operate on raw data and others on structured
data; and then others on featuresets because 3 forms are better than 2.
The roundtrip through featuresets already breaks apart the idea of a
"canonical" form. Truth is, users of this framework operate on a "I know
what form I'm operating on and what form I must restore it to", if not
by design then by accident.
I don't have a good argument about the fragility of the whole thing
besides it being silly, convoluted and visually noisy to dynamically
allocate a fixed-sized array in order to deserialize things. Plus it's
yet another source of errors when callers have to keep track of their
own dynamic buffers.
>
> Now, there are two forms which are coexist side by side. The buffer
> exposed by get_{cpuid,msr}() is only good until the next operation which
> uses what were (previously) the internal staging buffer(s).
>
> And that makes it a fragile and error prone interface.
3, including the featuresets. And note that dynamically allocating
buffers outside the policy object is very error prone as well. It's a
matter of where we want to have fragility.
Unrelated for what you mean, but I'll put my Rust hat for this and start
preaching; This is the sort of thing Rust's borrow checker can be abused
to avoid lifetime related pitfalls.
Can you think of another way that doesn't involve a copyout?
>> if ( rc )
>> {
>> errno = -rc;
>> @@ -903,7 +905,10 @@ int xc_cpu_policy_get_domain(xc_interface *xch,
>> uint32_t domid,
>> return rc;
>> }
>>
>> - rc = deserialize_policy(xch, policy, nr_leaves, nr_msrs);
>> + policy->nr_leaves = nr_leaves;
>> + policy->nr_msrs = nr_msrs;
>> +
>> + rc = deserialize_policy(xch, policy);
>> if ( rc )
>> {
>> errno = -rc;
>> @@ -917,17 +922,14 @@ int xc_cpu_policy_set_domain(xc_interface *xch,
>> uint32_t domid,
>> xc_cpu_policy_t *policy)
>> {
>> uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1;
>> - unsigned int nr_leaves = ARRAY_SIZE(policy->leaves);
>> - unsigned int nr_msrs = ARRAY_SIZE(policy->msrs);
>> int rc;
>>
>> - rc = xc_cpu_policy_serialise(xch, policy, policy->leaves, &nr_leaves,
>> - policy->msrs, &nr_msrs);
>> + rc = xc_cpu_policy_serialise(xch, policy);
>> if ( rc )
>> return rc;
>>
>> - rc = xc_set_domain_cpu_policy(xch, domid, nr_leaves, policy->leaves,
>> - nr_msrs, policy->msrs,
>> + rc = xc_set_domain_cpu_policy(xch, domid, policy->nr_leaves,
>> policy->leaves,
>> + policy->nr_msrs, policy->msrs,
>> &err_leaf, &err_subleaf, &err_msr);
>> if ( rc )
>> {
>> @@ -942,32 +944,26 @@ int xc_cpu_policy_set_domain(xc_interface *xch,
>> uint32_t domid,
>> return rc;
>> }
>>
>> -int xc_cpu_policy_serialise(xc_interface *xch, const xc_cpu_policy_t *p,
>> - xen_cpuid_leaf_t *leaves, uint32_t *nr_leaves,
>> - xen_msr_entry_t *msrs, uint32_t *nr_msrs)
>> +int xc_cpu_policy_serialise(xc_interface *xch, xc_cpu_policy_t *p)
>> {
>> int rc;
>> + p->nr_leaves = ARRAY_SIZE(p->leaves);
>> + p->nr_msrs = ARRAY_SIZE(p->msrs);
>>
>> - if ( leaves )
>> + rc = x86_cpuid_copy_to_buffer(&p->policy, p->leaves, &p->nr_leaves);
>> + if ( rc )
>> {
>> - rc = x86_cpuid_copy_to_buffer(&p->policy, leaves, nr_leaves);
>> - if ( rc )
>> - {
>> - ERROR("Failed to serialize CPUID policy");
>> - errno = -rc;
>> - return -1;
>> - }
>> + ERROR("Failed to serialize CPUID policy");
>> + errno = -rc;
>> + return -1;
>> }
>>
>> - if ( msrs )
>> + rc = x86_msr_copy_to_buffer(&p->policy, p->msrs, &p->nr_msrs);
>> + if ( rc )
>> {
>> - rc = x86_msr_copy_to_buffer(&p->policy, msrs, nr_msrs);
>> - if ( rc )
>> - {
>> - ERROR("Failed to serialize MSR policy");
>> - errno = -rc;
>> - return -1;
>> - }
>> + ERROR("Failed to serialize MSR policy");
>> + errno = -rc;
>> + return -1;
>> }
>>
>> errno = 0;
>> @@ -1012,6 +1008,42 @@ int xc_cpu_policy_update_msrs(xc_interface *xch,
>> xc_cpu_policy_t *policy,
>> return rc;
>> }
>>
>> +int xc_cpu_policy_get_leaves(xc_interface *xch,
>> + const xc_cpu_policy_t *policy,
>> + const xen_cpuid_leaf_t **leaves,
>> + uint32_t *nr)
>> +{
>> + if ( !policy )
>> + {
>> + ERROR("Failed to fetch CPUID leaves from policy object");
>> + errno = -EINVAL;
>> + return -1;
>> + }
>
> This check isn't useful, and it's making the interface inconsistent.
> There's no case ever where a NULL policy is meaningful, except for the
> very initial failure to allocate, and there it's the return value not an
> input parameter.>
> More importantly however, the error message is misleading as a consequence.
Pretty sure I've been asked before to validate trivial preconditions so
I'm guessing various maintainers have conflicting opinions on how
aggresively to validate?
TL;DR: Sure, happy to rip that out.
>
>> +
>> + *leaves = policy->leaves;
>> + *nr = policy->nr_leaves;
>> +
>> + return 0;
>> +}
>> +
>> +int xc_cpu_policy_get_msrs(xc_interface *xch,
>> + const xc_cpu_policy_t *policy,
>> + const xen_msr_entry_t **msrs,
>> + uint32_t *nr)
>> +{
>> + if ( !policy )
>> + {
>> + ERROR("Failed to fetch MSRs from policy object");
>> + errno = -EINVAL;
>> + return -1;
>> + }
>> +
>> + *msrs = policy->msrs;
>> + *nr = policy->nr_msrs;
>> +
>> + return 0;
>> +}
>> +
>> bool xc_cpu_policy_is_compatible(xc_interface *xch, xc_cpu_policy_t *host,
>> xc_cpu_policy_t *guest)
>> {
>> diff --git a/tools/libs/guest/xg_private.h b/tools/libs/guest/xg_private.h
>> index d73947094f2e..a65dae818f3d 100644
>> --- a/tools/libs/guest/xg_private.h
>> +++ b/tools/libs/guest/xg_private.h
>> @@ -177,6 +177,8 @@ struct xc_cpu_policy {
>> struct cpu_policy policy;
>> xen_cpuid_leaf_t leaves[CPUID_MAX_SERIALISED_LEAVES];
>> xen_msr_entry_t msrs[MSR_MAX_SERIALISED_ENTRIES];
>> + uint32_t nr_leaves;
>> + uint32_t nr_msrs;
>
> These need a comment explaining how they're used, and sadly they have no
> relationship to the lengths of the array. There's a corner case where
> they can end up larger.
The may end up smaller, but they must absolutely never end up larger. If
such a corner case exists, please elaborate because it's a bug I'd like
to have fixed.
>
>> };
>> #endif /* x86 */
>>
>> diff --git a/tools/libs/guest/xg_sr_common_x86.c
>> b/tools/libs/guest/xg_sr_common_x86.c
>> index 563b4f016877..a0d67c3211c6 100644
>> --- a/tools/libs/guest/xg_sr_common_x86.c
>> +++ b/tools/libs/guest/xg_sr_common_x86.c
>> @@ -1,4 +1,5 @@
>> #include "xg_sr_common_x86.h"
>> +#include "xg_sr_stream_format.h"
>
> I'm pretty sure this shouldn't be necessary. Is it?
Indeed. Leftover from previous version
>
>>
>> int write_x86_tsc_info(struct xc_sr_context *ctx)
>> {
>> @@ -45,54 +46,39 @@ int handle_x86_tsc_info(struct xc_sr_context *ctx,
>> struct xc_sr_record *rec)
>> int write_x86_cpu_policy_records(struct xc_sr_context *ctx)
>> {
>> xc_interface *xch = ctx->xch;
>> - struct xc_sr_record cpuid = { .type = REC_TYPE_X86_CPUID_POLICY, };
>> - struct xc_sr_record msrs = { .type = REC_TYPE_X86_MSR_POLICY, };
>> - uint32_t nr_leaves = 0, nr_msrs = 0;
>> - xc_cpu_policy_t *policy = NULL;
>> + xc_cpu_policy_t *policy = xc_cpu_policy_init();
>> int rc;
>>
>> - if ( xc_cpu_policy_get_size(xch, &nr_leaves, &nr_msrs) < 0 )
>> - {
>> - PERROR("Unable to get CPU Policy size");
>> - return -1;
>> - }
>> -
>> - cpuid.data = malloc(nr_leaves * sizeof(xen_cpuid_leaf_t));
>> - msrs.data = malloc(nr_msrs * sizeof(xen_msr_entry_t));
>> - policy = xc_cpu_policy_init();
>> - if ( !cpuid.data || !msrs.data || !policy )
>> - {
>> - ERROR("Cannot allocate memory for CPU Policy");
>> - rc = -1;
>> - goto out;
>> - }
>> -
>> - if ( xc_cpu_policy_get_domain(xch, ctx->domid, policy) )
>> + if ( !policy || xc_cpu_policy_get_domain(xch, ctx->domid, policy) )
>> {
>> PERROR("Unable to get d%d CPU Policy", ctx->domid);
>> rc = -1;
>> goto out;
>> }
>> - if ( xc_cpu_policy_serialise(xch, policy, cpuid.data, &nr_leaves,
>> - msrs.data, &nr_msrs) )
>> - {
>> - PERROR("Unable to serialize d%d CPU Policy", ctx->domid);
>> - rc = -1;
>> - goto out;
>> - }
>
> Wow, the old code here was especially daft.
Can confirm.
>
> We're having Xen serialise the policy, copying (double buffering) into
> the policy object then desensitising. And vs the old copy, we've got
> rid of the re-serialise into yet another buffer.
>
> But we should still be using a plain XEN_DOMCTL_get_cpu_policy here.
> Literally all we want to do is take the array(s) Xen gave us and feed
> them straight into the fd.
>
> deserialising is already a reasonably expensive operation (every
> individual leaf coordinate needs re-range checking), and is only ever
> going to get worse.
I wouldn't go that far. It's definitely on the "don't do it every other
operation", but it's a tiny fraction of the (current) cost of a single
hypercall.
>
> It will probably help to split the changes to
> write_x86_cpu_policy_records() out into a separate patch. It's more
> clear cut and also addresses one of the local vs external issues
> discussed above.
>
>
>>
>> - cpuid.length = nr_leaves * sizeof(xen_cpuid_leaf_t);
>> - if ( cpuid.length )
>> +
>> + if ( policy->nr_leaves )
>> {
>> - rc = write_record(ctx, &cpuid);
>> + struct xc_sr_record record = {
>> + .type = REC_TYPE_X86_CPUID_POLICY,
>> + .data = policy->leaves,
>> + .length = policy->nr_leaves * sizeof(*policy->leaves),
>> + };
>> +
>> + rc = write_record(ctx, &record);
>
> Please keep this name being cpuid. It's more helpful when grepping, and
> it also shrinks the diff.
Ack
>
>> if ( rc )
>> goto out;
>> }
>>
>> - msrs.length = nr_msrs * sizeof(xen_msr_entry_t);
>> - if ( msrs.length )
>> + if ( policy->nr_msrs )
>> {
>> - rc = write_record(ctx, &msrs);
>> + struct xc_sr_record record = {
>> + .type = REC_TYPE_X86_MSR_POLICY,
>> + .data = policy->msrs,
>> + .length = policy->nr_msrs * sizeof(*policy->msrs),
>> + };
>> +
>> + rc = write_record(ctx, &record);
>> if ( rc )
>> goto out;
>> }
>> @@ -100,8 +86,6 @@ int write_x86_cpu_policy_records(struct xc_sr_context
>> *ctx)
>> rc = 0;
>>
>> out:
>> - free(cpuid.data);
>> - free(msrs.data);
>> xc_cpu_policy_destroy(policy);
>>
>> return rc;
>> diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
>> index 4c4593528dfe..488f43378406 100644
>> --- a/tools/misc/xen-cpuid.c
>> +++ b/tools/misc/xen-cpuid.c
>> @@ -156,12 +156,18 @@ static void dump_info(xc_interface *xch, bool detail)
>>
>> free(fs);
>> }
>> -
>
> Stray (deleted) whitespace.
Ack
>
>> -static void print_policy(const char *name,
>> - xen_cpuid_leaf_t *leaves, uint32_t nr_leaves,
>> - xen_msr_entry_t *msrs, uint32_t nr_msrs)
>> +static void print_policy(xc_interface *xch, const char *name,
>> + const xc_cpu_policy_t *policy)
>> {
>> unsigned int l;
>> + const xen_cpuid_leaf_t *leaves;
>> + const xen_msr_entry_t *msrs;
>> + uint32_t nr_leaves, nr_msrs;
>> +
>> + if ( xc_cpu_policy_get_leaves(xch, policy, &leaves, &nr_leaves) )
>> + err(1, "xc_cpu_policy_get_leaves()");
>> + if ( xc_cpu_policy_get_msrs(xch, policy, &msrs, &nr_msrs) )
>> + err(1, "xc_cpu_policy_get_msrs()");
>
> Not an issue with here per say, but to drive home the main problem.
>
> This doesn't return the current leaves/msrs. It gives you whatever's
> stale in the staging buffer, which happens to be ok in xen-cpuid because
> it only ever reads a policy...
>
> ~Andrew
It's not stale if the deserialized policy is not dirty. The alternative
is a copyout (what happened before) which was way more brittle and
involved more hypercalls.
As I mentioned before, I you can think of a better scheme, I'm happy to
consider it.
Cheers,
Alejandro
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |