
Re: [Xen-devel] [PATCH v2 08/10] tools/libxc: Rework xc_cpuid_apply_policy() to use {get, set}_cpu_policy()



On 13.09.2019 21:27, Andrew Cooper wrote:
> @@ -1054,3 +446,191 @@ int xc_cpuid_set(
>  
>      return rc;
>  }
> +
> +int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
> +                          const uint32_t *featureset, unsigned int nr_features)
> +{
> +    int rc;
> +    xc_dominfo_t di;
> +    unsigned int i, nr_leaves, nr_msrs;
> +    xen_cpuid_leaf_t *leaves = NULL;
> +    struct cpuid_policy *p = NULL;
> +    uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1;
> +
> +    if ( xc_domain_getinfo(xch, domid, 1, &di) != 1 ||
> +         di.domid != domid )
> +    {
> +        ERROR("Failed to obtain d%d info", domid);
> +        rc = -ESRCH;
> +        goto out;
> +    }
> +
> +    rc = xc_get_cpu_policy_size(xch, &nr_leaves, &nr_msrs);
> +    if ( rc )
> +    {
> +        PERROR("Failed to obtain policy info size");
> +        rc = -errno;
> +        goto out;
> +    }
> +
> +    rc = -ENOMEM;
> +    if ( (leaves = calloc(nr_leaves, sizeof(*leaves))) == NULL ||
> +         (p = calloc(1, sizeof(*p))) == NULL )
> +        goto out;
> +
> +    nr_msrs = 0;
> +    rc = xc_get_domain_cpu_policy(xch, domid, &nr_leaves, leaves,
> +                                  &nr_msrs, NULL);
> +    if ( rc )
> +    {
> +        PERROR("Failed to obtain d%d's policy", domid);
> +        rc = -errno;
> +        goto out;
> +    }
> +
> +    rc = x86_cpuid_copy_from_buffer(p, leaves, nr_leaves,
> +                                    &err_leaf, &err_subleaf);
> +    if ( rc )
> +    {
> +        ERROR("Failed to deserialise CPUID (err leaf %#x, subleaf %#x) (%d = 
> %s)",
> +              err_leaf, err_subleaf, -rc, strerror(-rc));
> +        goto out;
> +    }
> +
> +    if ( featureset )
> +    {
> +        uint32_t disabled_features[FEATURESET_NR_ENTRIES],
> +            feat[FEATURESET_NR_ENTRIES] = {};
> +        static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
> +        unsigned int i, b;
> +
> +        /*
> +         * The user supplied featureset may be shorter or longer than
> +         * FEATURESET_NR_ENTRIES.  Shorter is fine, and we will zero-extend.
> +         * Longer is fine, so long as it is only padded with zeros.
> +         */
> +        unsigned int user_len = min(FEATURESET_NR_ENTRIES + 0u, nr_features);
> +
> +        /* Check for truncated set bits. */
> +        rc = -EOPNOTSUPP;
> +        for ( i = user_len; i < nr_features; ++i )
> +            if ( featureset[i] != 0 )
> +                goto out;
> +
> +        memcpy(feat, featureset, sizeof(*featureset) * user_len);
> +
> +        /* Disable deep dependencies of disabled features. */
> +        for ( i = 0; i < ARRAY_SIZE(disabled_features); ++i )
> +            disabled_features[i] = ~feat[i] & deep_features[i];
> +
> +        for ( b = 0; b < sizeof(disabled_features) * CHAR_BIT; ++b )
> +        {
> +            const uint32_t *dfs;
> +
> +            if ( !test_bit(b, disabled_features) ||
> +                 !(dfs = x86_cpuid_lookup_deep_deps(b)) )
> +                continue;
> +
> +            for ( i = 0; i < ARRAY_SIZE(disabled_features); ++i )
> +            {
> +                feat[i] &= ~dfs[i];
> +                disabled_features[i] &= ~dfs[i];
> +            }
> +        }
> +
> +        cpuid_featureset_to_policy(feat, p);
> +    }
> +
> +    if ( !di.hvm )
> +    {
> +        uint32_t host_featureset[FEATURESET_NR_ENTRIES] = {};
> +        uint32_t len = ARRAY_SIZE(host_featureset);
> +
> +        rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_host,
> +                                   &len, host_featureset);
> +        if ( rc )
> +        {
> +            /* Tolerate "buffer too small", as we've got the bits we need. */
> +            if ( errno == ENOBUFS )
> +                rc = 0;
> +            else
> +            {
> +                PERROR("Failed to obtain host featureset");
> +                rc = -errno;
> +                goto out;
> +            }
> +        }
> +
> +        /*
> +         * On hardware without CPUID Faulting, PV guests see real topology.
> +         * As a consequence, they also need to see the host htt/cmp fields.
> +         */
> +        p->basic.htt       = test_bit(X86_FEATURE_HTT, host_featureset);
> +        p->extd.cmp_legacy = test_bit(X86_FEATURE_CMP_LEGACY,
> +                                      host_featureset);
> +    }
> +    else
> +    {
> +        /*
> +         * Topology for HVM guests is entirely controlled by Xen.  For now, we
> +         * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
> +         */
> +        p->basic.htt = true;
> +        p->extd.cmp_legacy = false;
> +
> +        p->basic.lppp *= 2;

So as I've just learned from investigating the multi-vCPU guest boot
issue on Rome, this ...

> +        switch ( p->x86_vendor )
> +        {
> +        case X86_VENDOR_INTEL:
> +            for ( i = 0; (i < ARRAY_SIZE(p->cache.raw) &&
> +                          p->cache.subleaf[i].type); ++i )
> +            {
> +                p->cache.subleaf[i].cores_per_package =
> +                    (p->cache.subleaf[i].cores_per_package << 1) | 1;

..., this, and ...

> +                p->cache.subleaf[i].threads_per_cache = 0;
> +            }
> +            break;
> +
> +        case X86_VENDOR_AMD:
> +        case X86_VENDOR_HYGON:
> +            p->extd.nc = (p->extd.nc << 1) | 1;

... this can overflow, since the destination fields are narrow
bitfields. In the first case in particular, lppp is only 8 bits
wide, so an initial value of 128 doubles to 256 and truncates to
zero. I think it wouldn't be bad if all of these were made
saturating operations in this new code, despite that not being an
exact equivalent of the old code. I haven't yet tried out whether
correcting this in the old code (and hence in a form applicable to
4.12 and older) will be enough to fix the issue, but it is
certainly part of what's needed.
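
As a sketch of what I have in mind (untested, and purely
illustrative: min() is what the function already uses further up,
and the limits merely reflect the respective fields' widths, i.e.
8 bits for lppp and nc, 6 bits for cores_per_package):

    /* Saturate at the 8-bit field maximum instead of wrapping. */
    p->basic.lppp = min(0xffu, p->basic.lppp * 2u);

    switch ( p->x86_vendor )
    {
    case X86_VENDOR_INTEL:
        for ( i = 0; (i < ARRAY_SIZE(p->cache.raw) &&
                      p->cache.subleaf[i].type); ++i )
        {
            /* Saturate at the 6-bit field maximum. */
            p->cache.subleaf[i].cores_per_package =
                min(0x3fu,
                    (p->cache.subleaf[i].cores_per_package << 1) | 1);
            p->cache.subleaf[i].threads_per_cache = 0;
        }
        break;

    case X86_VENDOR_AMD:
    case X86_VENDOR_HYGON:
        /* Again saturate at the 8-bit field maximum. */
        p->extd.nc = min(0xffu, (p->extd.nc << 1) | 1);
        break;
    }

Whether a guest then does anything sensible with the saturated
values is a separate question, but at least the wrap to zero would
be avoided.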

Jan
