[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.16] Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 25 Nov 2021 10:52:12 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=T2Cpb2ok8IZX8zUMO+dRmkhiQVIl+9Ihb7HHfxqLEc8=; b=K79JcQB9ZEi9YMjKgBPoThqEjVw81tZnsED4nKW8fUOBMfQOnMy02EoWFLSN1eYa7Z8deJjydMDmZ5KEITZgIY8I5S7dNoiQxdmAOk1ysg+nZvkadu2G2j7quxE/HKdStIkk1l3MyRJK9SjYRCj0OfTjKTmSvrqyywMLwYKujBizZOUc/x7MXYofu5khICjjsWlAusSa2DpnhnkEigIodez3H8iGQgImSgzDChywuzW7gnIrpWVjvYxCVti6bqD7eZKyJ9vyRzcRtX83x8exqlQIEGBExyZxnR0NpFvTXEXq1BdBS0wn2Uo0eXc7nTWU718lq2o1eWm0rrPynCE+3g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=kIJYlCaBs6D/mKbO9EyHE83BwY/6cr/jorj4FNCcUkxplrLiVY0kTHZ7i9Ey8Ind0Mk8Cgk+KE40qIJD9JDToIIVXna7JyU0aCCVAmS7m+B/zqT5/aIHzxFToj+vFwx5PgzMV5fnVALHbfgpOORHxXAQvslI6I/UK6QKOnSc2NmCXxrulP5VRCjjkFYNVT4dO+7/d6o4X3xLgHMzRPrk7qMGl7QWFw0Gsq1yTk1tgGawMicIrwLbrMPXUnsh7WXwX+Femp7Sytaz3zk7VolsLzwpNK/EZzx13Qt7jO4NTUq290I3xLbRHqwJou33qog5gPFAaxDc0jDuuVn5Za9hdg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 25 Nov 2021 09:52:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25.11.2021 10:24, Roger Pau Monné wrote:
> On Thu, Nov 25, 2021 at 09:57:31AM +0100, Jan Beulich wrote:
>> On 24.11.2021 22:11, Andrew Cooper wrote:
>>> OSSTest has identified a 3rd regression caused by this change.  Migration
>>> between Xen 4.15 and 4.16 on the nocera pair of machines (AMD Opteron 4133)
>>> fails with:
>>>
>>>   xc: error: Failed to set CPUID policy: leaf 00000000, subleaf ffffffff, 
>>> msr ffffffff (22 = Invalid argument): Internal error
>>>   xc: error: Restore failed (22 = Invalid argument): Internal error
>>>
>>> which is a safety check to prevent resuming the guest when the CPUID data 
>>> has
>>> been truncated.  The problem is caused by shrinking of the max policies, 
>>> which
>>> is an ABI that needs handling compatibly between different versions of Xen.
>>
>> Would you mind pointing me to the flight which has hit this problem? I
>> don't think I've seen any respective failure. I also don't think I've
>> seen any respective discussion on irc, so I really wonder where all
>> this is coming from all of the sudden. It's not like the change in
>> question would have gone in just yesterday.
> 
> It's from a pair of newish boxes that Credativ and Ian where
> attempting to commission yesterday. Since the boxes are not yet in
> production Ian wasn't sure if the issue could be on the testing or
> hardware side, so emailed the details in private for us to provide
> some feedback on the issue. The error is at:
> 
> http://logs.test-lab.xenproject.org/osstest/logs/166296/test-amd64-amd64-migrupgrade/info.html

I see. Quite lucky timing then.

>>> Furthermore, shrinking of the default policies also breaks things in some
>>> cases, because certain cpuid= settings in a VM config file which used to 
>>> have
>>> an effect will now be silently discarded.
>>
>> I'm afraid I don't see what you're talking about here. Could you give
>> an example? Is this about features the flags of which live in the
>> higher leaves, which would have got stripped from the default policies?
>> Which would mean the stripping really should happen on the max policies
>> only (where it may not have much of an effect).
> 
> I think there are two separate issues, which I tried to clarify in my
> reply to this same patch.
> 
> Options set using cpuid= with xl could now be rejected when in
> previous versions they were accepted just fine. That's because the
> shrinking to the policies can now cause find_leaf calls in
> xc_cpuid_xend_policy to fail, and thus the whole operation would
> fail.

Okay, this could be addressed by merely dropping the calls from
calculate_{pv,hvm}_def_policy(). Thinking about it, I can surely
agree they shouldn't have been put there in the first place.
Which would be quite the opposite of your initial proposal, where
you did drop them from calculate_{pv,hvm}_max_policy(). A guest
migrating in with a larger max leaf value should merely have that
max leaf value retained, but that ought to be possible without
dropping the shrinking from calculate_{pv,hvm}_max_policy(). Even
leaving aside migration, I guess an explicit request for a large
max leaf value should be honored; those possibly many trailing
leaves then would simply all be blank.

> There's another likely error that only affects callers of
> xc_cpuid_apply_policy that pass a featureset (so not the upstream
> toolstack), where some leaves of the featureset could now be ignored
> by the guest if the max leaves value doesn't cover them anymore. Note
> this was already an issue even before 540d911c2813, as applying the
> featureset doesn't check that the set feature leaves are below the max
> leaf.

If this was an issue before the commit to be reverted, I take it
the revert isn't going to help it? In which case this information
is interesting, but not applicable as justification for the
revert?

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.