[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/PV32: restore PAE-extended-CR3 logic


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 4 Apr 2023 16:24:16 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NdmOg5mQoRO5Np3j2AZRvRHevCll0fy8DK/TFMocFsA=; b=E+SCMfCT6aC5Q0b+CVSO9gwqnWf2zSL+kNfUO6rAAn+mqt0L/N0tHbOkhcIS2Ys2x9o/bzhxNdF3Yevem2HkD/IVd5EDuUNZk43L8RneFUIIWjFNxoqtAec7vBcl4j9oWkWrXHE4ccrD+zcs8OK1s88hJbu2c0n4ySyZV3z0scqGPVI94h4WUtcUpz8Yl0FazESftpmewUnGWpT6mRZT6nETpTkApnKNqLCFE6SkHXUjCA47yVRK+VfvjXLu5svwVck6x1XB0V+0EPmOtPuYbBq3j2m41AREy8tEgUZ2Fzg0UuEkY0zrtF0o+dW4+c/Gd3gwlXwZ5g2L1Ck7I42MbA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=B7tWl/BQS4a33kq+YilzewF/6rtfg/zxjStKj20pmYDYguIOaZ1iddPmlPKiCoBwNrpC57FxPoSmZttNxk1V4kHVK5yFZ9bDs22G6MofM7n5CoH7E+FdJmUwWDPBzIhUigDwwgspBtt63vpq4jA0OwDvLcEN54tRGp47SZYnj9TTa5jzYn9B+1j3vf2iBwEcSzvUrewMurfWf37QXNQ2ljnvT+HEtPLATDNEllPozf5QtqWlF84PfzgwLjFNt6to+8b9yBQoldsqaJqV3Mbo0zILgRr1aQQUUA50zqqgYItwgxOlEtJftJ3yjoQyHTQb5CJDZZ27i2P7FZU0ymU+qw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 04 Apr 2023 14:24:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.04.2023 13:41, Roger Pau Monné wrote:
> On Tue, Apr 04, 2023 at 12:31:31PM +0200, Jan Beulich wrote:
>> On 04.04.2023 12:12, Roger Pau Monné wrote:
>>> On Wed, Feb 15, 2023 at 03:54:11PM +0100, Jan Beulich wrote:
>>>> While the PAE-extended-CR3 VM assist is a 32-bit only concept, it still
>>>> applies to guests also when run on a 64-bit hypervisor: The "extended
>>>> CR3" format has to be used there as well, to fit the address in the only
>>>> 32-bit wide register there. As a result it was a mistake that the check
>>>> was never enabled for that case, and was then mistakenly deleted in the
>>>> course of removal of 32-bit-Xen code (218adf199e68 ["x86: We can assume
>>>> CONFIG_PAGING_LEVELS==4"]).
>>>>
>>>> Similarly during Dom0 construction kernel awareness needs to be taken
>>>> into account, and respective code was again mistakenly never enabled for
>>>> 32-bit Dom0 when running on 64-bit Xen (and thus wrongly deleted by
>>>> 5d1181a5ea5e ["xen: Remove x86_32 build target"]).
>>>>
>>>> At the same time restrict enabling of the assist for Dom0 to just the
>>>> 32-bit case. Furthermore there's no need for an atomic update there.
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>> ---
>>>> I was uncertain whether to add a check to the CR3 guest read path,
>>>> raising e.g. #GP(0) when the value read wouldn't fit but also may not
>>>> be converted to "extended" format (overflow is possible there in
>>>> principle because of the control tools "slack" in promote_l3_table()).
>>>>
>>>> In that context I was puzzled to find no check on the CR3 guest write
>>>> path even in 4.2: A guest (bogusly) setting the PCD or PWT bits (or any
>>>> of the low reserved ones) could observe anomalous behavior rather than
>>>> plain failure.
>>>>
>>>> As to a Fixes: tag - it's pretty unclear which of the many original
>>>> 32-on-64 changes to blame. I don't think the two cited commits should
>>>> be referenced there, as they didn't break anything that wasn't already
>>>> broken.
>>>>
>>>> --- a/xen/arch/x86/mm.c
>>>> +++ b/xen/arch/x86/mm.c
>>>> @@ -1520,6 +1520,23 @@ static int promote_l3_table(struct page_
>>>>      unsigned int   partial_flags = page->partial_flags;
>>>>      l3_pgentry_t   l3e = l3e_empty();
>>>>  
>>>> +    /*
>>>> +     * PAE pgdirs above 4GB are unacceptable if a 32-bit guest does not
>>>> +     * understand the weird 'extended cr3' format for dealing with 
>>>> high-order
>>>> +     * address bits. We cut some slack for control tools (before vcpu0 is
>>>> +     * initialised).
>>>
>>> Don't we then need some check in the vCPU init path to assure that the
>>> cr3 is < 32bits if we allow those to initially be set?
>>>
>>> Or will the initialization unconditionally overwrite any previous cr3
>>> value?
>>
>> That's not the way I understand this "cut some slack". Instead I read it
>> to be meant to cover for the VM-assist bit not being set, yet. Beyond
>> that it is assumed to be tool stack's responsibility to constrain
>> addresses suitably. If it doesn't, it'll simply break the guest. (There
>> is some guessing on my part involved here, as the original introduction
>> of that code didn't further explain things.)
> 
> If it's just the guest that's broken I would think it's fine.  As long
> as such mismatch doesn't cause issues in the hypervisor internal state.
> 
> Did you see a toolstack setting such entries before pae_extended_cr3
> is set?

To be honest - I didn't look. As said in the longer reply to Andrew, I
think it is more logical this way (the page table root already being
validated as an L3 table when vCPU 0 is inititalized, which includes
setting its CR3). Hence even if right now the order was the other way
around (which I doubt it is), I wouldn't want to make impossible to
restore the original ordering again.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.