Re: [PATCH] x86/kexec: Fix crash on transition to a 32bit kernel on AMD hardware


  • To: Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 1 Nov 2021 11:10:35 +0000
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 01 Nov 2021 11:11:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/11/2021 10:53, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH] x86/kexec: Fix crash on transition to a 32bit 
> kernel on AMD hardware"):
>> The `ljmp *mem` instruction is (famously?) not binary compatible between
>> Intel and AMD CPUs.  The AMD-compatible version would require .long to be
>> .quad in the second hunk.
>>
>> Switch to using lretq, which is compatible between Intel and AMD, as well as
>> being less logic overall.
>>
>> Fixes: 5a82d5cf352d ("kexec: extend hypercall with improved load/unload ops")
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>> CC: Ian Jackson <iwj@xxxxxxxxxxxxxx>
>>
>> For 4.16.  This is a bugfix for a rare (so rare it has probably never been
>> exercised) but plain-broken use case.
>>
>> One argument against taking it says that this has been broken for 8 years
>> already, so what's a few extra weeks.  Another is that this patch is only
>> compile tested because I don't have a suitable setup to repro, nor the
>> time to try organising one.
> Thanks for being frank about testing.
>
> The bug is a ?race? ?  Which hardly ever happens ?  Or it only affects
> some strange configurations ?  Or ... ?

Strange configuration.

On AMD hardware, if you try to use a 32bit crash kernel, then Xen will
unconditionally crash when trying to transition to it.

Any other scenario (Intel hardware, or a 64bit crash kernel) will work
fine and without incident.
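
To illustrate why the operand layout matters, here is a minimal sketch in
Python (not Xen code).  It only models byte layouts: the same far-pointer
bytes decoded as a 32-bit offset plus selector versus a 64-bit offset plus
selector, and the fixed two-slot stack frame that lretq consumes.  The
selector and target values, and all function names, are hypothetical and
chosen purely for illustration; which decoding a given CPU applies is not
modelled here.

```python
import struct

SELECTOR = 0x10        # hypothetical code-segment selector
TARGET = 0x0010_0000   # hypothetical 32-bit entry point

# Far pointer stored in memory as a 32-bit offset followed by a 16-bit selector.
operand = struct.pack("<IH", TARGET, SELECTOR)


def read_m16_32(buf):
    """Decode as 32-bit offset + 16-bit selector (6 bytes)."""
    offset, selector = struct.unpack_from("<IH", buf)
    return offset, selector


def read_m16_64(buf):
    """Decode as 64-bit offset + 16-bit selector (10 bytes).

    The buffer is zero-padded to stand in for whatever bytes happen to
    follow the operand in memory.
    """
    offset, selector = struct.unpack_from("<QH", buf.ljust(10, b"\0"))
    return offset, selector


def lretq_frame(offset, selector):
    """Stack image consumed by lretq: an 8-byte RIP, then an 8-byte CS slot."""
    return struct.pack("<QQ", offset, selector)


def pop_lretq(frame):
    """Pop RIP and CS the way a 64-bit far return does (low 16 bits of CS)."""
    rip, cs_slot = struct.unpack("<QQ", frame)
    return rip, cs_slot & 0xFFFF
```

Decoding the 6-byte operand with the wider layout folds the selector into
the offset and picks up unrelated bytes as the selector, so the jump lands
somewhere meaningless.  The lretq frame has one fixed shape in 64-bit mode,
which is why that pattern sidesteps the question.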

>> On the other hand, I specifically used the point of binary incompatibility to
>> persuade Intel to drop Call Gates out of the architecture in the forthcoming
>> FRED spec.
> I'm afraid I can't make head or tail of this.  What are the
> implications ?

I managed to get some CPU architects to agree that there was a binary
incompatibility here.

>> The lretq pattern used here matches x86_32_switch() in
>> xen/arch/x86/boot/head.S, and this codepath is executed on every MB2+EFI
>> xen.gz boot, which from XenServer alone is a very wide set of testing.
> AIUI this is an argument saying that the basic principle of this
> change is good.  Good.
>
> However: is there some risk of a non-catastrophic breakage here, for
> example, if there was a slip in the actual implementation ?
> (Catastrophic breakage would break all our tests, I think.)

This path is only taken for a 32bit crash kernel.  It is not taken for
64bit crash kernels; if it were, they wouldn't work on AMD either, and
64bit crash kernels are something we test routinely in XenServer.

The worst that can happen is that I've messed the lretq pattern up, and
broken transition to all 32bit crash kernels, irrespective of hardware
vendor.

It will either function correctly, or explode.  If it is broken, it
won't be subtle, or dependent on the phase of the moon/etc.

~Andrew