[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] x86/livepatch: Fix livepatch application when CET is active


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 17 Apr 2023 15:41:48 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=q8EAtqB0DW9e9+Dy70SloebbvPxYikdcuoygh95MPUI=; b=bFyp+lvh1j0z4BnTkz+CCmbNhw65vrHw50oDL4992EFlZCN4k3yt5rdPpipHDNMWVxzJ01zdUCf/gUHZ3lho/xJIm7MptQ+MqAWwqhwL/jberPA01po03SsFbK1EUl2F4Pqzgj535yqgNDfP/qS2Orfkw/gncdY8Qis8JoZghjIO0VSE5czvBnp6pnExcKob9fhqVJ+/Xa4KQEAj489WTA6SCrTvm67rciYQ3UHLdIwzbkNO3stAzUrTydIFfrj8a4Mr5vijmjrUzTxV9Fg/gPLHz9t4LNCqmMK9r238PZmesnj/4sDVUlB15zC8UXqBjDzOzvmQWmAMh82fzHJiYQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=MT91D2zWamebBEEv2k4CEYzADAy8JVyyfP8Kh0A9FZkRgu4ihAu80fs7rHyUxdL3CiNsfqJdgy73gvQCI0NQ5FSQRHtE/85Y8dN3O6YkDv7rUdCNKn8Lh3k+hGE03KvxfsPRCJ8CF1nLmWQUwbBrg0wdGG8Gh4Dgi9v9WACjvIipYdIFvpyxyqw8sQBPKp8jbgbaC7NaQ2EHMZoXyn4tHO/89AtXqaZGT4V6ES8PFKgtwJOldpjEG6FFcQOCZ/trpJX4wKRz+Fiej3kbqB1J7rfRTS3k1iiMuUuuJaGr4unDKGC3UU603WREbnqlVGbV3aDpU8raKiME+QqbpurRzw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 17 Apr 2023 14:42:28 +0000
  • Ironport-data: A9a23:Y7t1ZKmk3W4Tufy1hoMVgeXo5gxEJ0RdPkR7XQ2eYbSJt1+Wr1Gzt xJLW2iGPv3bMGb9fIgiYdu2/UNQ68XVndVkSAQ/+31hEyMWpZLJC+rCIxarNUt+DCFhoGFPt JxCN4aafKjYaleG+39B55C49SEUOZmgH+a6U6icfHgqH2eIcQ954Tp7gek1n4V0ttawBgKJq LvartbWfVSowFaYCEpNg064gE4p7aWaVA8w5ARkPqgX5QCGzhH5MbpETU2PByqgKmVrNrbSq 9brlNmR4m7f9hExPdKp+p6TnpoiG+O60aCm0xK6aoD66vRwjnVaPpUTbZLwXXx/mTSR9+2d/ f0W3XCGpaXFCYWX8AgVe0Ew/yiTpsSq8pefSZS0mZT7I0Er7xIAahihZa07FdRwxwp5PY1B3 bsTIxsyQkzdvPL1nqCgWvdDmO0pHda+aevzulk4pd3YJdAPZMmbBonvu5pf1jp2gd1SF/HDY cZfcSBocBnLfxxIPBEQFY46m+CrwHL4dlW0qnrM/fZxvzeVkVw3ieCxWDbWUoXiqcF9t0CUv G/ZuU/+BQkXLoe3wjuZ6HO8wOTImEsXXapLTOHlp6873AL7Kmo7LxkVage4rv2DrE+PZe1UN RMGwgcCsv1nnKCsZpynN/Gim1aGtBMBX9tbE8Uh9RqAjKHT5m6xGWwsXjNHLts8u6ceRjssz FaF2czoAT9Ht6ecQnaQsLyTqFuaKSUTaGMPeyIAZQ8E+MX45pE+iArVSdRuG7Lzicf6cQwc2 BiPpSk6wrkW08gC0vzj+Uid2mrw4J/UUgQy+wPbGHq/6R90b5KkYIru7kXH6fFHL8CSSVzpU GU4pvVyJdsmVfml/BFhis1XdF11z55p6AHhvGM=
  • Ironport-hdrordr: A9a23:ELHnRqO1ZBYdvsBcTsijsMiBIKoaSvp037BL7TETdfUxSKalfq +V8cjzuSWZtN9zYhEdcLK7VpVoKEm0nfVICOEqTNKftWLd2VdAQrsM0WOoqweQfxEXnIRmpM VdT5Q=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17/04/2023 2:59 pm, Jan Beulich wrote:
> On 17.04.2023 15:52, Andrew Cooper wrote:
>> Right now, trying to apply a livepatch on any system with CET shstk (AMD Zen3
>> or later, Intel Tiger Lake or Sapphire Rapids and later) fails as follows:
>>
>>   (XEN) livepatch: lp: Verifying enabled expectations for all functions
>>   (XEN) common/livepatch.c:1591: livepatch: lp: timeout is 30000000ns
>>   (XEN) common/livepatch.c:1703: livepatch: lp: CPU28 - IPIing the other 127 
>> CPUs
>>   (XEN) livepatch: lp: Applying 1 functions
>>   (XEN) hi_func: Hi! (called 1 times)
>>   (XEN) Hook executing.
>>   (XEN) Assertion 'local_irq_is_enabled() || cpumask_subset(mask, 
>> cpumask_of(cpu))' failed at arch/x86/smp.c:265
>>   (XEN) *** DOUBLE FAULT ***
>>   <many double faults>
>>
>> The assertion failure is from a global (system wide) TLB flush initiated by
>> modify_xen_mappings().  I'm not entirely sure when this broke, and I'm not
>> sure exactly what causes the #DF's, but it doesn't really matter either
>> because they highlight a latent bug that I'd overlooked with the CET-SS vs
>> patching work the first place.
>>
>> While we're careful to arrange for the patching CPU to avoid encountering
>> non-shstk memory with transient shstk perms, other CPUs can pick these
>> mappings up too if they need to re-walk for uarch reasons.
>>
>> Another bug is that for livepatching, we only disable CET if shadow stacks 
>> are
>> in use.  Running on Intel CET systems when Xen is only using CET-IBT will
>> crash in arch_livepatch_quiesce() when trying to clear CR0.WP with CR4.CET
>> still active.
>>
>> Also, we never went and cleared the dirty bits on .rodata.  This would
>> matter (for the same reason it matters on .text - it becomes a valid target
>> for WRSS), but we never actually patch .rodata anyway.
>>
>> Therefore rework how we do patching for both alternatives and livepatches.
>>
>> Introduce modify_xen_mappings_lite() with a purpose similar to
>> modify_xen_mappings(), but stripped down to the bare minimum as it's used in
>> weird contexts.  Leave all complexity to the caller to handle.
>>
>> Instead of patching by clearing CR0.WP (and having to jump through some
>> fragile hoops to disable CET in order to do this), just transiently relax the
>> permissions on .text via l2_identmap[].
>>
>> Note that neither alternatives nor livepatching edit .rodata, so we don't 
>> need
>> to relax those permissions at this juncture.
>>
>> The perms are relaxed globally, but is safe enough.  Alternatives run before
>> we boot APs, and Livepatching runs in a quiesced state where the other CPUs
>> are not doing anything interesting.
>>
>> This approach is far more robust.
>>
>> Fixes: 48cdc15a424f ("x86/alternatives: Clear CR4.CET when clearing CR0.WP")
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

Thanks.

>
> One further remark, though:
>
>> @@ -5879,6 +5880,73 @@ int destroy_xen_mappings(unsigned long s, unsigned 
>> long e)
>>      return modify_xen_mappings(s, e, _PAGE_NONE);
>>  }
>>  
>> +/*
>> + * Similar to modify_xen_mappings(), but used by the alternatives and
>> + * livepatch in weird contexts.  All synchronization, TLB flushing, etc is 
>> the
>> + * responsibility of the caller, and *MUST* not be introduced here.
>> + *
>> + * Must be limited to XEN_VIRT_{START,END}, i.e. over l2_xenmap[].
>> + * Must be called with present flags, and over present mappings.
>> + * Must be called on leaf page boundaries.
> This last sentence, while wording-wise correct, could do with making more
> explicit that it is the caller's responsibility to know whether large page
> mappings are in use, due to ...

The meaning here is really "this doesn't shatter superpages", and this
was the most concise I could come up with.

Would ", i.e. won't shatter 2M pages." as a clarification work?

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.