
Re: [PATCH v5 4/7] x86: introduce x86_seg_sys


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 5 Sep 2024 14:16:54 +0200
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 05 Sep 2024 12:17:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.09.2024 18:54, Andrew Cooper wrote:
> On 04/09/2024 1:29 pm, Jan Beulich wrote:
>> To represent the USER-MSR bitmap access, a new segment type needs
>> introducing, behaving like x86_seg_none in terms of address treatment,
>> but behaving like a system segment for page walk purposes (implicit
>> supervisor-mode access).
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> This feels a little fragile: Of course I did look through uses of the
>> enumerators, and I didn't find further places which would need
>> adjustment, but I'm not really sure I didn't miss any place.
> 
> It does feel a bit fragile, but it may help to consider the other
> related cases.
> 
> Here, we need a linear access with implicit-supervisor paging
> properties.  From what I can tell, it needs to be exactly like other
> implicit supervisor accesses.

Well, not exactly. There's no segment (and hence no segment base)
involved here. Hence, as said in the description, it's a mix of two
things we've got so far.
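
A minimal sketch of that mix, for illustration only. The enumeration and helper names below are hypothetical and do not reproduce the emulator's real definitions; the point is merely that the "sys" pseudo-segment skips base/limit handling like "none" does, while its page walk is never treated as a user-mode one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down enumeration (not the emulator's actual one). */
enum seg_sketch {
    SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS, /* user segments */
    SEG_TR, SEG_LDTR, SEG_GDTR, SEG_IDTR,           /* system segments */
    SEG_SYS,  /* linear address, implicit supervisor access */
    SEG_NONE, /* linear address, access obeys CPL */
};

static bool seg_is_user(enum seg_sketch seg)
{
    return seg <= SEG_GS;
}

/* Neither pseudo-segment has a base or limit to apply. */
static bool seg_applies_base(enum seg_sketch seg)
{
    return seg != SEG_SYS && seg != SEG_NONE;
}

/*
 * Only user segments and SEG_NONE follow CPL; system segments and
 * SEG_SYS are always walked as supervisor accesses.
 */
static bool access_is_user_mode(enum seg_sketch seg, unsigned int cpl)
{
    assert(cpl <= 3);
    return cpl == 3 && (seg_is_user(seg) || seg == SEG_NONE);
}
```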

> For CET, we get two new cases.
> 
> The legacy bitmap has a pointer out of MSR_[U,S]_CET, but otherwise
> obeys CPL rules, so wants to be x86_seg_none.
> 
> However, WRUSS is both a CPL0 instruction, and generates implicit-user
> accesses.  It's the first instruction of its kind that I'm aware of.

With MOVU having got ripped back out of the 386, yes. (Whether to call
such "implicit" user is a separate question.)

> If we're going down this x86_seg_sys route, we'd need x86_seg_user too.

That won't work, as we need to express the real x86_seg_[cdefgs]s
associated with the insn's memory operand. Whereas x86_seg_sys doesn't
need combining with anything.

> Really, this is a consequence of the memory APIs we've got.  It's the
> intermediate layers which generate PFEC_* for the pagewalk, and we're
> (ab)using segment at the top level to encode "skip segmentation but I
> still want certain properties".

Right, for USER-MSR. For WRUSS it's "do segmentation and I want two extra
properties" (just one for WRSS).

> But, there's actually a 3rd case we get from CET, and it breaks everything.
> 
> Shstk accesses are a new type, architecturally expressed as a new input
> (and output) to the pagewalk, but are also regular user-segment relative.

WR{,U}SS are part of that, aren't they?

> We either do the same trick of expressing fetch() in terms of
> read(PFEC_insn) and implement new shstk_{read,write}() accessors which
> wrap {read,write}(PFEC_shstk), or we need to plumb the PFEC parameters
> higher in the call tree.
> 
> It's worth noting that alignment restrictions make things even more
> complicated.  Generally, shstk accesses should be 8 or 4 byte aligned
> (based on osize), and the pseudocode for WR{,U}SS calls this out; after
> all they're converting from arbitrary memory operands.
> 
> However, there's a fun corner case where a 64bit code segment can use
> INCSSPD to misalign SSP, then CALL to generate a misaligned store.  This
> combines with an erratum in Zen3 and possibly Zen4 where there's a
> missing #GP check on LRET and you can forge a return address formed of
> two misaligned addresses.

Well, we certainly don't need to emulate errata, I'd say.

> So misaligned stores are definitely possible (I checked this on both
> vendors at the time), so it wouldn't be appropriate to have in a general
> shstk_*() helper.  In turn, this means that the implementation of
> WR{,U}SS would need a way to linearise its operand manually to insert
> the additional check before then making a regular memory access.

We do such for SSE alignment checking already; see the emulator's
is_aligned(). I don't see why we couldn't re-use that for WR{,U}SS.
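
Roughly, such a check could look like the sketch below. This is not the emulator's is_aligned(); the function names and the flat base-plus-offset linearisation are assumptions made to keep the example self-contained. It only shows the idea of checking the linear address against the operand size (4 or 8 bytes) before making the ordinary memory access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is base + offset a multiple of size? */
static bool linear_is_aligned(uint64_t seg_base, uint64_t offset,
                              unsigned int size)
{
    /* The mask trick below is only valid for powers of two. */
    assert(size && !(size & (size - 1)));
    return !((seg_base + offset) & (size - 1));
}

/*
 * Hypothetical pre-check for a WRSS/WRUSS style store: the operand
 * must be 4 or 8 bytes wide and naturally aligned.  A caller would
 * raise #GP when this returns false.
 */
static bool wrss_operand_ok(uint64_t seg_base, uint64_t offset,
                            unsigned int op_bytes)
{
    return (op_bytes == 4 || op_bytes == 8) &&
           linear_is_aligned(seg_base, offset, op_bytes);
}
```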

> And I can't see a way of doing this without exposing PFEC inputs at the
> top level.

Certainly we'll need a qualifier alongside x86_seg_[cdefgs]s, which of
course could then also be allowed to be combined with x86_seg_none.
Moving PFEC inputs to the top level, while certainly possible, would
involve a lot of churn. Plus I'm also hesitant to further grow the
hooks' numbers of parameters. IOW introducing new shstk_{read,write}()
hooks would look somewhat preferable to me, at least for the moment,
if we don't want to have x86_seg_{shstk,user} flags that can be OR-ed
into the other x86_seg_*.
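
To illustrate the wrapper idea, here is a sketch under stated assumptions. The PF_* bit positions follow the architectural page fault error code layout (W = bit 1, U = bit 2, SS = bit 6); every function and structure name is hypothetical, and the "generic" access is a stub that merely records which PFEC it was asked to walk with.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural page fault error code bits; the names are illustrative. */
#define PF_WRITE (1u << 1)
#define PF_USER  (1u << 2)
#define PF_SHSTK (1u << 6)

/* Stub state standing in for the real page walk machinery. */
struct access_log {
    uint32_t last_pfec;
};

/* Hypothetical generic linear write taking explicit PFEC inputs. */
static int linear_write(struct access_log *log, uint64_t addr,
                        const void *buf, unsigned int bytes, uint32_t pfec)
{
    (void)addr; (void)buf; (void)bytes;
    log->last_pfec = pfec;
    return 0;
}

/* WRSS: one extra property (shadow stack); user-ness follows CPL. */
static uint32_t wrss_pfec(unsigned int cpl)
{
    assert(cpl <= 3);
    return PF_WRITE | PF_SHSTK | (cpl == 3 ? PF_USER : 0);
}

/* WRUSS: two extra properties (shadow stack and forced user access). */
static uint32_t wruss_pfec(void)
{
    return PF_WRITE | PF_SHSTK | PF_USER;
}

/* Hypothetical shstk_write() hook wrapping the generic write. */
static int shstk_write(struct access_log *log, uint64_t addr,
                       const void *buf, unsigned int bytes, uint32_t pfec)
{
    return linear_write(log, addr, buf, bytes, pfec | PF_WRITE | PF_SHSTK);
}
```

The hooks' visible parameter lists stay as they are for ordinary accesses; only the shadow stack paths go through the extra wrapper.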

Jan
