
Re: [PATCH] x86/mm: re-implement get_page_light() using an atomic increment


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 4 Mar 2024 09:54:47 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 04 Mar 2024 08:54:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.03.2024 09:50, Roger Pau Monné wrote:
> On Mon, Mar 04, 2024 at 08:54:34AM +0100, Jan Beulich wrote:
>> On 01.03.2024 13:42, Roger Pau Monne wrote:
>>> The current usage of a cmpxchg loop to increase the value of page count is not
>>> optimal on amd64, as there's already an instruction to do an atomic add to a
>>> 64bit integer.
>>>
>>> Switch the code in get_page_light() to use an atomic increment, as that avoids
>>> a loop construct.  This slightly changes the order of the checks, as current
>>> code will crash before modifying the page count_info if the conditions are not
>>> correct, while with the proposed change the crash will happen immediately
>>> after having carried out the counter increase.  Since we are crashing anyway, I
>>> don't believe the re-ordering to have any meaningful impact.
>>
>> While I consider this argument fine for ...
>>
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -2580,16 +2580,10 @@ bool get_page(struct page_info *page, const struct domain *domain)
>>>   */
>>>  static void get_page_light(struct page_info *page)
>>>  {
>>> -    unsigned long x, nx, y = page->count_info;
>>> +    unsigned long old_pgc = arch_fetch_and_add(&page->count_info, 1);
>>>  
>>> -    do {
>>> -        x  = y;
>>> -        nx = x + 1;
>>> -        BUG_ON(!(x & PGC_count_mask)); /* Not allocated? */
>>
>> ... this check, I'm afraid ...
>>
>>> -        BUG_ON(!(nx & PGC_count_mask)); /* Overflow? */
>>
>> ... this is a problem unless we discount the possibility of an overflow
>> happening in practice: If an overflow was detected only after the fact,
>> there would be a window in time where privilege escalation was still
>> possible from another CPU. IOW at the very least the description will
>> need extending further. Personally I wouldn't chance it and leave this
>> as a loop.
> 
> So you are worried because this could potentially turn a DoS into an
> information leak during the brief period of time where the page
> counter has overflowed into the PGC state.
> 
> My understanding is the BUG_ON() was a mere protection against bad code
> that could mess with the counter, but that the counter overflowing is
> not a real issue during normal operation.

With the present counter width it should be a merely theoretical concern.
I didn't redo the older calculation with LA57 taken into account, though,
so I'm not sure we aren't moving onto thinner and thinner ice as hardware
(and our support for it) advances. As to "mere protection" - see how the
narrower counter was an active issue on 32-bit Xen, back at the time.

Jan
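
For illustration only, the ordering difference discussed above can be modelled
in plain C11. This is a sketch under stated assumptions, not Xen's code: the
names get_ref_cmpxchg(), get_ref_fetch_add(), COUNT_WIDTH, COUNT_MASK and
FLAG_BIT are hypothetical, the count field is deliberately tiny so that an
overflow is easy to reach, atomic_fetch_add() stands in for the patch's
arch_fetch_and_add(), and the functions return false where the real code
would BUG_ON(), so the state of the shared word can be inspected afterwards.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical layout: the low COUNT_WIDTH bits hold the reference count,
 * the bits above it hold state flags. The width is an assumption chosen to
 * make overflow trivial to trigger; it is not Xen's PGC_count_width.
 */
#define COUNT_WIDTH 4
#define COUNT_MASK  ((1UL << COUNT_WIDTH) - 1)
#define FLAG_BIT    (1UL << COUNT_WIDTH)  /* first bit above the count */

static_assert(COUNT_WIDTH < sizeof(unsigned long) * CHAR_BIT,
              "count field must leave room for flag bits");

/* Check-then-modify: a bad state is rejected before the word changes. */
static bool get_ref_cmpxchg(_Atomic unsigned long *word)
{
    unsigned long x = atomic_load(word);

    do {
        if ( !(x & COUNT_MASK) )        /* Not allocated? */
            return false;
        if ( !((x + 1) & COUNT_MASK) )  /* Would overflow? */
            return false;
        /* On failure x is refreshed with the current value and re-checked. */
    } while ( !atomic_compare_exchange_weak(word, &x, x + 1) );

    return true;
}

/* Modify-then-check: the word has already changed when the check fires. */
static bool get_ref_fetch_add(_Atomic unsigned long *word)
{
    unsigned long old = atomic_fetch_add(word, 1);

    if ( !(old & COUNT_MASK) )          /* Not allocated? */
        return false;
    if ( !((old + 1) & COUNT_MASK) )    /* Overflowed into the flag bits. */
        return false;

    return true;
}
```

Starting from a saturated count (word == COUNT_MASK), get_ref_cmpxchg() fails
and leaves the word untouched, whereas get_ref_fetch_add() fails only after
the carry has cleared the count field and set FLAG_BIT. In this model, that
already-modified word is what another CPU could observe in the interval
between the increment and the crash.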



 

