
Re: [PATCH v4 14/21] x86: introduce helper for recording degree of contiguity in page tables


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 20 May 2022 12:59:55 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 20 May 2022 11:00:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20.05.2022 12:22, Roger Pau Monné wrote:
> On Wed, May 18, 2022 at 12:06:29PM +0200, Jan Beulich wrote:
>> On 06.05.2022 15:25, Roger Pau Monné wrote:
>>> On Mon, Apr 25, 2022 at 10:41:23AM +0200, Jan Beulich wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/x86/include/asm/pt-contig-markers.h
>>>> @@ -0,0 +1,105 @@
>>>> +#ifndef __ASM_X86_PT_CONTIG_MARKERS_H
>>>> +#define __ASM_X86_PT_CONTIG_MARKERS_H
>>>> +
>>>> +/*
>>>> + * Short of having function templates in C, the function defined below is
>>>> + * intended to be used by multiple parties interested in recording the
>>>> + * degree of contiguity in mappings by a single page table.
>>>> + *
>>>> + * Scheme: Every entry records the order of contiguous successive entries,
>>>> + * up to the maximum order covered by that entry (which is the number of
>>>> + * clear low bits in its index, with entry 0 being the exception using
>>>> + * the base-2 logarithm of the number of entries in a single page table).
>>>> + * While a few entries need touching upon update, knowing whether the
>>>> + * table is fully contiguous (and can hence be replaced by a higher level
>>>> + * leaf entry) is then possible by simply looking at entry 0's marker.
>>>> + *
>>>> + * Prereqs:
>>>> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
>>>> + *   contiguous bits (ignored by hardware), before including this file,
>>>> + * - page tables to be passed here need to be initialized with correct
>>>> + *   markers.
>>>
>>> Not sure it's very relevant, but might be worth adding that:
>>>
>>> - Null entries must have the PTE zeroed except for the CONTIG_MASK
>>>   region in order to be considered as inactive.
>>
>> NP, I've added an item along these lines.
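To make the prerequisite concrete, here is a minimal standalone sketch of
marker access and of the "null entry" test. It is not the code from the
patch: the position of CONTIG_MASK (bits 52..55) and the helper names
get_marker(), set_marker() and pte_is_null() are assumptions chosen purely
for illustration. The real CONTIG_MASK is supplied by whoever includes the
header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 hardware-ignored bits at positions 52..55. */
#define CONTIG_SHIFT 52
#define CONTIG_MASK  (UINT64_C(0xf) << CONTIG_SHIFT)

/* Extract the contiguity marker stored in the ignored bits of a PTE. */
static unsigned int get_marker(uint64_t pte)
{
    return (unsigned int)((pte & CONTIG_MASK) >> CONTIG_SHIFT);
}

/* Return the PTE with its marker replaced; all other bits are preserved. */
static uint64_t set_marker(uint64_t pte, unsigned int marker)
{
    assert(marker <= (CONTIG_MASK >> CONTIG_SHIFT));
    return (pte & ~CONTIG_MASK) | ((uint64_t)marker << CONTIG_SHIFT);
}

/* An entry is inactive only if everything outside CONTIG_MASK is zero. */
static bool pte_is_null(uint64_t pte)
{
    return !(pte & ~CONTIG_MASK);
}
```

With these definitions an otherwise empty entry carrying marker 9 still
counts as null, while setting any bit outside CONTIG_MASK makes it active.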
>>
>>>> +static bool pt_update_contig_markers(uint64_t *pt, unsigned int idx,
>>>> +                                     unsigned int level, enum PTE_kind kind)
>>>> +{
>>>> +    unsigned int b, i = idx;
>>>> +    unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
>>>> +
>>>> +    ASSERT(idx < CONTIG_NR);
>>>> +    ASSERT(!(pt[idx] & CONTIG_MASK));
>>>> +
>>>> +    /* Step 1: Reduce markers in lower numbered entries. */
>>>> +    while ( i )
>>>> +    {
>>>> +        b = find_first_set_bit(i);
>>>> +        i &= ~(1U << b);
>>>> +        if ( GET_MARKER(pt[i]) > b )
>>>> +            SET_MARKER(pt[i], b);
>>>
>>> Can't you exit early when you find an entry that already has the
>>> to-be-set contiguous marker <= b, as lower numbered entries will then
>>> also be <= b'?
>>>
>>> Ie:
>>>
>>> if ( GET_MARKER(pt[i]) <= b )
>>>     break;
>>> else
>>>     SET_MARKER(pt[i], b);
>>
>> Almost - I think it would need to be 
>>
>>         if ( GET_MARKER(pt[i]) < b )
>>             break;
>>         if ( GET_MARKER(pt[i]) > b )
>>             SET_MARKER(pt[i], b);
> 
> I guess I'm slightly confused, but if marker at i is <= b, then all
> following markers will also be <=, and hence could be skipped?

Your use of "following" is ambiguous here, because the iteration
moves downwards as far as PTEs inspected are concerned (and it's
b which grows from one iteration to the next). But yes, I think I
agree now that ...

> Not sure why we need to keep iterating if GET_MARKER(pt[i]) == b.

... this isn't needed. At which point ...

> FWIW, you could even do:
> 
> if ( GET_MARKER(pt[i]) <= b )
>     break;
> SET_MARKER(pt[i], b);
> 
> Which would keep the conditionals to 1 like it currently is.
> 
>>
>> or, accepting redundant updates, 
>>
>>         if ( GET_MARKER(pt[i]) < b )
>>             break;
>>         SET_MARKER(pt[i], b);
>>
>> . Neither the redundant updates nor the extra (easily mis-predicted)
>> conditional looked very appealing to me, but I guess I could change
>> this if you are convinced that's better than continuing a loop with
>> at most 9 (typically less) iterations.
> 
> Well, I think I at least partly understood the logic.  Not sure
> whether it's worth adding the conditional or just assuming that
> continuing the loop is going to be cheaper.  Might be worth adding a
> comment that we choose to explicitly not add an extra conditional to
> check for early exit, because we assume that to be more expensive than
> just continuing.

... this resolves without further action.
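For reference, the two forms of step 1 discussed above can be compared
with a small standalone model. This is only a sketch under simplifying
assumptions, not the patch itself: markers live in a separate byte array
rather than in PTE bits, only the marker-reducing step is modelled (no
marker increases, no PTE kinds), the table is assumed to have 512 entries,
and lowest_set_bit() is a hypothetical stand-in for find_first_set_bit().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CONTIG_LEVEL_SHIFT 9                      /* assumed: 512 entries */
#define CONTIG_NR          (1u << CONTIG_LEVEL_SHIFT)

/* Stand-in for find_first_set_bit(); v must be non-zero. */
static unsigned int lowest_set_bit(unsigned int v)
{
    unsigned int b = 0;

    while ( !(v & 1) )
    {
        v >>= 1;
        ++b;
    }
    return b;
}

/* Markers of a table in which no entry has been touched yet. */
static void init_markers(unsigned char *m)
{
    unsigned int i;

    m[0] = CONTIG_LEVEL_SHIFT;
    for ( i = 1; i < CONTIG_NR; ++i )
        m[i] = lowest_set_bit(i);
}

/* Step 1 as posted: always walk down to entry 0. Returns iterations. */
static unsigned int reduce_full(unsigned char *m, unsigned int idx)
{
    unsigned int i = idx, n = 0;

    assert(idx < CONTIG_NR);
    while ( i )
    {
        unsigned int b = lowest_set_bit(i);

        i &= ~(1u << b);
        if ( m[i] > b )
            m[i] = b;
        ++n;
    }
    return n;
}

/* Step 1 with the suggested early exit. Returns iterations. */
static unsigned int reduce_early_exit(unsigned char *m, unsigned int idx)
{
    unsigned int i = idx, n = 0;

    assert(idx < CONTIG_NR);
    while ( i )
    {
        unsigned int b = lowest_set_bit(i);

        i &= ~(1u << b);
        ++n;
        if ( m[i] <= b )
            break;
        m[i] = b;
    }
    return n;
}

/* Apply the same pseudo-random updates to both variants and compare. */
static bool variants_agree(void)
{
    static unsigned char full[CONTIG_NR], early[CONTIG_NR];
    unsigned int seed = 12345, n;

    init_markers(full);
    init_markers(early);
    for ( n = 0; n < 2000; ++n )
    {
        unsigned int idx, nf, ne;

        seed = seed * 1103515245u + 12345u;       /* simple LCG */
        idx = (seed >> 16) % CONTIG_NR;
        nf = reduce_full(full, idx);
        ne = reduce_early_exit(early, idx);
        if ( ne > nf || memcmp(full, early, sizeof(full)) )
            return false;
    }
    return true;
}
```

Within this model both variants leave identical markers, and the early
exit never takes more iterations than the full walk. Whether the saved
iterations outweigh the extra conditional is the trade-off discussed
above; the model says nothing about that cost.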

Jan




 

