[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/3] mini-os: mm: switch need_pgt() to use walk_pt()



Jürgen Groß, le jeu. 01 août 2024 11:25:31 +0200, a ecrit:
> On 01.08.24 11:15, Samuel Thibault wrote:
> > Jürgen Groß, le jeu. 01 août 2024 10:13:07 +0200, a ecrit:
> > > On 01.08.24 09:39, Samuel Thibault wrote:
> > > > Jürgen Groß, le jeu. 01 août 2024 07:56:36 +0200, a ecrit:
> > > > > On 31.07.24 23:27, Samuel Thibault wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > Juergen Gross, le mer. 31 juil. 2024 15:00:25 +0200, a ecrit:
> > > > > > > -pgentry_t *need_pgt(unsigned long va)
> > > > > > > +static int need_pgt_func(unsigned long va, unsigned int lvl, 
> > > > > > > bool is_leaf,
> > > > > > > +                         pgentry_t *pte, void *par)
> > > > > > >     {
> > > > > > [...]
> > > > > > > +    if ( lvl == L1_FRAME || (*pte & _PAGE_PRESENT) )
> > > > > > 
> > > > > > Did you mean (*pte & _PAGE_PSE)?
> > > > > 
> > > > > No.
> > > > 
> > > > I don't understand: it doesn't map what I know of need_pgt and what the
> > > > existing code is doing.
> > > > 
> > > > AIKI, the point of need_pgt is to make sure there is a L1 page table
> > > > entry for a VA, and return it, so the caller can put in it at pte for a
> > > > mfn or such. In the case a PSE is met, we don't go further, and it's up
> > > > to the caller to decide what it wants to do (most often it's actually
> > > > unexpected and asserted out). In both cases, the PRESENT bit of the
> > > > pte whose address is returned does not matter, most often it's the
> > > > caller which will set it.
> > > > 
> > > > The existing code for need_pgt thus always adds page table entries down
> > > > to level1 (except if _PAGE_PSE is met, i.e. a large page was already set
> > > > up): the termination was:
> > > > 
> > > > [... walk down to level 2]
> > > 
> > > You have omitted:
> > > 
> > >         ASSERT(tab[offset] & _PAGE_PRESENT);
> > 
> > Right, that confirms that we never set PSE without PRESENT. We probably
> > want to keep that assertion in the PSE case.
> 
> Hmm, I'm not sure.
> 
> The hardware allows to have any bit set in the PTE when PRESENT isn't set.
> It will never look at any other bit in this case. Why should the software
> require something different here?

To document&check what we expect to be producing.

> > But we still want to continue walking down to level1 (or pse) even when
> > L4/L3/L2 have PRESENT.
> 
> We do that. In this case the is_leaf flag won't be set and need_pgt_func()
> will return early with the return value 0, meaning that the walk will be
> continued.

Ah, I missed that indeed, I had not integrated that is_leaf is 1 also in
the case of missing level>1 page. It'd probably be useful to emphasize
in patch 1:

+ * is_leaf: true if PTE doesn't address another page table

it should explicitly say (L1 PTE, or PSE, or not present yet)

The *pte & _PAGE_PRESENT test alone still looks surprising to the
reader, though: we don't return because the page is present, but because
we reached the last possible level that we could fill, which is either
L1, or an existing PSE. It is true that when is_leaf is 1, if it's not
L1 but present, it's necessarily a PSE, but it's clearer to the reader
to write what we meant: L1 or PSE, so something like:

if ( lvl == L1_FRAME || (*pte & _PAGE_PRESENT) ) {
    assert ( lvl == L1_FRAME || (*pte & _PAGE_PSE) ); // we are at last 
possible level
    [...]
}

Samuel



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.