[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 5/9] mm/huge_memory: mark PMD mappings of the huge zero folio special
On Fri, Jul 18, 2025 at 12:54:38PM +0200, David Hildenbrand wrote: > On 18.07.25 12:41, Lorenzo Stoakes wrote: > > On Thu, Jul 17, 2025 at 10:31:28PM +0200, David Hildenbrand wrote: > > > On 17.07.25 20:29, Lorenzo Stoakes wrote: > > > > On Thu, Jul 17, 2025 at 01:52:08PM +0200, David Hildenbrand wrote: > > > > > The huge zero folio is refcounted (+mapcounted -- is that a word?) > > > > > differently than "normal" folios, similarly (but different) to the > > > > > ordinary > > > > > shared zeropage. > > > > > > > > Yeah, I sort of wonder if we shouldn't just _not_ do any of that with > > > > zero > > > > pages? > > > > > > I wish we could get rid of the weird refcounting of the huge zero folio > > > and > > > get rid of the shrinker. But as long as the shrinker exists, I'm afraid > > > that > > > weird per-process refcounting must stay. > > > > Does this change of yours cause any issue with it? I mean now nothing can > > grab > > this page using vm_normal_page_pmd(), so it won't be able to manipulate > > refcounts. > > Please look again at vm_normal_page_pmd(): we have a manual huge_zero_pfn() > check in there! There is no change in behavior. :) > > It's not obvious from the diff below. But huge zero folio was considered > special before this change, just not marked accordingly. Yeah I think the delta screwed me up here. Previously: struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t pmd) { ... if (unlikely(pmd_special(pmd))) return NULL; ... if (is_huge_zero_pmd(pmd)) return NULL; ... } Now: struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t pmd) { ... if (unlikely(pmd_special(pmd))) { ... if (is_huge_zero_pfn(pfn)) return NULL; ... } return vm_normal_page_pfn(vma, addr, pfn, pmd_val(pmd)); } And: static inline struct page *vm_normal_page_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, unsigned long long entry) { ... if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) { ... if (is_zero_pfn(pfn) || is_huge_zero_pfn(pfn)) return NULL; } ... } So it is equivalent, thanks, satisfied with that now! Sorry for being a pain, just keen to really be confident in this. > > > > > > > > > > > > > > > > > > For this reason, we special-case these pages in > > > > > vm_normal_page*/vm_normal_folio*, and only allow selected callers to > > > > > still use them (e.g., GUP can still take a reference on them). > > > > > > > > > > vm_normal_page_pmd() already filters out the huge zero folio. However, > > > > > so far we are not marking it as special like we do with the ordinary > > > > > shared zeropage. Let's mark it as special, so we can further refactor > > > > > vm_normal_page_pmd() and vm_normal_page(). > > > > > > > > > > While at it, update the doc regarding the shared zero folios. > > > > > > > > Hmm I wonder how this will interact with the static PMD series at [0]? > > > > > > No, it shouldn't. > > > > I'm always nervous about these kinds of things :) > > > > I'm assuming the reference/map counting will still work properly with the > > static > > page? > > Let me stress again: no change in behavior besides setting the special flag > in this patch. Return value of vm_normal_page_pmd() is not changed. OK yeah I think this is the issue here then - I had assumed that it did _somehow_ modify behaviour. See above, based on your responses I'v satisfied myself it's all good thank you! > > > > > > > > > Also, that series was (though I reviewed against it) moving stuff that > > > > references the huge zero folio out of there, but also generally allows > > > > access and mapping of this folio via largest_zero_folio() so not only > > > > via > > > > insert_pmd(). > > > > > > > > So we're going to end up with mappings of this that are not marked > > > > special > > > > that are potentially going to have refcount/mapcount manipulation that > > > > contradict what you're doing here perhaps? > > > > > > I don't think so. It's just like having the existing static (small) shared > > > zeropage where the same rules about refcounting+mapcounting apply. > > > > It feels like all this is a mess... am I missing something that would make > > it > > all make sense? > > Let me clarify: > > The small zeropage is never refcounted+mapcounted when mapped into page > tables. > > The huge zero folio is never refcounted+mapcounted when mapped into page > tables EXCEPT it is refcounted in a weird different when first mapped into a > process. Thanks that's great! > > The whole reason is the shrinker. I don't like it, but there was a reason it > was added. Maybe that reason no longer exists, but that's nothing that this > patch series is concerned with, really. :) I hate that so much but yes out of scope. > > -- > Cheers, > > David / dhildenb > Cheers, Lorenzo
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |