
Re: [Xen-devel] [Bug 198497] handle_mm_fault / xen_pmd_val / radix_tree_lookup_slot Null pointer

On 20/04/18 16:52, Jason Andryuk wrote:
> On Fri, Apr 20, 2018 at 11:42 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>> On 20.04.18 at 17:25, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 20/04/18 16:20, Jason Andryuk wrote:
>>>> Adding xen-devel and the Linux Xen maintainers.
>>>> Summary: Some Xen users (and maybe others) are hitting a BUG in
>>>> __radix_tree_lookup() under do_swap_page() - example backtrace is
>>>> provided at the end.  Matthew Wilcox provided a band-aid patch that
>>>> prints errors like the following instead of triggering the bug.
>>>> Skylake 32bit PAE Dom0:
>>>> Bad swp_entry: 80000000
>>>> mm/swap_state.c:683: bad pte d3a39f1c(8000000400000000)
>>>> Ivy Bridge 32bit PAE Dom0:
>>>> Bad swp_entry: 40000000
>>>> mm/swap_state.c:683: bad pte d3a05f1c(8000000200000000)
>>>> Other 32bit DomU:
>>>> Bad swp_entry: 4000000
>>>> mm/swap_state.c:683: bad pte e2187f30(8000000200000000)
>>>> Other 32bit:
>>>> Bad swp_entry: 2000000
>>>> mm/swap_state.c:683: bad pte ef3a3f38(8000000100000000)
>>>> The Linux bugzilla has more info
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=198497
>>>> This may not be exclusive to Xen Linux, but most of the reports are on
>>>> Xen.  Matthew wonders if Xen might be stepping on the upper bits of a
>>>> pte.
>>> Yes - Xen does use the upper bits of a PTE, but only 1 in release
>>> builds, and a second in debug builds.  I don't understand where you're
>>> getting the 3rd bit in there.
>> The former supposedly is _PAGE_GUEST_KERNEL, which we use for 64-bit
>> guests only. Above talk is of 32-bit guests only.
>> In addition both this and _PAGE_GNTTAB are used on present PTEs only,
>> while above talk is about swap entries.
> This hits a BUG going through do_swap_page, but it seems like users
> don't think they are actually using swap at the time.  One reporter
> didn't have any swap configured.  Some of this information was further
> down in my original message.
> I'm wondering if somehow we have a PTE that should be empty and should
> be lazily filled.  For some reason, the entry has some bits set and is
> causing the trouble.  Would Xen mess with the PTEs in that case?

Any PTE with the present bit clear will be accepted and used
unmodified.  That said, I believe there is some batching of updates for
efficiency reasons in the PVops layer of the kernel, which might end up
causing a disconnect between what the swap system thinks and what the
actual PTEs show when read.
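
For reference, the bits under discussion can be read straight off the
"bad pte" values quoted above. The following is only a minimal sketch
and not code from the kernel or from Matthew's patch: the helper names
set_bits and describe are made up for illustration, and the only layout
fact assumed is that bit 0 is the x86 present bit.

```python
def set_bits(value):
    """Return the positions of the bits set in value, highest first."""
    return [bit for bit in range(63, -1, -1) if (value >> bit) & 1]


# PTE values copied from the "bad pte" lines quoted in this thread.
REPORTED_PTES = [
    0x8000000400000000,  # Skylake 32bit PAE Dom0
    0x8000000200000000,  # Ivy Bridge 32bit PAE Dom0 and "Other 32bit DomU"
    0x8000000100000000,  # "Other 32bit"
]

PRESENT_BIT = 0  # assumption stated above: bit 0 is the x86 present bit


def describe(pte):
    """Return (list of set bit positions, whether the present bit is set)."""
    bits = set_bits(pte)
    return bits, PRESENT_BIT in bits


if __name__ == "__main__":
    for pte in REPORTED_PTES:
        bits, present = describe(pte)
        print(f"{pte:#018x}: bits set = {bits}, present = {present}")
```

Each reported value has exactly two bits set, bit 63 plus one of bits
32 to 34, and the present bit is clear in all of them, which is
consistent with these entries being treated as non-present.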

