[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: blocking vs. non-blocking mmu notifiers



[Let me add more people to the CC list - I am not really sure who is the
 most familiar with all the tricks that mmu notifiers might do]

On Wed 23-03-22 09:43:59, Juergen Gross wrote:
> Hi,
> 
> during analysis of a customer's problem on a 4.12 based kernel
> (deadlock due to a blocking mmu notifier in a Xen driver) I came
> across upstream patches 93065ac753e4 ("mm, oom: distinguish
> blockable mode for mmu notifiers") et al.
> 
> The backtrace of the blocked tasks was typically something like:
> 
>  #0 [ffffc9004222f228] __schedule at ffffffff817223e2
>  #1 [ffffc9004222f2b8] schedule at ffffffff81722a02
>  #2 [ffffc9004222f2c8] schedule_preempt_disabled at ffffffff81722d0a
>  #3 [ffffc9004222f2d0] __mutex_lock at ffffffff81724104
>  #4 [ffffc9004222f360] mn_invl_range_start at ffffffffc01fd398 [xen_gntdev]
>  #5 [ffffc9004222f398] __mmu_notifier_invalidate_page at ffffffff8123375a
>  #6 [ffffc9004222f3c0] try_to_unmap_one at ffffffff812112cb
>  #7 [ffffc9004222f478] rmap_walk_file at ffffffff812105cd
>  #8 [ffffc9004222f4d0] try_to_unmap at ffffffff81212450
>  #9 [ffffc9004222f508] shrink_page_list at ffffffff811e0755
> #10 [ffffc9004222f5c8] shrink_inactive_list at ffffffff811e13cf
> #11 [ffffc9004222f6a8] shrink_node_memcg at ffffffff811e241f
> #12 [ffffc9004222f790] shrink_node at ffffffff811e29c5
> #13 [ffffc9004222f808] do_try_to_free_pages at ffffffff811e2ee1
> #14 [ffffc9004222f868] try_to_free_pages at ffffffff811e3248
> #15 [ffffc9004222f8e8] __alloc_pages_slowpath at ffffffff81262c37
> #16 [ffffc9004222f9f0] __alloc_pages_nodemask at ffffffff8121afc1
> #17 [ffffc9004222fa48] alloc_pages_current at ffffffff8122f350
> #18 [ffffc9004222fa78] __get_free_pages at ffffffff8121685a
> #19 [ffffc9004222fa80] __pollwait at ffffffff8127e795
> #20 [ffffc9004222faa8] evtchn_poll at ffffffffc00e802b [xen_evtchn]
> #21 [ffffc9004222fab8] do_sys_poll at ffffffff8127f953
> #22 [ffffc9004222fec8] sys_ppoll at ffffffff81280478
> #23 [ffffc9004222ff30] do_syscall_64 at ffffffff81004954
> #24 [ffffc9004222ff50] entry_SYSCALL_64_after_hwframe at ffffffff818000b6
> 
> It was found that the notifier of the Xen gntdev driver was using a
> mutex resulting in the deadlock.
> 
> Michal Hocko suggested that backporting above mentioned patch might
> help, as the mmu notifier call is happening in atomic context.
> 
> Looking into that I was wondering whether try_to_unmap_one() shouldn't
> call mmu_notifier_invalidate_range_start_nonblock() instead of
> mmu_notifier_invalidate_range_start() if this is true. Otherwise I
> can't see how this deadlock could be avoided.

Just to be more explicit. The current upstream code calls
mmu_notifier_invalidate_range while the page table locks are held.
Are there any notifiers which could sleep in those? In other words
should we use the nonblock start/stop in try_to_unmap?

> Any thoughts?
> 
> 
> Juergen






-- 
Michal Hocko
SUSE Labs



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.