Xen project Mailing List

Re: xen-unstable linux-5.14: 1 of 2 multicall(s) failed: cpu 0

From: Juergen Gross <jgross@xxxxxxxx>

Date: Tue, 7 Sep 2021 10:15:58 +0200

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Sander Eikelenboom <linux@xxxxxxxxxxxxxx>

Delivery-date: Tue, 07 Sep 2021 08:16:04 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07.09.21 10:11, Jan Beulich wrote:

On 07.09.2021 09:58, Juergen Gross wrote:

On 06.09.21 23:35, Sander Eikelenboom wrote:

L.S.,

On my AMD box running:
      xen-unstable changeset: Fri Sep 3 15:10:43 2021 +0200 git:2d4978ead4
      linux kernel: 5.14.1

With this setup I'm encountering some issues in dom0, see below.

--
Sander

xl dmesg gives:

(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 63b936 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a0622 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6b63da already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 638dd9 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 68a7bc already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 63c27d already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a04f2 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 690d49 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6959a0 already pinned
(XEN) [2021-09-06 18:15:04.089] mm.c:3506:d0v0 mfn 6a055e already pinned
(XEN) [2021-09-06 18:15:04.090] mm.c:3506:d0v0 mfn 639437 already pinned


dmesg gives:

[34321.304270] ------------[ cut here ]------------
[34321.304277] WARNING: CPU: 0 PID: 23628 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x176/0x1a0
[34321.304288] Modules linked in:
[34321.304291] CPU: 0 PID: 23628 Comm: apt-get Not tainted 5.14.1-20210906-doflr-mac80211debug+ #1
[34321.304294] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[34321.304296] RIP: e030:xen_mc_flush+0x176/0x1a0
[34321.304300] Code: 89 45 18 48 c1 e9 3f 48 89 ce e9 20 ff ff ff e8 60 03 00 00 66 90 5b 5d 41 5c 41 5d c3 48 c7 45 18 ea ff ff ff be 01 00 00 00 <0f> 0b 8b 55 00 48 c7 c7 10 97 aa 82 31 db 49 c7 c5 38 97 aa 82 65
[34321.304303] RSP: e02b:ffffc90000a97c90 EFLAGS: 00010002
[34321.304305] RAX: ffff88807d416398 RBX: ffff88807d416350 RCX: ffff88807d416398
[34321.304306] RDX: 0000000000000001 RSI: 0000000000000001 RDI: deadbeefdeadf00d
[34321.304308] RBP: ffff88807d416300 R08: aaaaaaaaaaaaaaaa R09: ffff888006160cc0
[34321.304309] R10: deadbeefdeadf00d R11: ffffea000026a600 R12: 0000000000000000
[34321.304310] R13: ffff888012f6b000 R14: 0000000012f6b000 R15: 0000000000000001
[34321.304320] FS:  00007f5071177800(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
[34321.304322] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[34321.304323] CR2: 00007f506f542000 CR3: 00000000160cc000 CR4: 0000000000000660
[34321.304326] Call Trace:
[34321.304331]  xen_alloc_pte+0x294/0x320
[34321.304334]  move_pgt_entry+0x165/0x4b0
[34321.304339]  move_page_tables+0x6fa/0x8d0
[34321.304342]  move_vma.isra.44+0x138/0x500
[34321.304345]  __x64_sys_mremap+0x296/0x410
[34321.304348]  do_syscall_64+0x3a/0x80
[34321.304352]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[34321.304355] RIP: 0033:0x7f507196301a


I can see why this failure is occurring, but I'm not sure which way is
the best to fix it.

The problem is that a pinned page table is moved: the pmd entry
referencing it is cleared and a new reference is put into another pmd.
This is done by reading the old pmd entry, clearing that entry, and then
using pmd_populate() to write the new pmd entry. pmd_populate() leads
to a call of xen_alloc_pte(), which tries to pin the referenced page
table. That fails because the page table is already pinned.
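
The sequence described above can be illustrated with a small userspace model. This is only a sketch: every name in it (the model_* structs and functions) is hypothetical and stands in for kernel or hypervisor behaviour; it is not the kernel code. It assumes that the pin request is rejected for an already pinned page and that the new pmd entry is still written afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel or Xen APIs. */
struct model_pt_page {
    bool pinned;                 /* stands in for the hypervisor's pinned state */
};

struct model_pmd {
    struct model_pt_page *pt;    /* NULL models an empty (pmd_none) entry */
};

/* Models the pin hypercall: pinning an already pinned page is rejected. */
static int model_pin(struct model_pt_page *pt)
{
    if (pt->pinned)
        return -1;               /* corresponds to "mfn ... already pinned" */
    pt->pinned = true;
    return 0;
}

/* Models pmd_populate() in a PV guest: it always tries to pin first. */
static int model_pmd_populate(struct model_pmd *pmd, struct model_pt_page *pt)
{
    int rc = model_pin(pt);

    pmd->pt = pt;                /* assumption: the entry is written regardless */
    return rc;
}

/* Models the move: read the old entry, clear it, populate the new one. */
static int model_move_pte_table(struct model_pmd *old_pmd,
                                struct model_pmd *new_pmd)
{
    struct model_pt_page *pt = old_pmd->pt;

    old_pmd->pt = NULL;              /* models pmd_clear(old_pmd) */
    assert(new_pmd->pt == NULL);     /* models VM_BUG_ON(!pmd_none(*new_pmd)) */
    return model_pmd_populate(new_pmd, pt);
}

/* Returns 1 if the first populate pins fine and the move hits the failure. */
static int model_demo_failure(void)
{
    struct model_pt_page pt = { .pinned = false };
    struct model_pmd old_pmd = { NULL }, new_pmd = { NULL };
    int first = model_pmd_populate(&old_pmd, &pt);
    int second = model_move_pte_table(&old_pmd, &new_pmd);

    return first == 0 && second == -1;
}
```

In this model the page table stays pinned across the move, so the second pin attempt is the only step that fails.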

The problem was introduced by commit 0881ace292b662d2 in kernel
5.14.

The following solutions would be possible:

1. When running as a PV guest, skip the optimization of move_pgt_entry()
     by letting arch_supports_page_table_move() return false. This will
     result in a performance drop in some cases.

2. Unpin the page table before calling pmd_populate(). This adds an
     unneeded hypercall, and I feel uneasy doing that without flushing
     the TLB.


I agree as far as the "unneeded hypercall" aspect goes, but I don't
see any connection to the TLB (or a need to flush it): Pinning has
nothing to do with insertion into a live page table; a pinned page
table can be entirely free floating. It's the removal from a
(possibly) live page table which would require a flush.

And this removal is happening:

        /* Clear the pmd */
        pmd = *old_pmd;
        pmd_clear(old_pmd);

        VM_BUG_ON(!pmd_none(*new_pmd));

        pmd_populate(mm, new_pmd, pmd_pgtable(pmd));

So unpinning after calling pmd_clear() seems to be risky.

3. Add a check in xen_alloc_pte() whether the page table is already
     pinned and, if so, skip the pinning. This is a rather clean
     solution, but it will result in other failures if a page table is
     used multiple times (a case that is caught today, as in the failure
     above).

My tendency is towards solution 3 as it is local to Xen code and has the
best performance.
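
A minimal userspace sketch of the idea behind solution 3 follows. It is only a model: the sketch_* names are hypothetical and are not kernel or Xen functions, and the real change would live in the Xen PV MMU code. The sketch assumes the guest can cheaply tell whether a page table page is already pinned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not kernel or Xen APIs. */
struct sketch_pt_page {
    bool pinned;      /* models the pinned state of the page table page */
    int pin_calls;    /* counts modelled pin hypercalls */
};

/* Models the pin hypercall, which rejects an already pinned page. */
static int sketch_pin(struct sketch_pt_page *pt)
{
    pt->pin_calls++;
    if (pt->pinned)
        return -1;
    pt->pinned = true;
    return 0;
}

/* Solution 3: only issue the pin when the page is not pinned yet. */
static int sketch_alloc_pte(struct sketch_pt_page *pt)
{
    if (pt->pinned)
        return 0;     /* already pinned: nothing to do, no hypercall */
    return sketch_pin(pt);
}

/* Returns 1 if allocating and then re-populating issues exactly one pin. */
static int sketch_demo_option3(void)
{
    struct sketch_pt_page pt = { .pinned = false, .pin_calls = 0 };
    int first = sketch_alloc_pte(&pt);    /* initial allocation: pins */
    int second = sketch_alloc_pte(&pt);   /* move via pmd_populate(): skipped */

    assert(pt.pinned);
    return first == 0 && second == 0 && pt.pin_calls == 1;
}
```

The trade-off mentioned above is visible here as well: once the check is in place, a second use of the same page table no longer produces an error, so that case is no longer detected.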


I agree 3 looks most promising. I can't judge how big of a risk
there is for a page table to get used in more than one place, and
hence how important it is to be able to detect that case.

Thanks. I'm going that route then.

Juergen
