Xen project Mailing List

Re: [xen-unstable test] 164996: regressions - FAIL

To: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>

Date: Thu, 23 Sep 2021 11:24:14 +0200

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, dpsmith@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Thu, 23 Sep 2021 09:24:27 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.09.2021 03:10, Stefano Stabellini wrote:
> On Wed, 22 Sep 2021, Jan Beulich wrote:
>> On 22.09.2021 01:38, Stefano Stabellini wrote:
>>> On Mon, 20 Sep 2021, Ian Jackson wrote:
>>>> Jan Beulich writes ("Re: [xen-unstable test] 164996: regressions - FAIL"):
>>>>> As per
>>>>>
>>>>> Sep 15 14:44:55.502598 [ 1613.322585] Mem-Info:
>>>>> Sep 15 14:44:55.502643 [ 1613.324918] active_anon:5639 inactive_anon:15857 isolated_anon:0
>>>>> Sep 15 14:44:55.514480 [ 1613.324918] active_file:13286 inactive_file:11182 isolated_file:0
>>>>> Sep 15 14:44:55.514545 [ 1613.324918] unevictable:0 dirty:30 writeback:0 unstable:0
>>>>> Sep 15 14:44:55.526477 [ 1613.324918] slab_reclaimable:10922 slab_unreclaimable:30234
>>>>> Sep 15 14:44:55.526540 [ 1613.324918] mapped:11277 shmem:10975 pagetables:401 bounce:0
>>>>> Sep 15 14:44:55.538474 [ 1613.324918] free:8364 free_pcp:100 free_cma:1650
>>>>>
>>>>> the system doesn't look to really be out of memory; as per
>>>>>
>>>>> Sep 15 14:44:55.598538 [ 1613.419061] DMA32: 2788*4kB (UMEC) 890*8kB (UMEC) 497*16kB (UMEC) 36*32kB (UMC) 1*64kB (C) 1*128kB (C) 9*256kB (C) 7*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 33456kB
>>>>>
>>>>> there even look to be a number of higher order pages available (albeit
>>>>> without digging I can't tell what "(C)" means). Nevertheless order-4
>>>>> allocations aren't really nice.
>>>>
>>>> The host history suggests this may possibly be related to a qemu update.
>>>>
>>>> http://logs.test-lab.xenproject.org/osstest/results/host/rochester0.html
>>
>> Stefano - as per some of your investigation detailed further down I
>> wonder whether you had seen this part of Ian's reply. (Question of
>> course then is how that qemu update had managed to get pushed.)
>>
>>>> The grub cfg has this:
>>>>
>>>> multiboot /xen placeholder conswitch=x watchdog noreboot async-show-all console=dtuart dom0_mem=512M,max:512M ucode=scan ${xen_rm_opts}
>>>>
>>>> It's not clear to me whether xen_rm_opts is "" or "no-real-mode edd=off".
>>>
>>> I definitely recommend to increase dom0 memory, especially as I guess
>>> the box is going to have a significant amount, far more than 4GB. I
>>> would set it to 2GB. Also the syntax on ARM is simpler, so it should be
>>> just: dom0_mem=2G
>>
>> Ian - I guess that's an adjustment relatively easy to make? I wonder
>> though whether we wouldn't want to address the underlying issue first.
>> Presumably not, because the fix would likely take quite some time to
>> propagate suitably. Yet if not, we will want to have some way of
>> verifying that an eventual fix there would have helped here.
>>
>>> In addition, I also did some investigation just in case there is
>>> actually a bug in the code and it is not a simple OOM problem.
>>
>> I think the actual issue is quite clear; what I'm struggling with is
>> why we weren't hit by it earlier.
>>
>> As imo always, non-order-0 allocations (perhaps excluding the bringing
>> up of the kernel or whichever entity) are to be avoided if at all possible.
>> The offender in this case looks to be privcmd's alloc_empty_pages().
>> For it to request through kcalloc() what ends up being an order-4
>> allocation, the original IOCTL_PRIVCMD_MMAPBATCH must specify a pretty
>> large chunk of guest memory to get mapped. Which may in turn be
>> questionable, but I'm afraid I don't have the time to try to drill
>> down where that request is coming from and whether that also wouldn't
>> better be split up.
>>
>> The solution looks simple enough - convert from kcalloc() to kvcalloc().
>> I can certainly spin up a patch to Linux to this effect. Yet that still
>> won't answer the question of why this issue has popped up all of a
>> sudden (and hence whether there are things wanting changing elsewhere
>> as well).
>
> Also, I saw your patches for Linux. Let's say that the patches are
> reviewed and enqueued immediately to be sent to Linus at the next
> opportunity. It is going to take a while for them to take effect in
> OSSTest, unless we import them somehow in the Linux tree used by OSSTest
> straight away, right?

Yes.

> Should we arrange for one test OSSTest flight now with the patches
> applied to see if they actually fix the issue? Otherwise we might end up
> waiting for nothing...

Not sure how easy it is to do one-off Linux builds then to be used in
hypervisor tests. Ian?

Jan
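
For reference, the arithmetic behind the order-4 remark in the thread can be sketched as follows. This is only an illustration and not code from privcmd or OSSTest. It assumes 4 KiB pages, 8-byte pointers, and that alloc_empty_pages() allocates one pointer per page being mapped; none of these are stated in the thread. The function names are hypothetical, and the real allocator may round sizes differently.

```python
import re

PAGE_SIZE = 4096    # assumption: 4 KiB pages
POINTER_SIZE = 8    # assumption: 64-bit kernel, 8-byte pointers

# The DMA32 free-list line quoted in the thread (timestamp prefix dropped).
DMA32_LINE = ("DMA32: 2788*4kB (UMEC) 890*8kB (UMEC) 497*16kB (UMEC) "
              "36*32kB (UMC) 1*64kB (C) 1*128kB (C) 9*256kB (C) "
              "7*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 33456kB")


def free_list_total_kb(line: str) -> int:
    """Sum every 'count*sizekB' term of a per-zone free-list line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size) for count, size in terms)


def allocation_order(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest order for which (page_size << order) >= nbytes."""
    pages = max(1, -(-nbytes // page_size))  # ceiling division
    return (pages - 1).bit_length()


def pointer_array_order(num_pages: int) -> int:
    """Order of a contiguous array holding one pointer per mapped page."""
    return allocation_order(num_pages * POINTER_SIZE)


if __name__ == "__main__":
    print("free list total (kB):", free_list_total_kb(DMA32_LINE))
    for n in (4096, 4097, 8192):
        mib = n * PAGE_SIZE / (1024 * 1024)
        print(f"{n} pages (~{mib:.0f} MiB mapped): order",
              pointer_array_order(n))
```

Under these assumptions, a pointer array reaches order 4 once a single request covers more than 4096 pages, that is, more than 16 MiB of guest memory. This matches the remark that the ioctl "must specify a pretty large chunk of guest memory". An allocation through kvcalloc() can fall back to virtually contiguous memory, so it does not depend on a free order-4 block.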
