
Re[2]: User domain starts with a crash loop when memory configured is above 500GB



Thank you, Juergen, for the tip. It was dead on. I made those changes, and I can now boot user domains larger than 500GB. The kernel crash messages went away as well.

Cheers,
Robert


------ Original Message ------
From "Juergen Gross" <jgross@xxxxxxxx>
To "Robert Polasek" <polasekr@xxxxxxxxx>; xen-users@xxxxxxxxxxxxxxxxxxxx
Date 2023-09-19 10:35:24
Subject Re: User domain starts with a crash loop when memory configured is above 500GB

On 18.09.23 17:34, Robert Polasek wrote:
Hi everybody,

I have a server with 760GB of RAM. Only domain 0 is running there, with 16GB
of RAM assigned to it.

Here is a configuration for my user domain:

name = "node01"
kernel = "/boot/vmlinuz-5.15.0-82-generic"
root = "/dev/xvda"
memory = 614400
maxmem = 614400
vcpus = 32
maxvcpus = 32
disk = ['file:/vserver/images/node01.img,xvda,w']
vif = ['bridge=virbr0,mac=00:16:3e:01:01:02']
iommu = "soft"
swiotlb = "force"
pci_permissive = 1
pci = ['0000:3e:00.0','0000:3f:00.0','0000:40:00.0','0000:41:00.0','0000:b1:00.0','0000:b2:00.0']

nics = 1
dhcp = "off"
ip = "192.168.122.15"
netmask = "255.255.255.0"
gateway = "192.168.122.1"
hostname = "node01"

extra="3"

When I try to start the domain, it spins in a crash loop with the following
error messages:

[ 6864.140170] WARNING: CPU: 2 PID: 266 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x197/0x200
[ 6864.140183] Modules linked in:
[ 6864.140190] CPU: 2 PID: 266 Comm: xen-balloon Tainted: G      D W          5.15.0-82-generic #91-Ubuntu
[ 6864.140203] RIP: e030:xen_mc_flush+0x197/0x200
[ 6864.140212] Code: 77 65 89 c0 48 c1 e0 05 48 05 00 20 00 81 ff d0 0f 1f 00 49 89 45 18 48 85 c0 0f 89 17 ff ff ff 45 8b 4d 00 41 bf 01 00 00 00 <0f> 0b 48 c7 c7 f0 8e 5b 82 44 89 ca 44 89 fe 45 31 f6 65 8b 0d e8
[ 6864.140234] RSP: e02b:ffffc90041027b88 EFLAGS: 00010002
[ 6864.140243] RAX: 0000000000000001 RBX: 0000000000000040 RCX: 0000000000000000
[ 6864.140253] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff89009809e310
[ 6864.140264] RBP: ffffc90041027bb8 R08: ffff888168dc0000 R09: 0000000000000002
[ 6864.140275] R10: 0000000000000200 R11: ffff8900980b7690 R12: 0000000000000000
[ 6864.140286] R13: ffff89009809e300 R14: 0000000000000002 R15: 0000000000000001
[ 6864.140303] FS:  0000000000000000(0000) GS:ffff890098080000(0000) knlGS:0000000000000000
[ 6864.140315] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6864.140324] CR2: 0000000000000000 CR3: 0000000002e10000 CR4: 0000000000050660
[ 6864.140339] Call Trace:
[ 6864.140344]  <TASK>
[ 6864.140349]  ? __raw_callee_save_xen_make_pte+0x15/0x27
[ 6864.140359]  xen_mc_issue+0x61/0x80
[ 6864.140367]  xen_alloc_pte+0xd8/0x290
[ 6864.140376]  pmd_populate_kernel.constprop.0+0x4b/0xa0
[ 6864.140387]  vmemmap_pmd_populate+0x69/0x79
[ 6864.140395]  vmemmap_populate_basepages+0x68/0xb3
[ 6864.140405]  vmemmap_populate+0x2a/0xa9
[ 6864.140412]  __populate_section_memmap+0x3c/0x57
[ 6864.140422]  sparse_add_section+0x12b/0x1dc
[ 6864.140431]  __add_pages+0xac/0x150
[ 6864.140440]  add_pages+0x17/0x70
[ 6864.140447]  arch_add_memory+0x45/0x60
[ 6864.140455]  add_memory_resource+0x12c/0x320
[ 6864.140467]  reserve_additional_memory+0x10f/0x160
[ 6864.140476]  balloon_thread+0x337/0x500
[ 6864.140483]  ? wait_woken+0x70/0x70
[ 6864.140492]  ? reserve_additional_memory+0x160/0x160
[ 6864.140501]  kthread+0x127/0x150
[ 6864.140509]  ? set_kthread_struct+0x50/0x50
[ 6864.140518]  ret_from_fork+0x1f/0x30
[ 6864.140528]  </TASK>
[ 6864.140533] ---[ end trace 3bca9737718a46b2 ]---
[ 6864.140541] 1 of 2 multicall(s) failed: cpu 2
[ 6864.140549]   call  2: op=26 arg=[ffff89009809eb10] result=-22

Any suggestions as to what I am doing wrong? There should be plenty of RAM to
start a 600GB domain. I can start a user domain with 500GB with no problem.
Thank you in advance for your help and suggestions.

I think your kernel has been configured with CONFIG_XEN_512GB.

You should try to add "xen_512gb_limit=0" to your guest's command line.
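
As a rough illustration, assuming the xl configuration quoted above is the one
in use, the parameter could be appended to the existing extra line, since xl
adds the contents of "extra" to the guest kernel command line. The combined
value below is a sketch, not a tested setting. For scale, memory = 614400 is
in MiB, i.e. 614400 / 1024 = 600 GiB, which is above the 512 GiB limit.

```
# Sketch only: the original extra="3" with the suggested parameter appended.
# xl appends this string to the guest kernel command line.
extra = "3 xen_512gb_limit=0"
```

After restarting the guest, /proc/cmdline inside the domain should list
xen_512gb_limit=0 if the option was passed through.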

Even if this fixes your boot issue, the guest shouldn't show the error
you are seeing.


Juergen




 

