
Re: [Xen-users] domU fails to boot with more than 32 vcpus on kernel 4.1+ - xen_netfront: can't alloc rx grant refs



On Wed, 2016-01-20 at 12:53 +0100, hydra wrote:
> Booting up a xen domU with more than 32 vcpus fails with kernel 4.1.15
> (also tested with 4.4.0). This works without problems on kernel 3.14.

You can work around this by using the gnttab_max_frames _hypervisor_
command line parameter[0] (called gnttab_max_nr_frames in Xen 4.4 and
earlier) to increase the number of grant table pages available to guests
(try doubling it from the default of 32 to 64).
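
As a rough sketch of why doubling should be enough, and of one way to set
the option: the figures below (8-byte version 1 grant entries in 4 KiB
frames, 256 TX plus 256 RX grant references per netfront queue, one queue
per vcpu) are assumptions and do not come from this thread, and the GRUB
variable is the one Debian-style grub2 scripts read, so it may be named
differently on your distribution. The estimate also ignores grants used by
blkfront and other frontends.

```shell
# Assumed figures (not from the thread): 4 KiB frames, 8-byte v1 grant entries.
entries_per_frame=$((4096 / 8))
# Assumed: each netfront queue reserves 256 TX + 256 RX grant references.
refs_per_queue=$((256 + 256))

vcpus=40                                   # from the domU config; assumes one queue per vcpu
needed=$((vcpus * refs_per_queue))         # grant refs netfront would ask for
default_pool=$((32 * entries_per_frame))   # gnttab_max_frames=32 (default)
doubled_pool=$((64 * entries_per_frame))   # gnttab_max_frames=64

echo "needed=$needed default=$default_pool doubled=$doubled_pool"

# Hypervisor (not dom0 kernel) command line, e.g. in /etc/default/grub.
# Append to any options already set there, regenerate grub.cfg and reboot
# the host. On Xen 4.4 and earlier the option is gnttab_max_nr_frames.
GRUB_CMDLINE_XEN_DEFAULT="gnttab_max_frames=64"
```

Under those assumptions 40 queues want more references than 32 frames
provide, while 64 frames leave plenty of headroom.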

Apart from reducing the number of vcpus, you can also use the max_queues
option of either netfront (in the guest) or netback (in dom0) to limit the
number of queues, although that was broken until recently (fixed in Linux
4.3, I think) and I'm not sure where it has been backported to.
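
A minimal sketch of how the limit might be passed, assuming the module
parameter is spelled xen_netfront.max_queues on the guest kernel command
line and max_queues for the xen-netback module in dom0. The value 8 is
only an illustration, and the modprobe.d file name is hypothetical.

```shell
# Illustrative limit only; pick whatever suits the guest.
max_queues=8

# Guest side: kernel command line parameter, added to the "extra" line
# of the domU config shown in the original report.
guest_extra="root=/dev/xvda1 net.ifnames=0 xen_netfront.max_queues=${max_queues}"
echo "extra = \"${guest_extra}\""

# dom0 side: module option line, e.g. for /etc/modprobe.d/xen-netback.conf
# (hypothetical file name); takes effect when the module is next loaded.
netback_option="options xen-netback max_queues=${max_queues}"
echo "${netback_option}"
```

Either side alone should be enough to cap the number of queues, provided
the kernel in question carries the fix mentioned above.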

I think the resulting crash which you report below has also been fixed in
4.3. Did you see it with Linux 4.4.0 as well as 4.1.15 (hopefully not)?

Ian.

[0] http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html

> 
> The domU configuration is:
> 
> name = "node"
> kernel = "kernel-4.1.15-gentoo-xen"
> extra = "root=/dev/xvda1 net.ifnames=0"
> memory = 5000
> vcpus = 40
> vif = [ '' ]
> disk = [
> '/dev/vg_data/node_root,raw,xvda1,rw'
> ]
> 
> # xl create -c node.cfg
> 
> ...
> [    0.407136] io scheduler deadline registered
> [    0.407209] io scheduler cfq registered (default)
> [    0.407840] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [    0.408174] Non-volatile memory driver v1.3
> [    0.408318] xen:xen_evtchn: Event-channel device installed
> [    0.408488] [drm] Initialized drm 1.1.0 20060810
> [    0.415157] loop: module loaded
> [    0.444874] blkfront: xvda1: barrier or flush: disabled; persistent
> grants: enabled; indirect descriptors: enabled;
> [    0.505838] tun: Universal TUN/TAP device driver, 1.6
> [    0.505862] tun: (C) 1999-2004 Max Krasnyansky <maxk@xxxxxxxxxxxx>
> [    0.506039] xen_netfront: Initialising Xen virtual ethernet driver
> [    0.521246] blkfront: xvda3: barrier or flush: disabled; persistent
> grants: enabled; indirect descriptors: enabled;
> [    0.533589] xen_netfront: can't alloc rx grant refs
> [    0.533612] net eth0: only created 31 queues
> [    0.534907] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000018
> [    0.534923] IP: [<ffffffff814e4553>] netback_changed+0x7b3/0xce0
> [    0.534941] PGD 0
> [    0.534948] Oops: 0000 [#1] SMP
> [    0.534960] CPU: 4 PID: 176 Comm: xenwatch Not tainted 4.1.15-gentoo
> #2
> [    0.534969] task: ffff88013001ad00 ti: ffff8801301a0000 task.ti:
> ffff8801301a0000
> [    0.534980] RIP: e030:[<ffffffff814e4553>]  [<ffffffff814e4553>]
> netback_changed+0x7b3/0xce0
> [    0.534993] RSP: e02b:ffff8801301a3d88  EFLAGS: 00010202
> [    0.535003] RAX: 0000000000000000 RBX: ffff8800fe8e8000 RCX:
> 0000000000000001
> [    0.535005] RDX: 0000000000000001 RSI: ffff8800fe9440f8 RDI:
> 0000000000003f41
> [    0.535005] RBP: ffff8801301a3e18 R08: ffffc90001680000 R09:
> ffffffff81932aa7
> [    0.535005] R10: ffffea0003fa3a80 R11: ffffea0004bf6000 R12:
> 0000000000044000
> [    0.535005] R13: ffff8800fe9440f8 R14: ffff8800fe8e9000 R15:
> ffff8800fe944000
> [    0.535005] FS:  0000000000000000(0000) GS:ffff880131e80000(0000)
> knlGS:0000000000000000
> [    0.535005] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.535005] CR2: 0000000000000018 CR3: 0000000001a0c000 CR4:
> 0000000000042660
> [    0.535005] Stack:
> [    0.535005]  ffff8801301a3e08 ffff8800fe941e04 ffff8801301bac00
> ffff880000000001
> [    0.535005]  ffff88013033f000 ffff8801301bac00 00000028301a3e14
> ffff880100000020
> [    0.535005]  0000000131e96c00 00000001307b8d60 0000002800000001
> ffff880100003f41
> [    0.535005] Call Trace:
> [    0.535005]  [<ffffffff81458db8>] xenbus_otherend_changed+0x98/0xa0
> [    0.535005]  [<ffffffff81457690>] ? find_watch+0x50/0x50
> [    0.535005]  [<ffffffff8145a3ae>] backend_changed+0xe/0x10
> [    0.535005]  [<ffffffff8145772f>] xenwatch_thread+0x9f/0x140
> [    0.535005]  [<ffffffff81087b70>] ? __wake_up_common+0x90/0x90
> [    0.535005]  [<ffffffff8106b704>] kthread+0xc4/0xe0
> [    0.535005]  [<ffffffff81010000>] ? __xen_send_IPI_mask+0x30/0x50
> [    0.535005]  [<ffffffff8106b640>] ? __kthread_parkme+0x80/0x80
> [    0.535005]  [<ffffffff81649f22>] ret_from_fork+0x42/0x70
> [    0.535005]  [<ffffffff8106b640>] ? __kthread_parkme+0x80/0x80
> [    0.535005] Code: 89 f7 e8 d1 d9 c1 ff 41 8b bd 58 01 00 00 31 f6 e8
> b3 bb f6 ff 31 f6 4c 89 ff e8 b9 d9 c1 ff e9 74 ff ff ff 49 8b 47 20 4c
> 89 ee <48> 8b 78 18 e8 94 1e f7 ff 85 c0 0f 88 d4 fd ff ff 49 8b 47 20
> [    0.535005] RIP  [<ffffffff814e4553>] netback_changed+0x7b3/0xce0
> [    0.535005]  RSP <ffff8801301a3d88>
> [    0.535005] CR2: 0000000000000018
> [    0.535005] ---[ end trace e66fd859d9634ab5 ]---
> [    0.573507] xenwatch (176) used greatest stack depth: 13360 bytes left
> [    1.387133] clocksource tsc: mask: 0xffffffffffffffff max_cycles:
> 0x2ca24e83e81, max_idle_ns: 440795284478 ns
> [    1.538503] i8042: No controller found
> [    1.558800] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
> [    1.558930] rtc_cmos: probe of rtc_cmos failed with error -38
> [    1.559236] device-mapper: ioctl: 4.31.0-ioctl (2015-3-12)
> initialised: dm-devel@xxxxxxxxxx
> [    1.559300] hidraw: raw HID events driver (C) Jiri Kosina
> [    1.559687] Netfilter messages via NETLINK v0.30.
> [    1.559705] nfnl_acct: registering with nfnetlink.
> [    1.559738] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
> [    1.560111] ctnetlink v0.93: registering with nfnetlink.
> [    1.560469] xt_time: kernel timezone is -0000
> [    1.560495] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
> [    1.560524] IPVS: Connection hash table configured (size=4096,
> memory=64Kbytes)
> [    1.560585] IPVS: Creating netns size=2144 id=0
> [    1.560612] IPVS: ipvs loaded.
> [    1.560627] IPVS: [rr] scheduler registered.
> [    1.560638] IPVS: [wrr] scheduler registered.
> [    1.560651] IPVS: [lc] scheduler registered.
> [    1.560662] IPVS: [wlc] scheduler registered.
> [    1.560673] IPVS: [fo] scheduler registered.
> [    1.560690] IPVS: [lblc] scheduler registered.
> [    1.560705] IPVS: [lblcr] scheduler registered.
> [    1.560717] IPVS: [dh] scheduler registered.
> [    1.560728] IPVS: [sh] scheduler registered.
> [    1.560740] IPVS: [sed] scheduler registered.
> [    1.560751] IPVS: [nq] scheduler registered.
> [    1.560765] IPVS: [sip] pe registered.
> [    1.560909] ip_tables: (C) 2000-2006 Netfilter Core Team
> [    1.561060] ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
> [    1.561085] arp_tables: (C) 2002 David S. Miller
> [    1.561132] Initializing XFRM netlink socket
> [    1.561151] NET: Registered protocol family 17
> [    1.561484] registered taskstats version 1
> [    1.562651] Btrfs loaded
> [    1.563121] Key type encrypted registered
> [    6.663125] xenbus_probe_frontend: Waiting for devices to initialise:
> 25s...20s...15s...
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> http://lists.xen.org/xen-users


 

