[Xen-devel] Trying to unmap invalid handle! pending_idx: @ drivers/net/xen-netback/netback.c:998 causes kernel panic/reboot
Dear Xen Developers!

We're running Xen on multiple machines, most of them Dell R410 or Supermicro X8DTL, each with one E5645 CPU and 48 GB of RAM. We updated the kernel to 3.15.4; after that, some of our hypervisors started rebooting at random times. The logs were empty and gave us no information about the crashes. We tried a few tricks, and in the end the netconsole kernel module helped, giving us a very thin layer of remote kernel logging. We found the following in the remote logs:

Jul 13 00:46:58 node11 [157060.106323] vif vif-2-0 h14z4mzbvfrrhb: Trying to unmap invalid handle! pending_idx: c
Jul 13 00:46:58 node11 [157060.106476] ------------[ cut here ]------------
Jul 13 00:46:58 node11 [157060.106546] kernel BUG at drivers/net/xen-netback/netback.c:998!
Jul 13 00:46:58 node11 [157060.106616] invalid opcode: 0000 [#1] SMP
Jul 13 00:46:58 node11 [157060.106765] Modules linked in: netconsole configfs nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_physdev sch_tbf dm_snapshot dm_bufio arptable_filter arp_tables ip6t_REJECT ip6table_mangle ipt_REJECT iptable_filter ip_tables bridge xen_pciback xen_gntalloc autofs4 dm_round_robin scsi_dh_alua 8021q mrp garp stp llc bonding xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath scsi_dh xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd ufs gpio_ich iTCO_wdt iTCO_vendor_support joydev psmouse serio_raw pcspkr tpm_infineon i2c_i801 lpc_ich hid_generic e1000e ptp pps_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore ioatdma dca mac_hid i7core_edac edac_core be2iscsi iscsi_boot_sysfs libiscsi scsi_transport_iscsi be2net vxlan ahci(E) libahci(E) usbhid(E) hid(E) [last unloaded: evbug]
Jul 13 00:46:58 node11 [157060.112705] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G E 3.15.4 #1
Jul 13 00:46:58 node11 [157060.112776] Hardware name: Supermicro X8DTL/X8DTL, BIOS 1.1b 03/19/2010
Jul 13 00:46:58 node11 [157060.112848] task: ffffffff81c1b480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
Jul 13 00:46:58 node11 [157060.112936] RIP: e030:[<ffffffffa025f61d>] [<ffffffffa025f61d>] xenvif_idx_unmap+0x11d/0x130 [xen_netback]
Jul 13 00:46:58 node11 [157060.113078] RSP: e02b:ffff88008ea03d48 EFLAGS: 00010292
Jul 13 00:46:58 node11 [157060.113147] RAX: 000000000000004a RBX: 000000000000000c RCX: 0000000000000000
Jul 13 00:46:58 node11 [157060.113234] RDX: ffff88008a40b600 RSI: ffff88008ea03a18 RDI: 000000000000021b
Jul 13 00:46:58 node11 [157060.113321] RBP: ffff88008ea03d88 R08: 0000000000000000 R09: ffff88008a40b600
Jul 13 00:46:58 node11 [157060.113408] R10: ffff88008a0004e8 R11: 00000000000006d8 R12: ffff8800569708c0
Jul 13 00:46:58 node11 [157060.113495] R13: ffff88006558fec0 R14: ffff8800569708c0 R15: 0000000000000001
Jul 13 00:46:58 node11 [157060.113589] FS: 00007f351684b700(0000) GS:ffff88008ea00000(0000) knlGS:0000000000000000
Jul 13 00:46:58 node11 [157060.113679] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 13 00:46:58 node11 [157060.113747] CR2: 00007fc2a4372000 CR3: 00000000049f3000 CR4: 0000000000002660
Jul 13 00:46:58 node11 [157060.113835] Stack:
Jul 13 00:46:58 node11 [157060.113896] ffff880056979f90 ff00000000000001 ffff880b0605e000 0000000000000000
Jul 13 00:46:58 node11 [157060.114143] ffff0000ffffffff 00000000fffffff6 0000000000000001 ffff8800569769d0
Jul 13 00:46:58 node11 [157060.114390] ffff88008ea03e58 ffffffffa02622fc ffff88008ea03dd8 ffffffff810b5223
Jul 13 00:46:58 node11 [157060.114637] Call Trace:
Jul 13 00:46:58 node11 [157060.114700] <IRQ>
Jul 13 00:46:58 node11 [157060.114750] [<ffffffffa02622fc>] xenvif_tx_action+0x27c/0x7f0 [xen_netback]
Jul 13 00:46:58 node11 [157060.114927] [<ffffffff810b5223>] ? __wake_up+0x53/0x70
Jul 13 00:46:58 node11 [157060.114998] [<ffffffff810ca077>] ? handle_irq_event_percpu+0xa7/0x1b0
Jul 13 00:46:58 node11 [157060.115073] [<ffffffffa02647d1>] xenvif_poll+0x31/0x64 [xen_netback]
Jul 13 00:46:58 node11 [157060.115147] [<ffffffff81653d4b>] net_rx_action+0x10b/0x290
Jul 13 00:46:58 node11 [157060.115221] [<ffffffff81071c73>] __do_softirq+0x103/0x320
Jul 13 00:46:58 node11 [157060.115292] [<ffffffff81072015>] irq_exit+0x135/0x140
Jul 13 00:46:58 node11 [157060.115363] [<ffffffff8144759c>] xen_evtchn_do_upcall+0x3c/0x50
Jul 13 00:46:58 node11 [157060.115436] [<ffffffff8175c07e>] xen_do_hypervisor_callback+0x1e/0x30
Jul 13 00:46:58 node11 [157060.115506] <EOI>
Jul 13 00:46:58 node11 [157060.115551] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jul 13 00:46:58 node11 [157060.115722] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jul 13 00:46:58 node11 [157060.115794] [<ffffffff8100a200>] ? xen_safe_halt+0x10/0x20
Jul 13 00:46:58 node11 [157060.115869] [<ffffffff8101dbbf>] ? default_idle+0x1f/0xc0
Jul 13 00:46:58 node11 [157060.115939] [<ffffffff8101d38f>] ? arch_cpu_idle+0xf/0x20
Jul 13 00:46:58 node11 [157060.116009] [<ffffffff810b5aa1>] ? cpu_startup_entry+0x201/0x360
Jul 13 00:46:58 node11 [157060.116084] [<ffffffff817420a7>] ? rest_init+0x77/0x80
Jul 13 00:46:58 node11 [157060.116156] [<ffffffff81d3a156>] ? start_kernel+0x406/0x413
Jul 13 00:46:58 node11 [157060.116227] [<ffffffff81d39b6e>] ? repair_env_string+0x5b/0x5b
Jul 13 00:46:58 node11 [157060.116298] [<ffffffff81d39603>] ? x86_64_start_reservations+0x2a/0x2c
Jul 13 00:46:58 node11 [157060.116373] [<ffffffff81d3d5dc>] ? xen_start_kernel+0x584/0x586
Jul 13 00:46:58 node11 [157060.116446] Code: 41 5c c9 c3 48 be 00 00 00 80 ff 77 00 00 e9 62 ff ff ff 49 8b bc 24 78 ba 00 00 89 da 48 c7 c6 30 57 26 a0 31 c0 e8 73 df 3e e1 <0f> 0b eb fe 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48
Jul 13 00:46:58 node11 [157060.119179] RIP [<ffffffffa025f61d>] xenvif_idx_unmap+0x11d/0x130 [xen_netback]
Jul 13 00:46:58 node11 [157060.119312] RSP <ffff88008ea03d48>
Jul 13 00:46:58 node11 [157060.119395] ---[ end trace 7e021c96c8cfea53 ]---
Jul 13 00:46:58 node11 [157060.119465] Kernel panic - not syncing: Fatal exception in interrupt

h14z4mzbvfrrhb is the name of a VIF that belongs to a Windows Server 2008 R2 x64 virtual machine. We have had 6 random reboots so far; in every case the VIF belonged to a guest running the same operating system, but to a different virtual machine. So only the virtual interfaces of Windows Server 2008 R2 x64 systems have caused the crashes. These systems were provisioned from different installs or templates, and their GPLPV driver versions also differ.

xm info output:

[root@c2-node11 ~]# xm info
host : c2-node11
release : 3.15.4
version : #1 SMP Tue Jul 8 17:58:26 CEST 2014
machine : x86_64
nr_cpus : 12
nr_nodes : 1
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2400
hw_caps : bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 49143
free_memory : 41830
free_cpus : 0
xen_major : 4
xen_minor : 2
xen_extra : .4-33.el6
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
xen_commandline : dom0_mem=3145728 pcie_aspm=off noreboot=true
cc_compiler : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
cc_compile_by : mockbuild
cc_compile_domain : centos.org
cc_compile_date : Mon Jun 16 17:22:14 UTC 2014
xend_config_format : 4

[root@c2-node11 ~]# uname -a
Linux c2-node11 3.15.4 #1 SMP Tue Jul 8 17:58:26 CEST 2014 x86_64 x86_64 x86_64 GNU/Linux

The xm create config file of the affected VM (the other VMs' config files are the same):

kernel = "/usr/lib/xen/boot/hvmloader"
device_model = "/usr/lib64/xen/bin/qemu-dm"
builder = "hvm"
memory = "2000"
name = "vna3mhwnv9pn4m"
vcpus = "1"
timer_mode = "2"
viridian = "1"
vif = [ "type=ioemu, mac=00:16:3e:64:c8:ba, bridge=x0evss6g1ztoa4, ip=..., vifname=h14z4mzbvfrrhb, rate=100Mb/s" ]
disk = [ "phy:/dev/q7jiqc2gh02b2b/xz7wget4ycmp77,ioemu:hda,w" ]
vnc = 1
vncpasswd="aaaaa1"
usbdevice="tablet"

The hypervisor's networking is set up as follows: we are using dual Emulex 10 Gbit network adapters with bonding (LACP), and on top of the bond we use VLANs for the VM, management and iSCSI traffic.

We tried to reproduce the error but could not; the crashes/reboots happened at random every time.

Thanks for your help,

- Armin Zentai

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel