
[Xen-devel] Race condition in gntdev driver



Hi Daniel and Konrad,

In an email thread at the end of last year ('Questions about GPLPV
stability tests'), I mentioned a certain bug in the gntdev driver. I
think I have now found the cause and would like your opinion. Since
Daniel did a lot of work on gntdev recently, I think he is most
familiar with the code.

The systems on which I see this bug run Xen 4.1.2 and Linux 2.6.32.48
(pvops), but the same bug should also be present in Linux 3.x, since
the code is similar.

Let me summarize the problem. At some point a DomU is shut down.
Apparently the shutdown takes too long according to Xend, which then
sends SIGKILL to qemu-dm:
[2012-01-16 14:54:34 3419] INFO (XendDomainInfo:2078) Domain has shutdown: name=win7-1 id=7581 reason=poweroff.
[2012-01-16 14:54:34 3419] DEBUG (XendDomainInfo:3071) XendDomainInfo.destroy: domid=7581
[2012-01-16 14:54:36 3419] DEBUG (XendDomainInfo:2401) Destroying device model
[2012-01-16 14:54:46 3419] WARNING (image:649) DeviceModel 31742 took more than 10s to terminate: sending SIGKILL

This triggers the following Oops:
[533804.785589] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[533804.793913] IP: [<ffffffff814bdba9>] _spin_lock+0x5/0x15
[533804.799556] PGD 0
[533804.801896] Oops: 0002 [#1] SMP
[533804.805448] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/class
[533804.813736] CPU 0
[533804.816066] Modules linked in: dm_snapshot dm_mod ipmi_si ipmi_devintf
[533804.822914] Pid: 21632, comm: qemu-dm Not tainted 2.6.32.48 #1 X8STi
[533804.829595] RIP: e030:[<ffffffff814bdba9>]  [<ffffffff814bdba9>] _spin_lock+0x5/0x15
[533804.837873] RSP: e02b:ffff88005f187c50  EFLAGS: 00010202
[533804.843513] RAX: 0000000000000100 RBX: ffff88007d6923c0 RCX: 0000000000000003
[533804.851192] RDX: ffff88007deb3180 RSI: ffff88007d6923c0 RDI: 0000000000000008
[533804.858871] RBP: ffff88007e260128 R08: 0000000000000000 R09: 0000000000000000
[533804.866552] R10: ffff88007121eb40 R11: ffffffff811b862b R12: ffff88007fac1e40
[533804.874541] R13: ffff88007e3c3340 R14: ffff88007e3c3340 R15: ffff88007fdfbc00
[533804.882243] FS:  00007f5cdcefe6f0(0000) GS:ffff880028038000(0000) knlGS:0000000000000000
[533804.890874] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[533804.896948] CR2: 0000000000000008 CR3: 0000000001001000 CR4: 0000000000002660
[533804.904627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[533804.912306] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[533804.919985] Process qemu-dm (pid: 21632, threadinfo ffff88005f186000, task ffff880074f4e270)
[533804.928971] Stack:
[533804.931312]  ffffffff810a9ad0 ffff88007deb3180 ffff88007e260100 ffff88007e260100
[533804.938762] <0> ffffffff8124222d 0000000000000008 0000000000000008 ffff88007deb3180
[533804.946900] <0> ffffffff810b245e ffff88007fac1e40 ffff88007deb3180 0000000000000000
[533804.955465] Call Trace:
[533804.958244]  [<ffffffff810a9ad0>] ? mmu_notifier_unregister+0x27/0xa5
[533804.965019]  [<ffffffff8124222d>] ? gntdev_release+0xc3/0xd1
[533804.971007]  [<ffffffff810b245e>] ? __fput+0x100/0x1af
[533804.976477]  [<ffffffff810af998>] ? filp_close+0x5b/0x62
[533804.982119]  [<ffffffff81042989>] ? put_files_struct+0x64/0xc1
[533804.988280]  [<ffffffff810441f0>] ? do_exit+0x209/0x68d
[533804.993836]  [<ffffffff810441f8>] ? do_exit+0x211/0x68d
[533804.999390]  [<ffffffff810446e9>] ? do_group_exit+0x75/0x9c
[533805.005294]  [<ffffffff8104cf1d>] ? get_signal_to_deliver+0x30d/0x338
[533805.012063]  [<ffffffff81010f7a>] ? do_notify_resume+0x8a/0x6b1
[533805.018310]  [<ffffffff810bdb3a>] ? poll_select_copy_remaining+0xd0/0xf3
[533805.025339]  [<ffffffff81011c8e>] ? int_signal+0x12/0x17
[533805.030980] Code: 00 00 00 01 74 05 e8 67 1c d2 ff 48 89 d0 5e c3 ff 14 25 20 2d 6a 81 f0 81 2f 00 00 00 01 74 05 e8 4d 1c d2 ff c3 b8 00 01 00 00 <f0> 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3 f0 81 2f 00 00
[533805.050454] RIP  [<ffffffff814bdba9>] _spin_lock+0x5/0x15
[533805.056182]  RSP <ffff88005f187c50>
[533805.059997] CR2: 0000000000000008
[533805.063638] ---[ end trace 85ee7cbec9ce72ac ]---
[533805.068584] Fixing recursive fault but reboot is needed!


Below I have attached what I believe is the relevant part of the
gntdev driver:
static int gntdev_open(struct inode *inode, struct file *flip)
{
...
...
        /* get_task_mm() takes a reference on the mm (mm->mm_users++). */
        priv->mm = get_task_mm(current);
        if (!priv->mm) {
                kfree(priv);
                return -ENOMEM;
        }
        priv->mn.ops = &gntdev_mmu_ops;
        mmu_notifier_register(&priv->mn, priv->mm);
        /* The reference taken above is dropped again here (mm->mm_users--). */
        mmput(priv->mm);

        flip->private_data = priv;
...
}

static int gntdev_release(struct inode *inode, struct file *flip)
{
        struct gntdev_priv *priv = flip->private_data;
...
        /* Uses priv->mm, on which the driver no longer holds an
         * mm_users reference at this point. */
        mmu_notifier_unregister(&priv->mn, priv->mm);
        kfree(priv);
        return 0;
}

I think the bug is caused by gntdev not handling the reference
counting of mm->mm_users correctly, in combination with a race
condition. gntdev_open calls get_task_mm (which increments mm_users),
but right after storing 'mm' in 'priv' it calls mmput, which
decrements the reference count again. Since mmu_notifier_register
also takes a reference on the mm, the count is definitely positive
right after gntdev_open. I suspect that, due to a race condition
(after 10 seconds, xend sent a SIGKILL), the mm structure has already
been freed by the time gntdev_release runs, which causes the bug in
mmu_notifier_unregister.

If this is correct, then the 'mmput' call should be moved to
gntdev_release and placed after mmu_notifier_unregister, as sketched
below. What do you think? If you agree, I will send a patch for this.
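
To make the proposal concrete, here is a rough, untested sketch of
the change I have in mind. It only moves the mmput call from
gntdev_open to gntdev_release; the '...' parts again stand for the
code elided in the excerpt above:

static int gntdev_open(struct inode *inode, struct file *flip)
{
...
        priv->mm = get_task_mm(current);
        if (!priv->mm) {
                kfree(priv);
                return -ENOMEM;
        }
        priv->mn.ops = &gntdev_mmu_ops;
        mmu_notifier_register(&priv->mn, priv->mm);
        /* No mmput here: keep the mm_users reference taken by
         * get_task_mm for as long as the file is open. */

        flip->private_data = priv;
...
}

static int gntdev_release(struct inode *inode, struct file *flip)
{
        struct gntdev_priv *priv = flip->private_data;
...
        mmu_notifier_unregister(&priv->mn, priv->mm);
        /* Drop the reference taken in gntdev_open, now that the
         * notifier no longer needs the mm. */
        mmput(priv->mm);
        kfree(priv);
        return 0;
}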

Regards,
Roderick Colenbrander

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel