
Re: Issue on unmap_grant_pages before releasing dmabuf


  • To: Juergen Gross <jgross@xxxxxxxx>
  • From: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>
  • Date: Wed, 19 Oct 2022 12:46:07 +0000
  • Accept-language: en-US
  • Cc: "jennifer.herbert@xxxxxxxxxx" <jennifer.herbert@xxxxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "maarten.lankhorst@xxxxxxxxxxxxxxx" <maarten.lankhorst@xxxxxxxxxxxxxxx>, "mripard@xxxxxxxxxx" <mripard@xxxxxxxxxx>, "tzimmermann@xxxxxxx" <tzimmermann@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "dri-devel@xxxxxxxxxxxxxxxxxxxxx" <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 19 Oct 2022 12:46:32 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHY46egGI7L6jjRBkOb9R/tnZLb7K4VpcKAgAAEcoA=
  • Thread-topic: Issue on unmap_grant_pages before releasing dmabuf

On Wed, Oct 19, 2022 at 02:30:13PM +0200, Juergen Gross wrote:
> On 19.10.22 12:43, Oleksii Moisieiev wrote:
> > Greetings.
> > 
> > I need your advice about the problem I'm facing right now:
> > I'm working on a new type of dmabuf export implementation. This
> > is going to be a new ioctl for gntdev; its purpose is to use an
> > external buffer, provided via a file descriptor, as the backing
> > storage when exporting to grant refs.
> > A few words about the functionality I'm working on right now:
> > My setup is based on IMX8QM (please see the PS below if you need
> > configuration details).
> > 
> > When using the dma-buf exporter to create a dma-buf with backing
> > storage and map it to the grant refs provided by the domain, we've hit
> > a problem: some hardware (the i.MX8 GPU in our case) does not support
> > external buffers and requires the backing storage to be created using
> > its native tools (eglCreateImageKHR returns EGL_NO_IMAGE_KHR for
> > buffers which were not created using gbm_bo_create).
> > That's why new ioctls were added to be able to pass an existing
> > dma-buf fd as an input parameter and use it as the backing storage for
> > exporting to refs.
> > The kernel version on the IMX8QM board is 5.10.72 and it works fine on
> > this kernel version.
> > 
> > New ioctls source code can be found here:
> >   
> > https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2
> > Now regarding the problem I've met when I rebased this code on master:
> > On my test stand I use the
> > DRM_IOCTL_MODE_CREATE_DUMB/DRM_IOCTL_MODE_DESTROY_DUMB ioctls
> > to allocate a buffer, and I'm observing the following backtrace on
> > DRM_IOCTL_MODE_DESTROY_DUMB:
> > 
> > Unable to handle kernel paging request at virtual address 0000000387000098
> > Mem abort info:
> >    ESR = 0x0000000096000005
> >    EC = 0x25: DABT (current EL), IL = 32 bits
> >    SET = 0, FnV = 0
> >    EA = 0, S1PTW = 0
> >    FSC = 0x05: level 1 translation fault
> > Data abort info:
> >    ISV = 0, ISS = 0x00000005
> >    CM = 0, WnR = 0
> > user pgtable: 4k pages, 48-bit VAs, pgdp=000000006df98000
> > [0000000387000098] pgd=0800000064f4f003, p4d=0800000064f4f003, 
> > pud=0000000000000000
> > Internal error: Oops: 96000005 [#1] PREEMPT SMP
> > Modules linked in: xen_pciback overlay crct10dif_ce ip_tables x_tables ipv6
> > CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 6.0.0 #85
> > Hardware name: linux,dummy-virt (DT)
> > Workqueue: events virtio_gpu_dequeue_ctrl_func
> > pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : check_move_unevictable_folios+0xb8/0x4d0
> > lr : check_move_unevictable_folios+0xb4/0x4d0
> > sp : ffff8000081a3ad0
> > x29: ffff8000081a3ad0 x28: ffff03856ac98800 x27: 0000000000000000
> > x26: ffffde7b168ee9d8 x25: ffff03856ae26008 x24: 0000000000000000
> > x23: ffffde7b1758d6c0 x22: 0000000000000001 x21: ffff8000081a3b68
> > x20: 0000000000000001 x19: fffffc0e15935040 x18: ffffffffffffffff
> > x17: ffff250a68e3d000 x16: 0000000000000012 x15: ffff8000881a38d7
> > x14: 0000000000000000 x13: ffffde7b175a3150 x12: 0000000000002c55
> > x11: 0000000000000ec7 x10: ffffde7b176113f8 x9 : ffffde7b175a3150
> > x8 : 0000000100004ec7 x7 : ffffde7b175fb150 x6 : ffff8000081a3b70
> > x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff03856ac98850
> > x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000387000000
> > Call trace:
> >   check_move_unevictable_folios+0xb8/0x4d0
> >   check_move_unevictable_pages+0x8c/0x110
> >   drm_gem_put_pages+0x118/0x198
> >   drm_gem_shmem_put_pages_locked+0x4c/0x70
> >   drm_gem_shmem_unpin+0x30/0x50
> >   virtio_gpu_cleanup_object+0x84/0x130
> >   virtio_gpu_cmd_unref_cb+0x18/0x2c
> >   virtio_gpu_dequeue_ctrl_func+0x124/0x290
> >   process_one_work+0x1d0/0x320
> >   worker_thread+0x14c/0x444
> >   kthread+0x10c/0x110
> >   ret_from_fork+0x10/0x20
> > Code: 97fc3fe1 aa1303e0 94003ac7 b4000080 (f9404c00)
> > ---[ end trace 0000000000000000 ]---
> > 
> > After some investigation I think I've found the cause of the problem:
> > It is the functionality added in commit
> > 3f9f1c67572f5e5e6dc84216d48d1480f3c4fcf6, which
> > introduces a safe mechanism to unmap grant pages: it waits until
> > page_count(page) == 1
> > before doing the unmap.
> > The problem is that DRM_IOCTL_MODE_CREATE_DUMB creates a buffer where
> > page_count(page) == 2.
> > 
> > On my QEMU test stand I'm using Xen 4.16 (aarch64) with a Debian-based
> > Dom0 + DomU on the latest
> > kernels.
> > I've created some apps for testing:
> > The first one allocates grant refs on DomU:
> > https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2
> > It is named test.ko and can be built using the command:
> > cd ./test; make
> > NOTE: the makefile expects the kernel build to be present in the
> > ../../test-build directory.
> > 
> > It should be run on DomU using the command:
> > insmod test.ko; echo "create" > /sys/kernel/etx_sysfs/etx_value
> > 
> > Result will be the following:
> > [  126.104903] test: loading out-of-tree module taints kernel.
> > [  126.150586] Sysfs - Write!!!
> > [  126.150773] create
> > [  126.150773]
> > [  126.150888] Hello, World!
> > [  126.151203] Hello, World!
> > [  126.151324] gref 301
> > [  126.151376] addr ffff00000883d000
> > [  126.151431] gref 302
> > [  126.151454] addr ffff00000883e000
> > [  126.151478] gref 303
> > [  126.151497] addr ffff00000883f000
> > [  126.151525] gref 304
> > [  126.151546] addr ffff000008840000
> > [  126.151573] gref 305
> > [  126.151593] addr ffff000008841000
> > 
> > The second one is for Dom0 and can be found here:
> > https://github.com/oleksiimoisieiev/xen/tree/gntdev_fd
> > 
> > How to build:
> > make -C tools/console all
> > 
> > Result: ./tools/console/gnt_test should be uploaded to Dom0
> > 
> > Start: sudo ./gnt_test_map 1 301 302 303 304 305
> > where 1 is the DomU ID and 301 302 303 304 305 are the grefs from the
> > test.ko output.
> > 
> > This will create a buffer using the DRM_IOCTL_MODE_CREATE_DUMB ioctl,
> > then pass it as the backing
> > storage to gntdev, and then destroy it using DRM_IOCTL_MODE_DESTROY_DUMB.
> > The problem is that when the dumb buffer is created we observe
> > page_count(page) == 2. So
> > when I try to unmap the grant refs using unmap_grant_pages before the
> > buffer release, it calls
> > __gnttab_unmap_refs_async, which postpones the actual unmapping by 5 ms
> > because
> > page_count(page) > 1.
> > This causes drm_gem_put_pages to try to free pages which are still mapped.
> > Also, if in the following line:
> > https://github.com/torvalds/linux/blob/bb1a1146467ad812bb65440696df0782e2bc63c8/drivers/xen/grant-table.c#L1313
> > I change page_count(item->pages[pc]) > 1 to page_count(item->pages[pc])
> > > 2, everything works fine.
> > The obvious fix I see for this issue is to make the expected page_count
> > for __gnttab_unmap_refs_async configurable for each buffer, but I'm not
> > sure if this is the
> > best solution.
> > 
> > I would be happy to hear your thoughts and advice about how to fix this
> > situation.
> 
> My first thought would be to save the page_count() of each page when doing
> the map operation, and then compare to that value.
> 
> The natural place to store this count would be struct xen_page_foreign,
> but there are only 16 bits free for the 64-bit system case (it is using
> the struct page->private field for that purpose), so you'd need to bail
> out in case page_count() is > 65535.
> 
> 
> Juergen

Hello Juergen,

Thank you for the advice.
I think I'll use this approach if everybody is fine with it.
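To make sure I understand the idea correctly, here is a rough userspace model
of the logic I'd implement (plain C; the names `foreign_page`,
`expected_count`, `record_count_at_map` and `must_defer_unmap` are my own
placeholders for illustration, not existing kernel fields or functions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Model: at map time we record the page's current refcount; at unmap
 * time we defer only while the current count exceeds the recorded one,
 * instead of comparing against the hard-coded value 1. */
struct foreign_page {
    long page_count;         /* models page_count(page) */
    uint16_t expected_count; /* recorded at map time; only 16 bits free */
};

static bool record_count_at_map(struct foreign_page *p)
{
    if (p->page_count > UINT16_MAX)
        return false; /* bail out: count does not fit in 16 bits */
    p->expected_count = (uint16_t)p->page_count;
    return true;
}

static bool must_defer_unmap(const struct foreign_page *p)
{
    /* Extra references taken since mapping -> wait before unmapping. */
    return p->page_count > p->expected_count;
}
```

With this, a dumb buffer mapped while already at page_count(page) == 2 would
be compared against 2 at unmap time instead of the hard-coded 1, so the 5 ms
deferral would only trigger for references taken after the map.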

Best regards,
Oleksii.


 

