
Re: IOMMU problem on Xen dom0 arm (Was: Re: Xen on arm Chromebook seems to cause no display on screen)



(+ Bertrand, Stefano)

Hi Chuck,

Thanks for the report.

On 26/10/2023 17:17, Chuck Zmudzinski wrote:
On 10/25/2023 10:44 PM, Chuck Zmudzinski wrote:
We also have not yet done a thorough analysis of the differences
in the kernel boot logs when booting on the bare metal vs. booting
as dom0 on Xen, but nothing stood out in the logs as an obvious
cause of this problem after a quick look at the logs.

After a more careful look at the logs, this seems to be the error
causing no display when booting as dom0 on Xen:

*ERROR* Device 14450000.mixer lacks support for IOMMU

A little more context from the logs follows (I did not word wrap
the log messages to 72 characters because they are easier to read
without word wrapping).

On bare metal:

1999-12-31T20:03:21.728453-05:00 devuan-bunsen kernel: [    2.535938] [drm] Exynos DRM: using 14400000.fimd device for DMA mapping operations
1999-12-31T20:03:21.728461-05:00 devuan-bunsen kernel: [    2.536139] exynos-drm exynos-drm: bound 14400000.fimd (ops 0xc0d96354)
1999-12-31T20:03:21.728471-05:00 devuan-bunsen kernel: [    2.536274] exynos-drm exynos-drm: bound 14450000.mixer (ops 0xc0d97554)
1999-12-31T20:03:21.728480-05:00 devuan-bunsen kernel: [    2.536493] exynos-drm exynos-drm: bound 145b0000.dp-controller (ops 0xc0d97278)
1999-12-31T20:03:21.728491-05:00 devuan-bunsen kernel: [    2.536520] exynos-drm exynos-drm: bound 14530000.hdmi (ops 0xc0d97bd0)
...
1999-12-31T20:03:21.729272-05:00 devuan-bunsen kernel: [    3.493686] Console: switching to colour frame buffer device 170x48
1999-12-31T20:03:21.729282-05:00 devuan-bunsen kernel: [    3.521747] exynos-drm exynos-drm: [drm] fb0: exynosdrmfb frame buffer device
1999-12-31T20:03:21.729292-05:00 devuan-bunsen kernel: [    3.522831] [drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 0

The screen works normally in this case.

On Xen as dom0:

1999-12-31T20:01:09.722790-05:00 devuan-bunsen kernel: [    2.606812] [drm] Exynos DRM: using 14400000.fimd device for DMA mapping operations
1999-12-31T20:01:09.722795-05:00 devuan-bunsen kernel: [    2.606884] exynos-drm exynos-drm: bound 14400000.fimd (ops 0xc0d96354)
1999-12-31T20:01:09.722800-05:00 devuan-bunsen kernel: [    2.606999] exynos-mixer 14450000.mixer: [drm:exynos_drm_register_dma] *ERROR* Device 14450000.mixer lacks support for IOMMU
1999-12-31T20:01:09.722805-05:00 devuan-bunsen kernel: [    2.607044] exynos-drm exynos-drm: failed to bind 14450000.mixer (ops 0xc0d97554): -22
1999-12-31T20:01:09.722810-05:00 devuan-bunsen kernel: [    2.607162] exynos-drm exynos-drm: adev bind failed: -22
1999-12-31T20:01:09.722815-05:00 devuan-bunsen kernel: [    2.607183] exynos-dp: probe of 145b0000.dp-controller failed with error -22

There is no display on the screen in this case. The backlight
does not even come on.

So the error causing no display is probably:

*ERROR* Device 14450000.mixer lacks support for IOMMU

I am new to arm virtualization with Xen. I understand IOMMU on x86
is needed for PCI passthrough to domU guests, but not for dom0 to
use such devices.

I believe that the IOMMU would be required on x86 when using a PVH dom0. PVH is very similar to an Arm guest.

On Arm, we don't require the IOMMU because not all Arm platforms have all DMA-capable devices protected by an IOMMU. So dom0 will still have its memory direct mapped (i.e. host physical address = guest physical address) to allow DMA in dom0 with limited modification.

That said, I think the situation here is different (see below).


So on arm, why is dom0 trying to use IOMMU for
the exynos-mixer/exynos-drm when bare metal does not use it?

Just to confirm, are you using the same kernel and the same config when booting on bare metal? If so, from looking at the code, I would expect that the IOMMU is also used on bare metal.

The check failing is:

if (get_dma_ops(priv->dma_dev) != get_dma_ops(subdrv_dev))

I am not quite sure why the check implies the IOMMU is not supported. That said, I vaguely recall that Linux will update the DMA ops when running under Xen. Would you be able to print the two values returned ("%pS" should give the symbol)?
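
To make the comparison concrete, here is a minimal userspace sketch of the check, not the kernel code itself. The struct and function names (dma_ops_model, device_model, model_get_dma_ops, model_attach) are hypothetical stand-ins invented for illustration; only the pointer comparison, the error message and the -EINVAL (-22) result come from the check and logs quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's DMA ops table and struct device. */
struct dma_ops_model { const char *name; };
struct device_model {
    const char *name;
    const struct dma_ops_model *ops;
};

/* Models get_dma_ops(): returns the ops table attached to a device. */
static const struct dma_ops_model *model_get_dma_ops(const struct device_model *dev)
{
    assert(dev != NULL);
    return dev->ops;
}

/*
 * Models the failing check: the sub-device must share the ops table of the
 * device chosen for DMA mapping, otherwise the bind fails with -EINVAL (-22).
 */
static int model_attach(const struct device_model *dma_dev,
                        const struct device_model *subdrv_dev)
{
    const struct dma_ops_model *a = model_get_dma_ops(dma_dev);
    const struct dma_ops_model *b = model_get_dma_ops(subdrv_dev);

    /* Print both values; in the kernel, "%pS" would print the symbol instead. */
    printf("%s ops=%p, %s ops=%p\n", dma_dev->name, (const void *)a,
           subdrv_dev->name, (const void *)b);

    if (a != b) {
        fprintf(stderr, "Device %s lacks support for IOMMU\n", subdrv_dev->name);
        return -EINVAL;
    }
    return 0;
}
```

In this model, attaching two devices that point at the same ops table returns 0, while two devices with different ops tables return -EINVAL, which matches the "-22" seen in the dom0 log. Printing both pointers before the comparison is the same idea as the debug print requested above.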

Anyway, letting dom0 use the IOMMU is probably a bad idea: even if dom0 memory is direct mapped, grant mappings are not. So you would end up seeing random crashes.

Right now, if Xen doesn't use the IOMMU (e.g. because it was disabled or there is no driver), then the device will be assigned to dom0. We recently had some discussion about hiding the IOMMU from dom0. I expect a patch to be on the ML in the near future.

As a temporary hack, would you be able to compile out the IOMMU driver from Linux and check whether that helps with using the GPU?

Looking at the documentation in the Linux tree [1], I am under the impression that the Exynos SMMUs are mainly used to avoid allocating large contiguous buffers. So in the longer run, it might be good to understand the performance impact of hiding them from dom0.


I also noted this difference in the logs:

Bare metal:
1999-12-31T20:03:21.723483-05:00 devuan-bunsen kernel: [    0.000000]   Normal   [mem 0x0000000040000000-0x000000006fffffff]

Xen dom0:
1999-12-31T20:01:09.720365-05:00 devuan-bunsen kernel: [    0.000000]   Normal   [mem 0x0000000060000000-0x000000008fffffff]

If I am reading those numbers correctly, normal memory starts
at 1 GB on bare metal, but at 1.5 GB as Xen dom0.

Could that be causing dom0 to try to use IOMMU for
exynos-mixer/exynos-drm?
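
The reading of the two zone ranges can be checked with a little arithmetic. The sketch below is only an illustration: the helper names zone_start_mib and zone_size_mib are made up here, and the addresses are the ones from the two log lines above.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)

/* Start of a zone in MiB, given the address of its first byte. */
static uint64_t zone_start_mib(uint64_t start)
{
    return start / MIB;
}

/* Size of a zone in MiB, given inclusive start and end addresses. */
static uint64_t zone_size_mib(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end - start + 1) / MIB;
}
```

With the logged ranges, zone_start_mib(0x40000000) is 1024 (1 GiB) and zone_start_mib(0x60000000) is 1536 (1.5 GiB), and zone_size_mib gives 768 MiB for both. So the zone has the same size in both cases and only its base address differs.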


Cheers,

[1] https://android.googlesource.com/kernel/msm/+/android-7.1.0_r0.2/Documentation/devicetree/bindings/iommu/samsung%2Csysmmu.txt

--
Julien Grall



 

