Hi Roman,
On 01/09/2021 10:59, Roman Skakun wrote:
>> If you have a setup where Dom0 is not 1:1 mapped (which is not currently
>> possible with upstream Xen but it is possible with cache coloring) and
>> uses the IOMMU to make device DMA work like regular DomUs, then passing
>> XENFEAT_not_direct_mapped to Linux would make it work. Without
>> XENFEAT_not_direct_mapped, Linux would try to use swiotlb-xen which has
>> code that relies on Linux being 1:1 mapped to work properly.
>
> I'm using 1:1 Dom0.
> According to your patch series, xen-swiotlb fops will be applied for all
> devices
> because XENFEAT_direct_mapped is active, as shown here:
>
> https://elixir.bootlin.com/linux/v5.14/source/arch/arm64/mm/dma-mapping.c#L56
>
> I agreed, that xen-swiotlb should work correctly, but in my case, I
> retrieved MFN here:
>
> https://elixir.bootlin.com/linux/v5.14/source/drivers/xen/swiotlb-xen.c#L366
> which is greater than 32bit and xen-swiotlb tries to use bounce buffer
> as expected.
> It can lead to decrease a performance because I have a long buffer ~4MB.
>
> I thought, that we can disable swiotlb fops for devices which are
> controlled by IOMMU.
Yes, you can disable swiotlb for devices which are controlled by the
IOMMU. But this will not make your problem disappear; it simply hides it
until it bites you in a more subtle way.
From what you wrote, you have a 32-bit DMA-capable device, so you always
need an address below 4GB. For foreign mappings, there is no guarantee
that the Guest Physical Address will actually be below 4GB.
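
To make the constraint concrete, here is a minimal user-space sketch of
the kind of range check involved. It is not the actual swiotlb-xen code,
only similar in spirit to what the DMA layer checks; the names
dma_bit_mask() and needs_bounce() are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): mask covering the low `bits` bits. */
static uint64_t dma_bit_mask(unsigned int bits)
{
    return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/*
 * Hypothetical helper: returns true when the device cannot address the
 * whole buffer [addr, addr + size) directly, i.e. when the last byte of
 * the buffer lies above the device's DMA mask and the buffer would have
 * to be bounced through memory the device can reach.
 */
static bool needs_bounce(uint64_t addr, size_t size, uint64_t mask)
{
    assert(size > 0);

    uint64_t last = addr + (uint64_t)(size - 1);

    if (last < addr)        /* address range wrapped around */
        return true;

    return last > mask;
}
```

Under these assumptions, with a 32-bit mask a 4MB buffer at 0x80000000
can be used directly, while the same buffer at 0x100000000 has to be
bounced.
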
Today, the ballooning code only asks Linux to steal *a* RAM page for
mapping the foreign page. This may or may not be below 4GB depending on
how you assigned the RAM to dom0 (IIRC you had some RAM above 4GB).
But that's the current behavior. One of your colleagues is looking at
avoiding stealing RAM pages, so as not to exhaust memory. So foreign
mappings may end up a lot higher in memory.
IOW, you will need to be able to bounce the DMA buffer for your device.
If you want to avoid bouncing, the proper way would be to rework the
ballooning code so pages are allocated below 4GB.
Cheers,
--
Julien Grall