[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] arm64: dma_to_phys/phys_to_dma need to be properly implemented



Hi, Stefano

Unfortunately I had to switch to other tasks,

but I'll get back to this issue asap

Thank you


On 03/30/2017 01:36 AM, Stefano Stabellini wrote:
On Wed, 29 Mar 2017, Oleksandr Andrushchenko wrote:
Hi, Stefano!

Ok, probably I need to put more details into the use-case
so it is clear. What I am doing is a DRM driver which
utilizes PRIME buffer sharing [1] to implement zero-copy
of display buffers between DomU and Dom0. PRIME is based on
DMA Buffer Sharing API [2], so this is the reason I am
dealing with sg_table here.

On 03/28/2017 10:20 PM, Stefano Stabellini wrote:
On Tue, 28 Mar 2017, Oleksandr Andrushchenko wrote:
Hi, Stefano!

On 03/27/2017 11:23 PM, Stefano Stabellini wrote:
Hello Oleksandr,

Just to clarify, you are talking about dma_to_phys/phys_to_dma in Linux
(not in Xen), right?
I am talking about Linux, sorry I was not clear
Drivers shouldn't use those functions directly (see the comment in
arch/arm64/include/asm/dma-mapping.h), they should call the appropriate
dma_map_ops functions instead.
Yes, you are correct and I know about this and do not call
dma_to_phys/phys_to_dma directly
    The dma_map_ops functions should do the
right thing on Xen, even for pages where pfn != mfn, thanks to the
swiotlb-xen (drivers/xen/swiotlb-xen.c). Specifically, you can see the
special case for pfn != mfn here (see the "local" variable):

arch/arm/include/asm/xen/page-coherent.h:xen_dma_map_page
Yes, the scenarios with pfn != mfn we had so
far are all working correct
So, why are you calling dma_to_phys and phys_to_dma instead of the
dma_map_ops functions?
Ok, let me give you an example of failing scenario which
was not used before this by any backend (this is from
my work on implementing zero copy for DRM drivers):

1. Create sg table from pages:
sg_alloc_table_from_pages(sgt, PFN_0, ...);
map it and get dev_bus_addr - at this stage sg table is
perfectly correct and properly mapped,
dev_bus_addr == (MFN_u << PAGE_SHIFT)
Let me get this straight: one of the pages passed to
sg_alloc_table_from_pages is actually a foreign page (pfn != mfn), is
that right?

And by "map", you mean dma_get_sgtable_attrs is called on it, right?
What happening here is:
------------- my driver ------------
1. I create an sg table from pages with pfn != mfn (PFN_0/MFN_u)
using drm_prime_pages_to_sg [3] which effectively is
sg_alloc_table_from_pages
------------- DRM framework ------------
2. I pass the sgt via PRIME to the real display driver
and it does drm_gem_map_dma_buf [4]
3. So, at this point everyting is just fine, because sgt is
correct (sgl->page_link points to my PFN_0 and p2m translation
succeeds)
------------- real HW DRM driver ------------
4. When real HW display driver accesses sgt it calls dma_get_sgtable
[5] and then dma_map_sg [6]. And all this is happening on the sgt
which my driver has provided, but PFN_0 is not honored anymore
because dma_get_sgtable is expected to be able to figure out
pfn from the corresponding DMA address.

So, strictly speaking real HW DRM driver has no issues,
the API it uses is perfectly valid.
2. Duplicate it:
dma_get_sgtable(..., sgt, ... dev_bus_addr,...);
Yeah, if one of the pages passed to sg_alloc_table_from_pages is
foreign, as Andrii pointed out, dma_get_sgtable
(xen_swiotlb_get_sgtable) doesn't actually work.
This is the case
Is it legitimate that one of those pages is foreign or is it a mistake?
This is the goal - I want pages from DomU to be directly
accessed by the HW in Dom0 (I have DomU 1:1 mapped,
so even contiguous buffers can come from DomU, if not
1:1 then IPMMU will be in place)
If it is a mistake, you could fix it.
 From the above - this is the intention
   Otherwise, if the use of
sg_alloc_table_from_pages or the following call to dma_get_sgtable are
part of your code, I suggest you work-around the problem by avoiding
the dma_get_sgtable call altogether.
As seen from the above the problematic part is not in my
driver, it is either DRM framework or HW display driver
   Don't use the sg_ dma api, use the
regular dma api instead.
I use what DRM provides and dma_xxx if something is missed

Unfortunately, if the dma_get_sgtable is part of existing code, then we
have a problem. In that case, could you point me to the code that call
dma_get_sgtable?
This is the case, see [5]
There is no easy way to make it work on Xen: virt_to_phys doesn't work
on ARM and dma_to_phys doesn't work on Xen. We could implement
xen_swiotlb_get_sgtable correctly if we had the physical address of the
page, because we could easily find out if the page is local or foreign
with a pfn != mfn check (similar to the one in
include/xen/arm/page-coherent.h:xen_dma_map_page).
Yes, I saw this code and it helped me to figure out
where the use-case fails
If the page is local, then we would do what we do today. If the page is
foreign, then we would need to do something like:

      int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);

        if (!ret) {
                sg_set_page(sgt->sgl, phys_to_page(phys), PAGE_ALIGN(size),
0);
                sgt->sgl->dma_address = dev_addr;
        }
Agree, the crucial part here phys_to_page(phys): we need mfn->pfn
Now the question is, how do we get the pseudo-physical address of the
page?
yes, this is the root of the problem
We could parse the stage1 page table entry of the kernel, or we
could use the "at" instruction (see for example gva_to_ipa_par in xen).
It is a bit rough, but I cannot think of anything else.
Me neither, this is why I hope community will help here...
Otherwise I'll need to patch kernel drivers if it's still possible.
If you can come up with a patch that only affects
xen_swiotlb_get_sgtable, and translates successfully void *cpu_addr into
a physical address using "at", I think I would take that patch. I would
recommend to test the patch on ARM32 too, where virt_to_phys reliably
fails.

I don't think we can ask the community to drop dma_get_sgtable on the
basis that the DMA api is broken.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.