
Re: privcmd.c not calling set_phys_to_machine



On Mon, 17 Oct 2022, Juergen Gross wrote:
> On 14.10.22 23:04, Stefano Stabellini wrote:
> > Hi Juergen and all,
> > 
> > I am writing again to ask a question about privcmd.c in PV dom0 x86.
> > This is related to the previous pin_user_pages_fast issue:
> > 
> > https://marc.info/?l=xen-devel&m=166268914727630
> > https://marc.info/?l=xen-devel&m=166322385912052
> > 
> > 
> > In summary this is the situation:
> > 
> > 1. domU (HVM) kernel space:
> >      a. pages allocation with get_free_pages()
> >      b. get dma_handle by calling dma_map_page() on the pages
> >         allocated in (1.a)
> >      c. send dma_handle to dom0 (PV) using virtio queue
> > 
> > 2. dom0 userspace (QEMU):
> >          a. read dma_handle from virtio queue
> >          b. map dma_handle using QEMU dma_memory_map(), which calls
> >             xenforeignmemory_map2, which is IOCTL_PRIVCMD_MMAPBATCH_V2,
> >             which ends up calling
> >             drivers/xen/privcmd.c:privcmd_ioctl_mmap_batch
> >             [this is verified to work correctly, the mapping works]
> >          c. open /dev/tee node and make an ioctl call to register the
> >             virtual address (from step 2.b) with TEE.
> > 
> > 3. dom0 kernel space:
> >          a. AMD TEE driver gets the virtual address passed by userspace
> >          b. AMD TEE driver gets the list of pages corresponding to the
> >             virtual address (3.a) and calls dma_map_page() on them
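
(For concreteness, step 1 boils down to something like the sketch below
on the domU side; names and error handling are simplified, it is not the
actual driver code, just the shape of it.)

   #include <linux/gfp.h>
   #include <linux/mm.h>
   #include <linux/dma-mapping.h>

   /* Sketch of step 1 (domU kernel): allocate a page, DMA-map it and
    * hand the resulting dma_handle to the backend over the virtio
    * queue.  The virtio plumbing is elided; "dev" is the virtio device. */
   static int example_share_buffer(struct device *dev)
   {
           unsigned long vaddr = __get_free_pages(GFP_KERNEL, 0);
           dma_addr_t dma_handle;

           if (!vaddr)
                   return -ENOMEM;

           /* 1.b: dma_map_page() on the freshly allocated page */
           dma_handle = dma_map_page(dev, virt_to_page((void *)vaddr), 0,
                                     PAGE_SIZE, DMA_BIDIRECTIONAL);
           if (dma_mapping_error(dev, dma_handle)) {
                   free_pages(vaddr, 0);
                   return -ENOMEM;
           }

           /* 1.c: dma_handle is what goes into the virtio descriptor */
           return 0;
   }
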
> 
> I'm rather sure "AMD TEE driver gets the list of pages corresponding to the
> virtual address" is the problem. The PTEs should have the "special" flag
> set, meaning that there is no struct page associated with this virtual area.
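
Right, and to spell out the failure mode: the usual way a driver turns a
user VA into struct pages is GUP, roughly as in the sketch below
(illustrative only, not the actual AMD TEE code). For a VMA whose PTEs
are marked special / VM_PFNMAP there is no struct page to hand back, so
this path cannot work as-is.

   #include <linux/mm.h>
   #include <linux/slab.h>

   /* Illustrative only: GUP-based lookup of the pages behind a user VA.
    * For a VM_PFNMAP / "special" mapping (as privcmd creates for foreign
    * frames) there is no struct page behind the PTEs, so pinning fails. */
   static int example_pin_user_buffer(unsigned long uaddr, int nr_pages,
                                      struct page ***pagesp)
   {
           struct page **pages;
           int pinned;

           pages = kcalloc(nr_pages, sizeof(*pages), GFP_KERNEL);
           if (!pages)
                   return -ENOMEM;

           pinned = pin_user_pages_fast(uaddr, nr_pages, FOLL_WRITE, pages);
           if (pinned < 0) {
                   kfree(pages);
                   return pinned;  /* -EFAULT for a special mapping */
           }

           *pagesp = pages;
           return pinned;
   }
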
> 
> > The last step (3.b) misbehaves as dev_addr at the beginning of
> > xen_swiotlb_map_page (which implements dma_map_page() in dom0) is 0.
> > 
> >    dma_addr_t dev_addr = xen_phys_to_dma(dev, phys);
> >    /* dev_addr here is zero */
> > 
> > 
> > Could it be that the original mapping of the foreign pages in Dom0, done
> > by step 2.b, is not complete? Looking into
> > privcmd_ioctl_mmap_batch, for PV guests, it is calling mmap_batch_fn:
> > 
> >     BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
> >                                 &pagelist, mmap_batch_fn, &state));
> > 
> > mmap_batch_fn calls xen_remap_domain_gfn_array, which calls
> > xen_remap_pfn.
> > 
> > xen_remap_pfn only changes the VA->PA mapping and does nothing else.
> > Specifically, nobody seems to call set_phys_to_machine in this code
> > path. Isn't set_phys_to_machine required?
> 
> Not for special memory pages.
> 
> > Don't we need a call to set_phys_to_machine so that the next time a
> > driver tries to call:
> > 
> >    /* address is the virtual address passed by QEMU userspace */
> >    dma_map_page(virt_to_page(address))
> > 
> > it will behave correctly? Or am I missing something?
> > 
> > 
> > How is xen_phys_to_dma expected to work correctly for:
> > 
> >    /* address is the virtual address passed by QEMU userspace and mapped
> >     * in 2.b */
> >    phys_addr = virt_to_phys(address);
> >    xen_phys_to_dma(dev, phys_addr);
> > 
> > 
> > My guess would be that we need to add:
> > 
> >    set_phys_to_machine(pfn, FOREIGN_FRAME(mfn));
> > 
> > in mmap_batch_fn or xen_remap_pfn?
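
In other words, the suggestion above expressed as code would be roughly
the following (a sketch of the idea only, not a tested patch; pfn, mfn
and nr stand for the values already at hand in that path):

   #include <asm/xen/page.h>

   /* Sketch: after Xen has installed the foreign MFNs in the page
    * tables, also record them in the dom0 p2m so that later pfn->mfn
    * lookups (and hence xen_phys_to_dma()) see the foreign frame. */
   static int example_record_foreign_frames(unsigned long pfn,
                                            unsigned long mfn,
                                            unsigned int nr)
   {
           unsigned int i;

           for (i = 0; i < nr; i++) {
                   if (!set_phys_to_machine(pfn + i, FOREIGN_FRAME(mfn + i)))
                           return -ENOMEM; /* p2m update failed */
           }
           return 0;
   }
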
> 
> I think this might be a little bit more complicated.
> 
> This could work, if there is really a struct page available for the PFN.
> OTOH this might often not be the case, as we have been using zone device
> memory for foreign mappings by default for some time now.
> 
> Solving this might require something like dma_map_pfn() instead of
> dma_map_page(), which sounds a little bit like dma_direct_mmap().

It is actually dma_mmap_attrs, and looking at its description, it would
have to be step 2.b that calls dma_mmap_attrs instead of
xen_remap_domain_gfn_array? Also, we would need an implementation of
.mmap in xen_swiotlb_dma_ops.
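
For reference, this is the interface I mean, as declared in
include/linux/dma-mapping.h; it ends up in the .mmap hook of the
device's dma_map_ops (or in dma_direct_mmap() for direct-mapped
devices):

   int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
                      void *cpu_addr, dma_addr_t dma_addr, size_t size,
                      unsigned long attrs);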


I think that's fine but I am not clear on how to implement
xen_swiotlb_dma_ops.mmap for PV guests. I can imagine that on HVM/PVH it
would just be similar to xen_xlate_remap_gfn_array. How do you see it
implemented for PV?
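
To make the question concrete, the shape of the hook would be something
like the skeleton below; the body shown is just the generic fallback and
is almost certainly not what the PV case needs, which is exactly the
part I don't know how to fill in:

   #include <linux/dma-map-ops.h>

   /* Skeleton only: the signature a .mmap hook in xen_swiotlb_dma_ops
    * would have. What the body should do for PV foreign mappings is the
    * open question. */
   static int xen_swiotlb_dma_mmap(struct device *dev,
                                   struct vm_area_struct *vma,
                                   void *cpu_addr, dma_addr_t dma_addr,
                                   size_t size, unsigned long attrs)
   {
           /* Generic helper used by several dma_map_ops implementations;
            * shown here only as a placeholder. */
           return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
   }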




 

