[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping
Hello Julien, > On 14 Feb 2021, at 2:32 pm, Julien Grall <julien@xxxxxxx> wrote: > > Hi Rahul, > > On 11/02/2021 16:05, Rahul Singh wrote: >>> On 11 Feb 2021, at 1:52 pm, Julien Grall <julien@xxxxxxx> wrote: >>> >>> >>> >>> On 11/02/2021 13:20, Rahul Singh wrote: >>>> Hello Julien, >>> >>> Hi Rahul, >>> >>>>> On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xxxxxxx> wrote: >>>>> >>>>> >>>>> >>>>> On 10/02/2021 18:08, Rahul Singh wrote: >>>>>> Hello Julien, >>>>>>> On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xxxxxxx> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> On 10/02/2021 15:06, Rahul Singh wrote: >>>>>>>>> On 9 Feb 2021, at 8:36 pm, Stefano Stabellini >>>>>>>>> <sstabellini@xxxxxxxxxx> wrote: >>>>>>>>> >>>>>>>>> On Tue, 9 Feb 2021, Rahul Singh wrote: >>>>>>>>>>> On 8 Feb 2021, at 6:49 pm, Stefano Stabellini >>>>>>>>>>> <sstabellini@xxxxxxxxxx> wrote: >>>>>>>>>>> >>>>>>>>>>> Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM. >>>>>>>>>>> The offending chunk is: >>>>>>>>>>> >>>>>>>>>>> #define gnttab_need_iommu_mapping(d) \ >>>>>>>>>>> - (is_domain_direct_mapped(d) && need_iommu(d)) >>>>>>>>>>> + (is_domain_direct_mapped(d) && need_iommu_pt_sync(d)) >>>>>>>>>>> >>>>>>>>>>> On ARM we need gnttab_need_iommu_mapping to be true for dom0 when >>>>>>>>>>> it is >>>>>>>>>>> directly mapped and IOMMU is enabled for the domain, like the old >>>>>>>>>>> check >>>>>>>>>>> did, but the new check is always false. >>>>>>>>>>> >>>>>>>>>>> In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync >>>>>>>>>>> and >>>>>>>>>>> need_sync is set as: >>>>>>>>>>> >>>>>>>>>>> if ( !is_hardware_domain(d) || iommu_hwdom_strict ) >>>>>>>>>>> hd->need_sync = !iommu_use_hap_pt(d); >>>>>>>>>>> >>>>>>>>>>> iommu_use_hap_pt(d) means that the page-table used by the IOMMU is >>>>>>>>>>> the >>>>>>>>>>> P2M. It is true on ARM. need_sync means that you have a separate >>>>>>>>>>> IOMMU >>>>>>>>>>> page-table and it needs to be updated for every change. need_sync >>>>>>>>>>> is set >>>>>>>>>>> to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too, >>>>>>>>>>> which is wrong. >>>>>>>>>>> >>>>>>>>>>> As a consequence, when using PV network from a domU on a system >>>>>>>>>>> where >>>>>>>>>>> IOMMU is on from Dom0, I get: >>>>>>>>>>> >>>>>>>>>>> (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, >>>>>>>>>>> iova=0x8424cb148, fsynr=0xb0001, cb=0 >>>>>>>>>>> [ 68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP >>>>>>>>>>> not OK >>>>>>>>>>> >>>>>>>>>>> The fix is to go back to something along the lines of the old >>>>>>>>>>> implementation of gnttab_need_iommu_mapping. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx> >>>>>>>>>>> Fixes: 91d4eca7add >>>>>>>>>>> Backport: 4.12+ >>>>>>>>>>> >>>>>>>>>>> --- >>>>>>>>>>> >>>>>>>>>>> Given the severity of the bug, I would like to request this patch >>>>>>>>>>> to be >>>>>>>>>>> backported to 4.12 too, even if 4.12 is security-fixes only since >>>>>>>>>>> Oct >>>>>>>>>>> 2020. >>>>>>>>>>> >>>>>>>>>>> For the 4.12 backport, we can use iommu_enabled() instead of >>>>>>>>>>> is_iommu_enabled() in the implementation of >>>>>>>>>>> gnttab_need_iommu_mapping. >>>>>>>>>>> >>>>>>>>>>> Changes in v2: >>>>>>>>>>> - improve commit message >>>>>>>>>>> - add is_iommu_enabled(d) to the check >>>>>>>>>>> --- >>>>>>>>>>> xen/include/asm-arm/grant_table.h | 2 +- >>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/xen/include/asm-arm/grant_table.h >>>>>>>>>>> b/xen/include/asm-arm/grant_table.h >>>>>>>>>>> index 6f585b1538..0ce77f9a1c 100644 >>>>>>>>>>> --- a/xen/include/asm-arm/grant_table.h >>>>>>>>>>> +++ b/xen/include/asm-arm/grant_table.h >>>>>>>>>>> @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned long >>>>>>>>>>> gpaddr, mfn_t mfn, >>>>>>>>>>> (((i) >= nr_status_frames(t)) ? INVALID_GFN : >>>>>>>>>>> (t)->arch.status_gfn[i]) >>>>>>>>>>> >>>>>>>>>>> #define gnttab_need_iommu_mapping(d) \ >>>>>>>>>>> - (is_domain_direct_mapped(d) && need_iommu_pt_sync(d)) >>>>>>>>>>> + (is_domain_direct_mapped(d) && is_iommu_enabled(d)) >>>>>>>>>>> >>>>>>>>>>> #endif /* __ASM_GRANT_TABLE_H__ */ >>>>>>>>>> >>>>>>>>>> I tested the patch and while creating the guest I observed the below >>>>>>>>>> warning from Linux for block device. >>>>>>>>>> https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258 >>>>>>>>> >>>>>>>>> So you are creating a guest with "xl create" in dom0 and you see the >>>>>>>>> warnings below printed by the Dom0 kernel? I imagine the domU has a >>>>>>>>> virtual "disk" of some sort. >>>>>>>> Yes you are right I am trying to create the guest with "xl create” and >>>>>>>> before that, I created the logical volume and trying to attach the >>>>>>>> logical volume >>>>>>>> block to the domain with “xl block-attach”. I observed this error with >>>>>>>> the "xl block-attach” command. >>>>>>>> This issue occurs after applying this patch as what I observed this >>>>>>>> patch introduce the calls to iommu_legacy_{, un}map() to map the grant >>>>>>>> pages for >>>>>>>> IOMMU that touches the page-tables. I am not sure but what I observed >>>>>>>> is that something is written wrong when iomm_unmap calls unmap the >>>>>>>> pages >>>>>>>> because of that issue is observed. >>>>>>> >>>>>>> Can you clarify what you mean by "written wrong"? What sort of error do >>>>>>> you see in the iommu_unmap()? >>>>>> I might be wrong as per my understanding for ARM we are sharing the P2M >>>>>> between CPU and IOMMU always and the map_grant_ref() function is written >>>>>> in such a way that we have to call iommu_legacy_{, un}map() only if P2M >>>>>> is not shared. >>>>> >>>>> map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping() >>>>> returns true. I don't really see where this is assuming the P2M is not >>>>> shared. >>>>> >>>>> In fact, on x86, this will always be false for HVM domain (they support >>>>> both shared and separate page-tables). >>>>> >>>>>> As we are sharing the P2M when we call the iommu_map() function it will >>>>>> overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN) entry >>>>>> and when we call iommu_unmap() it will unmap the (GFN -> MFN ) entry >>>>>> from the page-table. >>>>> AFAIK, there should be nothing mapped at that GFN because the page >>>>> belongs to the guest. At worse, we would overwrite a mapping that is the >>>>> same. >>>>> Sorry I should have mention before backend/frontend is dom0 in this >>> case and GFN is mapped. I am trying to attach the block device to DOM0 >>> >>> Ah, your log makes a lot more sense now. Thank you for the clarification! >>> >>> So yes, I agree that iommu_{,un}map() will do the wrong thing if the >>> frontend and backend in the same domain. >>> >>> I don't know what the state in Linux, but from Xen PoV it should be >>> possible to have the backend/frontend in the same domain. >>> >>> I think we want to ignore the IOMMU mapping request when the domain is the >>> same. Can you try this small untested patch: >> I tested the patch and it is working fine for both dom0/domU. I am able to >> attach the block device to dom0/domu. >> Also I didn’t observe the IOMMU fault also for block device that we have >> behind IOMMU on our system and attached to domU. > > Thanks for the testing. I noticed that my patch doesn't build because > arm_iommu_unmap_page() doesn't have a parameter mfn. Can you confirm whether > you had to replace mfn with _mfn(dfn_x(dfn))? Yes I have to replace the mfn with _mfn(dfn_x(dfn)) to test the patch. Regards, Rahul > > Cheers, > > -- > Julien Grall >
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |