
Re: smmuv1 breakage



On Wed, 23 Jun 2021, Rahul Singh wrote:
> Hi Stefano,
> 
> > On 23 Jun 2021, at 9:09 am, Rahul Singh <Rahul.Singh@xxxxxxx> wrote:
> >
> > Hi Stefano,
> >
> >> On 22 Jun 2021, at 10:06 pm, Stefano Stabellini <sstabellini@xxxxxxxxxx> 
> >> wrote:
> >>
> >> Hi Rahul,
> >>
> >> Do you have an opinion on how we should move forward on this?
> >>
> >> Do you think it is OK to go for a full revert of "xen/arm: smmuv1:
> >> Intelligent SMR allocation" or do you think it is best to go with an
> >> alternative fix? If so, do you have something in mind?
> >>
> >
> > Sorry for the late reply; I was working on another high-priority task.
> > I will work on this and try to fix the issue. I will update you within 2-3
> > days.
> 
> I checked my patches again and found that, when allocating SMRs, I mistakenly
> allocated a single SMR for each SMMU device. Instead, the number of SMRs
> allocated must be based on the number of stream matching registers that each
> SMMU device supports.
> 
> This might be causing the issue. I don't have any Xilinx hardware, and the
> issue is not reproducible on QEMU/Juno. Can you please test the attached patch
> and let me know if it works?

Yes, this solves the issue for me, thank you!!


Acked-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Tested-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
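
The allocation mistake described above can be illustrated with a minimal,
standalone sketch. This is not the attached patch and not the Xen driver
code: the names smr_entry, smmu_state, smmu_alloc_smrs and smmu_find_free_smr
are hypothetical, and plain calloc() stands in for whatever allocator the
driver uses. It only shows the idea of sizing the SMR table by the number of
stream matching registers the SMMU reports, rather than allocating one entry.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical software shadow of one Stream Match Register (SMR). */
struct smr_entry {
    uint16_t id;
    uint16_t mask;
    bool valid;
};

/* Hypothetical per-SMMU state; not the layout used by the Xen driver. */
struct smmu_state {
    unsigned int num_smrs;   /* SMRs supported by this SMMU instance */
    struct smr_entry *smrs;  /* table with num_smrs entries */
};

/*
 * Allocate one zeroed table entry per supported SMR.
 * Allocating a single entry here (the mistake described in the thread)
 * would leave every index above 0 pointing outside the allocation.
 * Returns 0 on success, -1 on failure.
 */
static int smmu_alloc_smrs(struct smmu_state *smmu, unsigned int num_smrs)
{
    if (num_smrs == 0)
        return -1;

    smmu->smrs = calloc(num_smrs, sizeof(*smmu->smrs));
    if (smmu->smrs == NULL)
        return -1;

    smmu->num_smrs = num_smrs;
    return 0;
}

/* Return the index of the first unused SMR, or -1 if all are in use. */
static int smmu_find_free_smr(const struct smmu_state *smmu)
{
    for (unsigned int i = 0; i < smmu->num_smrs; i++)
        if (!smmu->smrs[i].valid)
            return (int)i;

    return -1;
}
```

In a real driver the count would come from the hardware ID registers; here
it is simply passed in by the caller.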


> >
> > Regards,
> > Rahul
> >
> >>
> >>
> >> On Tue, 15 Jun 2021, Stefano Stabellini wrote:
> >>> On Tue, 15 Jun 2021, Rahul Singh wrote:
> >>>> Hi Stefano
> >>>>
> >>>>> On 15 Jun 2021, at 3:21 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> 
> >>>>> wrote:
> >>>>>
> >>>>> Hi Rahul,
> >>>>>
> >>>>> Unfortunately, after bisecting, I discovered a few more breakages due to
> >>>>> your smmuv1 series (commits e889809b .. 3e6047ddf) on Xilinx ZynqMP. I
> >>>>> attached the DTB as reference. Please note that I made sure to
> >>>>> cherry-pick "xen/arm: smmuv1: Revert associating the group pointer with
> >>>>> the S2CR" during bisection. So the errors are present also on staging.
> >>>>>
> >>>>> The first breakage is an error at boot time in smmu.c#find_smmu_master,
> >>>>> see log1. I think it is due to the lack of ability to parse the new smmu
> >>>>> bindings in the old smmu driver.
> >>>>>
> >>>>> After removing all the "smmus" and "#stream-id-cells" properties in
> >>>>> device tree, I get past the previous error, everything seems to be OK at
> >>>>> early boot, but I actually get SMMU errors as soon as dom0 starts
> >>>>> using devices:
> >>>>>
> >>>>> (XEN) smmu: /smmu@fd800000: Unexpected global fault, this could be 
> >>>>> serious
> >>>>> (XEN) smmu: /smmu@fd800000:     GFSR 0x80000002, GFSYNR0 0x00000000, 
> >>>>> GFSYNR1 0x00000877, GFSYNR2 0x00000000
> >>>>
> >>>> This fault is an "Unidentified stream fault" for StreamID "0x877", which
> >>>> means that no SMMU SMR is configured for StreamID "0x877".
> >>>>
> >>>>
> >>>>> [   10.419681] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> >>>>> [   10.426452] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> >>>>>
> >>>>> Do you think you'll be able to help fix them?
> >>>>>
> >>>>>
> >>>>> You should be able to reproduce the two issues using Xilinx QEMU (but to
> >>>>> be honest I haven't tested it on QEMU yet, I was testing on real
> >>>>> hardware):
> >>>>> - clone and compile xilinx QEMU https://github.com/Xilinx/qemu.git
> >>>>> ./configure  --target-list=aarch64-softmmu
> >>>>> make
> >>>>> - clone and build git://github.com/Xilinx/qemu-devicetrees.git
> >>>>> - use the attached script to run it
> >>>>>  - kernel can be upstream defconfig 5.10
> >>>>>
> >>>>
> >>>> I tried to reproduce the issue on Xilinx QEMU as per the steps shared
> >>>> above, but I am not observing any issue on Xilinx QEMU.
> >>>
> >>> I tried on QEMU and it doesn't repro. I cannot explain why it works on
> >>> QEMU and it fails on real hardware.
> >>>
> >>>
> >>>> I also tested and confirmed on QEMU that the SMMU is configured correctly
> >>>> for StreamID "0x877" specifically and for other StreamIDs.
> >>>>
> >>>> I checked the xen.dtb you shared and found that there is no
> >>>> "stream-id-cells" property in the master device, but the "mmu-masters"
> >>>> property is present in the smmu node. The legacy smmu binding needs both
> >>>> "stream-id-cells" and "mmu-masters". If you need to use the new smmu
> >>>> binding, please add the "iommu-cells" property to the smmu node and the
> >>>> "iommus" property to the master device.
> >>>
> >>> In regards to the missing "stream-id-cells" property, I shared the wrong
> >>> dtb before, sorry. I was running a number of tests and I might have
> >>> picked the wrong file. The proper dtb comes with "stream-id-cells" for
> >>> the 0x877 device, see attached.
> >>>
> >>>
> >>>
> >>>> Can you please share the xen boot logs with me so that I can debug 
> >>>> further why the error is observed?
> >>>
> >>> See attached. I did some debugging and discovered that it crashes while
> >>> accessing master->of_node in find_smmu_master. If I revert your series,
> >>> the crash goes away. It is very strange because your patches don't touch
> >>> find_smmu_master or insert_smmu_master directly.
> >>>
> >>> I did a git reset --hard on the commit "xen/arm: smmuv1: Add a stream
> >>> map entry iterator" and it worked, which points to "xen/arm: smmuv1:
> >>> Intelligent SMR allocation" being the problem, even if I have the revert
> >>> cherry-picked on top. Maybe the revert is not reverting enough?
> >>>
> >>> After this test, I switched back to staging and did:
> >>> git revert 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a
> >>> git revert 0435784cc75dcfef3b5f59c29deb1dbb84265ddb
> >>>
> >>> And it worked! So the issue truly is that
> >>> 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a doesn't revert "enough".
> >>> See "full-revert" for the patch reverting the remaining code. That on
> >>> top of staging fixes boot for me.
> >
> 
> 
> 
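
For readers following the fault log quoted earlier in the thread (GFSR
0x80000002, GFSYNR1 0x00000877), the decoding given in the reply can be
sketched in a few lines of C. The bit positions below are assumptions based
on my reading of the SMMUv1/v2 global fault register layout (USF in bit 1,
MULTI in bit 31, StreamID in the low 16 bits of GFSYNR1); please check them
against the architecture specification. The macro and function names are
hypothetical and are not taken from the Xen driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed GFSR bit positions; verify against the SMMU architecture spec. */
#define GFSR_USF    (UINT32_C(1) << 1)   /* Unidentified stream fault */
#define GFSR_MULTI  (UINT32_C(1) << 31)  /* More than one fault recorded */

/* Assumption: the faulting StreamID sits in the low 16 bits of GFSYNR1. */
#define GFSYNR1_STREAMID_MASK  UINT32_C(0xffff)

/* True if GFSR reports a transaction that matched no stream mapping. */
static bool gfsr_is_unidentified_stream_fault(uint32_t gfsr)
{
    return (gfsr & GFSR_USF) != 0;
}

/* True if GFSR reports that multiple faults occurred. */
static bool gfsr_has_multiple_faults(uint32_t gfsr)
{
    return (gfsr & GFSR_MULTI) != 0;
}

/* Extract the StreamID of the faulting transaction from GFSYNR1. */
static uint16_t gfsynr1_stream_id(uint32_t gfsynr1)
{
    return (uint16_t)(gfsynr1 & GFSYNR1_STREAMID_MASK);
}
```

Under these assumptions, the logged GFSR value has both the USF and MULTI
bits set, and the logged GFSYNR1 value yields StreamID 0x877, which matches
the interpretation given in the thread.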

 

