[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] IOMMU: make domctl handler tolerate NULL domain
On 20/04/2022 06:52, Jan Beulich wrote: > On 19.04.2022 18:06, Andrew Cooper wrote: >> On 19/04/2022 16:52, Juergen Gross wrote: >>> On 19.04.22 17:48, Andrew Cooper wrote: >>>> On 19/04/2022 10:39, Jan Beulich wrote: >>>>> Besides the reporter's issue of hitting a NULL deref when >>>>> !CONFIG_GDBSX, >>>>> XEN_DOMCTL_test_assign_device can legitimately end up having NULL >>>>> passed >>>>> here, when the domctl was passed DOMID_INVALID. >>>>> >>>>> Fixes: 71e617a6b8f6 ("use is_iommu_enabled() where appropriate...") >>>>> Reported-by: Cheyenne Wills <cheyenne.wills@xxxxxxxxx> >>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >>>>> >>>>> --- a/xen/drivers/passthrough/iommu.c >>>>> +++ b/xen/drivers/passthrough/iommu.c >>>>> @@ -558,7 +558,7 @@ int iommu_do_domctl( >>>>> { >>>>> int ret = -ENODEV; >>>>> - if ( !is_iommu_enabled(d) ) >>>>> + if ( !(d ? is_iommu_enabled(d) : iommu_enabled) ) >>>>> return -EOPNOTSUPP; >>>> Having spent the better part of a day debugging this mess, this patch is >>>> plain broken. >>>> >>>> It depends on Juergen's "xen/iommu: cleanup iommu related domctl >>>> handling" patch, because otherwise it erroneously fails non-IOMMU >>>> subops. >>> Which is not a real problem, as it is being called after all other >>> subops didn't match. >> It is a real problem even now, because it is bogus for the host or >> domain's IOMMU status to have any alteration to the >> XEN_DOMCTL_gdbsx_guestmemio path. The root cause of this bug is the >> existing XEN_DOMCTL_gdbsx_guestmemio case being compiled out in the >> !GDBSX case. > I find your wording ("plain broken" in particular) irritating, to put > it mildly. The change in behavior is that -EOPNOTSUPP may now be > returned for the gdbsx operation instead of -ENOSYS. It's not just gdbsx operations - it's every domctl subop whose case statement happens to get conditionally compiled out: XEN_DOMCTL_set_access_required XEN_DOMCTL_debug_op XEN_DOMCTL_mem_sharing_op XEN_DOMCTL_audit_p2m and every future domctl. I didn't trying reasoning about the differing populations of each arches arch_do_domctl(). > And that's when > it would better have been -EOPNOTSUPP in the first place. This irrelevant unless you have a time machine, or you can prove that the change doesn't break things. For the record, I didn't know about Juergen's discovery of 2 ENOSYS vs EOPNOTSUPP breakages in xen.git alone when writing the email. The mass, and spurious, change to almost 2^32 subops was enough to give pause for thought. >> It would be a more obvious problem if there were calls chained after >> iommu_do_domctl() in the arch_domctl() default: blocks, because then it >> wouldn't only be compiled-out functionality which hit this check. > But that's not the case. There is timebomb which just exploded on a user, and you've provided a patch claiming to defuse it, when all you have done is swap out one trigger for another. Specifically, you've replaced a latent bug (nothing actually calls test_assign_device with DOMID_INVALID) with a real error metastability for logic that really does care about ENOSYS vs EOPNOTSUPP. Yes - we should decide whether it ought be legal to call test_assign_device with DOMID_INVALID or not, and then write the reasoning down in the same patch which adjusts do_domctl() and/or iommu_do_domctl() to have consistent behaviour. But until iommu_do_domctl() is filtered to not operate on unrelated subops, making this change breaks more things than it fixes. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |