[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.18 v2] iommu/vt-d: fix SAGAW capability parsing
On Thu, Oct 19, 2023 at 10:49:29AM +0200, Jan Beulich wrote: > On 19.10.2023 10:15, Roger Pau Monné wrote: > > On Thu, Oct 19, 2023 at 09:41:41AM +0200, Jan Beulich wrote: > >> On 18.10.2023 18:07, Roger Pau Monne wrote: > >>> SAGAW is a bitmap field, with bits 1, 2 and 3 signaling support for 3, 4 > >>> and 5 > >>> level page tables respectively. According to the Intel VT-d > >>> specification, an > >>> IOMMU can report multiple SAGAW bits being set. > >>> > >>> Commit 859d11b27912 claims to replace the open-coded > >>> find_first_set_bit(), but > >>> it's actually replacing an open coded implementation to find the last set > >>> bit. > >>> The change forces the used AGAW to the lowest supported by the IOMMU > >>> instead of > >>> the highest one between 1 and 2. > >>> > >>> Restore the previous SAGAW parsing by using fls() instead of > >>> find_first_set_bit(), in order to get the highest (supported) AGAW to be > >>> used. > >>> > >>> However there's a caveat related to the value the AW context entry field > >>> must > >>> be set to when using passthrough mode: > >>> > >>> "When the Translation-type (TT) field indicates pass-through processing > >>> (10b), > >>> this field must be programmed to indicate the largest AGAW value > >>> supported by > >>> hardware." [0] > >>> > >>> Newer Intel IOMMU implementations support 5 level page tables for the > >>> IOMMU, > >>> and signal such support in SAGAW bit 3. > >>> > >>> Enabling 5 level paging support (AGAW 3) at this point in the release is > >>> too > >>> risky, so instead put a bodge to unconditionally disable passthough mode > >>> if > >>> SAGAW has any bits greater than 2 set. Ignore bit 0, it's reserved in the > >>> specification but unlikely to have any meaning in the future. > >> > >> May be worth mentioning that in earlier versions this indicated 2-level > >> paging support. > > > > Oh, that's not even present in my copy of the spec from 2016. I guess > > it was removed very, very long time ago? > > Indeed, as mentioned in the commit you're fixing. Version 1.3 of the > spec still has it. Judging by the document numbers 2.2 may have been > its direct successor (i.e. no further 1.x and no 2.0 or 2.1). > > >>> --- a/xen/drivers/passthrough/vtd/iommu.c > >>> +++ b/xen/drivers/passthrough/vtd/iommu.c > >>> @@ -1327,15 +1327,24 @@ int __init iommu_alloc(struct acpi_drhd_unit > >>> *drhd) > >>> > >>> /* Calculate number of pagetable levels: 3 or 4. */ > >>> sagaw = cap_sagaw(iommu->cap); > >>> - if ( sagaw & 6 ) > >>> - agaw = find_first_set_bit(sagaw & 6); > >>> - if ( !agaw ) > >>> + agaw = fls(sagaw & 6) - 1; > >>> + if ( agaw == -1 ) > >> > >> Would you mind making this "< 0" or even "<= 0"? The latter in particular > >> would already cover the likely future change of dropping the masking by 6. > > > > My plan wasn't to drop the masking, but use 0xe if we support AGAW 3. > > But we will also need to deal with bit 4 (at which point applying a mask > is going to be useless code). We can either guess that it's going to mean > 6-level paging, or we need to treat it as having unknown meaning when set > (implying that we'd then still need to either fail initialization or > disable pass-through mode). I wouldn't enable support for AGAW 4 unless we have a way to test it. It's safer to just disable passthrough mode if SAGAW bit 4 is set. > > I'm fine with using < or <= if you think it's more robust. > > Good, will do so then. > > >>> { > >>> printk(XENLOG_ERR VTDPREFIX "IOMMU: unsupported sagaw %x\n", > >>> sagaw); > >>> print_iommu_regs(drhd); > >>> rc = -ENODEV; > >>> goto free; > >>> } > >>> + if ( sagaw >> 3 ) > >>> + { > >>> + printk_once(XENLOG_WARNING VTDPREFIX > >>> + "IOMMU: unhandled bits set in sagaw (%#x)%s\n", > >> > >> I think IOMMU: is redundant with VTDPREFIX (or alternatively iommu->index > >> would also want logging). Also note that VTDPREFIX (bogusly) has no > >> trailing space. (I realize both apply to the other log message in context > >> as well, but still. I'd be inclined to adjust that at the same time, > >> including switching to %#x as you have it in the new log message.) > > > > Oh, I didn't realize VTDPREFIX had no trailing space. > > > > Since it's a printk_once(), not sure iommu->index is really useful > > here, as we would report just one IOMMU has having an unhandled SAGAW. > > IMO if we switch to printing iommu->index we must also use a plain > > printk. But I don't see a lot of benefit in printing this for likely > > each IOMMU on the system, and hence I would rather use printk_once() > > and not print the index. > > Well, logging the index in printk_once() has the benefit of identifying > the first IOMMU with the issue, which may help further analysis if not > all of them have bits beyond 2 set. But I'm not going to insist on this > aspect. > > > Feel free to drop the IOMMU prefix, but I'm not sure what to do with > > VTDPREFIX and the missing trialing space, as some users of VTDPREFIX > > already account for such missing space. > > I'd simply insert a leading space in the string literal. Ack. > >>> + sagaw, > >>> + iommu_hwdom_passthrough ? " disabling passthrough" : > >>> "" ); > >> > >> May want a leading comma (or some other separator) in the string. > >> > >>> + if ( iommu_hwdom_passthrough ) > >>> + iommu_hwdom_passthrough = false; > >> > >> No real need for if() here. > > > > Not really, but also no need for a write to iommu_hwdom_passthrough > > every time an IOMMU is initialized if the condition is removed. > > This is init-time code, and hence the excess writes aren't going to be > noticable. I don't have a strong opinion, TBH I used to have it without the conditional and added it later. If you prefer to drop the conditional I won't oppose. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |