[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/2] VT-d: avoid NULL deref on domain_context_mapping_one() error paths
On 06.04.2022 17:01, Roger Pau Monné wrote: > On Wed, Apr 06, 2022 at 04:07:24PM +0200, Jan Beulich wrote: >> On 06.04.2022 15:36, Roger Pau Monné wrote: >>> On Wed, Apr 06, 2022 at 02:24:32PM +0200, Jan Beulich wrote: >>>> First there's a printk() which actually wrongly uses pdev in the first >>>> place: We want to log the coordinates of the (perhaps fake) device >>>> acted upon, which may not be pdev. >>>> >>>> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per- >>>> device quarantine page tables (part I)") to add a domid_t parameter to >>>> domain_context_unmap_one(): It's only used to pass back here via >>>> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again. >>>> >>>> Finally there's the invocation of domain_context_mapping_one(), which >>>> needs to be passed the correct domain ID. Avoid taking that path when >>>> pdev is NULL and the quarantine state is what would need restoring to. >>>> This means we can't security-support PCI devices with RMRRs (if such >>>> exist in practice) any longer. >>>> >>>> Fixes: 8f41e481b485 ("VT-d: re-assign devices directly") >>>> Fixes: 14dd241aad8a ("IOMMU/x86: use per-device page tables for >>>> quarantining") >>>> Coverity ID: 1503784 >>>> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> >>>> >>>> --- a/SUPPORT.md >>>> +++ b/SUPPORT.md >>>> @@ -750,6 +750,10 @@ However, this feature can still confer s >>>> when used to remove drivers and backends from domain 0 >>>> (i.e., Driver Domains). >>>> >>>> +On VT-d (Intel hardware) passing through plain PCI (or PCI-X) devices >>>> +when they have associated Reserved Memory Regions (RMRRs) >>>> +is not security supported, if such a combination exists in the first >>>> place. >>> >>> Hm, I think this could be confusing from a user PoV. It's my >>> understanding you want to differentiate between DEV_TYPE_PCIe_ENDPOINT >>> and DEV_TYPE_PCI device types, so maybe we could use: >>> >>> "On VT-d (Intel hardware) passing through non PCI Express devices with >>> associated Reserved Memory Regions (RMRRs) is not supported." >>> >>> AFAICT domain_context_mapping will already prevent this from happening >>> by returning -EOPNOTSUPP (see the DEV_TYPE_PCI case handling). >> >> Hmm. I did look at that code while writing the patch, but I didn't >> draw the same conclusion. I'd like to tie the security support >> statement to what could technically be made work. IOW I don't like >> to say "not supported"; I'd like to stick to "not security >> supported", which won't change even if that -EOPNOTSUPP path would >> be replaced by a proper implementation. > > My preference for using 'not supported' was simply so that users don't > need to worry whether their use-case fails in this category: Xen will > simply reject the operation in the first place. > > Otherwise users might wonder whether some of the devices they are > passing through are security supported or not (lacking proper > information about how to check whether a device is PCI, PCI-X or PCIe > and whether it has associated RMRR regions). > > I understand your worry here, but I do think we should aim to make > this document as easy to parse as possible for users, and hence I > wonder whether your proposed addition will cause unneeded confusion > because that use-case is not allowed by the hypervisor in the first > place. I guess I'll simply drop the SUPPORT.md addition then. It would probably have been a better fit right in one of the XSA-400 patches anyway. >> Even adding a sentence to >> say this doesn't even work at present would seem to me to go too >> far: Such a sentence would easily be forgotten if the situation >> changed. But I'd be willing to add such an auxiliary statement as >> a compromise. >> >> As to "plain PCI (or PCI-X)" vs "non PCI Express" - while I prefer >> to avoid a negation there, I'd be okay to switch if that's deemed >> better for potential readers. > > Maybe it would be best to simply expand the comment before the RMRR > check in domain_context_mapping() to note that removing the check will > have security implications? Hmm, with the changes I'm doing I don't think I make matters worse, so this wouldn't seem to me to belong here. >>>> @@ -1601,9 +1601,13 @@ int domain_context_mapping_one( >>>> >>>> if ( rc ) >>>> { >>>> - if ( !prev_dom ) >>>> - ret = domain_context_unmap_one(domain, iommu, bus, devfn, >>>> - DEVICE_DOMID(domain, pdev)); >>>> + if ( !prev_dom || >>>> + /* >>>> + * Unmapping here means PCI devices with RMRRs (if such >>>> exist) >>>> + * will cause problems if such a region was actually >>>> accessed. >>>> + */ >>>> + (prev_dom == dom_io && !pdev) ) >>> >>> Maybe I'm reading this wrong, but plain PCI devices with RMRRs are >>> only allowed to be assigned to the hardware domain, and won't be able >>> to be reassigned afterwards. It would be fine to unmap >>> unconditionally if !prev_dom or !pdev? As calls with !pdev only >>> happening for phantom functions or bridge devices. >> >> Like with the support statement, I'd prefer this code to be independent >> of the (perhaps temporary) decision to not allow certain assignments. > > I was just saying because it would make the code easier IMO, but maybe > it doesn't matter that much. > > The comment however should also be adjusted to mention that refers to > legacy DEV_TYPE_PCI type devices ('PCI devices with RMRRs' is too > unspecific IMO). I'm happy to use DEV_TYPE_PCI in the comment. >> Since you mention phantom functions: Aiui their mapping operations will >> be done with a non-NULL pdev, unless of course they're phantom functions >> associated with a non-PCIe device (in which case the same secondary >> mappings with a NULL pdev would occur - imo pointlessly, as it would >> be the same bridge and the same secondary bus as for the actual device; >> I'm under the impression that error handling may not work properly when >> such redundant mappings occur). > > The redundant mapping of the bridges would be fine as prev_dom == > domain in that case, and cannot fail? Hmm, yes, good point. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |