Re: Segment truncation in multi-segment PCI handling?
On Mon, Jun 10, 2024 at 12:11:58PM +0200, Jan Beulich wrote:
> On 10.06.2024 11:46, Roger Pau Monné wrote:
> > On Mon, Jun 10, 2024 at 10:41:19AM +0200, Jan Beulich wrote:
> >> On 10.06.2024 10:28, Roger Pau Monné wrote:
> >>> On Mon, Jun 10, 2024 at 09:58:11AM +0200, Jan Beulich wrote:
> >>>> On 07.06.2024 21:52, Andrew Cooper wrote:
> >>>>> On 07/06/2024 8:46 pm, Marek Marczykowski-Górecki wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I've got a new system, and it has two PCI segments:
> >>>>>>
> >>>>>> 0000:00:00.0 Host bridge: Intel Corporation Device 7d14 (rev 04)
> >>>>>> 0000:00:02.0 VGA compatible controller: Intel Corporation Meteor
> >>>>>> Lake-P [Intel Graphics] (rev 08)
> >>>>>> ...
> >>>>>> 10000:e0:06.0 System peripheral: Intel Corporation RST VMD Managed
> >>>>>> Controller
> >>>>>> 10000:e0:06.2 PCI bridge: Intel Corporation Device 7ecb (rev 10)
> >>>>>> 10000:e1:00.0 Non-Volatile memory controller: Phison Electronics
> >>>>>> Corporation PS5021-E21 PCIe4 NVMe Controller (DRAM-less) (rev 01)
> >>>>>>
> >>>>>> But looks like Xen doesn't handle it correctly:
> >>>
> >>> In the meantime you can probably disable VMD from the firmware and the
> >>> NVMe devices should appear on the regular PCI bus.
> >>>
> >>>>>> (XEN) 0000:e0:06.0: unknown type 0
> >>>>>> (XEN) 0000:e0:06.2: unknown type 0
> >>>>>> (XEN) 0000:e1:00.0: unknown type 0
> >>>>>> ...
> >>>>>> (XEN) ==== PCI devices ====
> >>>>>> (XEN) ==== segment 0000 ====
> >>>>>> (XEN) 0000:e1:00.0 - NULL - node -1
> >>>>>> (XEN) 0000:e0:06.2 - NULL - node -1
> >>>>>> (XEN) 0000:e0:06.0 - NULL - node -1
> >>>>>> (XEN) 0000:2b:00.0 - d0 - node -1 - MSIs < 161 >
> >>>>>> (XEN) 0000:00:1f.6 - d0 - node -1 - MSIs < 148 >
> >>>>>> ...
> >>>>>>
> >>>>>> This isn't exactly surprising, since pci_sbdf_t.seg is uint16_t, so
> >>>>>> 0x10000 doesn't fit. OSDev wiki says PCI Express can have 65536 PCI
> >>>>>> Segment Groups, each with 256 bus segments.
> >>>>>>
> >>>>>> Fortunately, I don't need this to work, if I disable VMD in the
> >>>>>> firmware, I get a single segment and everything works fine.
> >>>>>>
> >>>>> This is a known issue. Work is being done, albeit slowly.
> >>>>
> >>>> Is work being done? After the design session in Prague I put it on my
> >>>> todo list, but at low priority. I'd be happy to take it off there if I
> >>>> knew someone else is looking into this.
> >>>
> >>> We had a design session about VMD? If so I'm afraid I've missed it.
> >>
> >> In Prague last year, not just now in Lisbon.
> >>
> >>>>> 0x10000 is indeed not a spec-compliant PCI segment. It's something
> >>>>> model specific the Linux VMD driver is doing.
> >>>>
> >>>> I wouldn't call this "model specific" - this numbering is purely a
> >>>> software one (and would need coordinating between Dom0 and Xen).
> >>>
> >>> Hm, TBH I'm not sure whether Xen needs to be aware of VMD devices.
> >>> The resources used by the VMD devices are all assigned to the VMD
> >>> root. My current hypothesis is that it might be possible to manage
> >>> such devices without Xen being aware of their existence.
> >>
> >> Well, it may be possible to have things work in Dom0 without Xen
> >> knowing much. Then Dom0 would need to suppress any physdevop calls
> >> with such software-only segment numbers (in order to at least not
> >> confuse Xen). I'd be curious though how e.g. MSI setup would work in
> >> such a scenario.
> >
> > IIRC from my read of the spec,
>
> So you have found a spec somewhere? I didn't so far, and I had even asked
> Intel ...
>
> > VMD devices don't use regular MSI
> > data/address fields, and instead configure an index into the MSI table
> > on the VMD root for the interrupt they want to use. It's only the VMD
> > root device (which is a normal device on the PCI bus) that has
> > MSI(-X) configured with real vectors, and multiplexes interrupts for
> > all devices behind it.
> >
> > If we had to passthrough VMD devices we might have to intercept writes
> > to the VMD MSI(-X) entries, but since they can only be safely assigned
> > to dom0 I think it's not an issue ATM (see below).
> >
> >> Plus clearly any passing through of a device behind
> >> the VMD bridge will quite likely need Xen involvement (unless of
> >> course the only way of doing such pass-through was to pass on the
> >> entire hierarchy).
> >
> > All VMD devices share the Requestor ID of the VMD root, so AFAIK it's
> > not possible to passthrough them (unless you passthrough the whole VMD
> > root) because they all share the same context entry on the IOMMU.
>
> While that was my vague understanding too, it seemed too limiting to me
> to be true.

In my case, it was a single NVMe disk behind this VMD thing, so passing
through the whole VMD device wouldn't be too bad. I have no idea (nor
really interest in...) how it behaves with more disks.

From the above discussion I understand the 0x10000 segment is really a
software construct, not anything that hardware expects, so IMO dom0
shouldn't tell Xen anything about it.

Since I have the hardware, I can do some more tests if somebody is
interested in results. But for now I have disabled VMD in firmware and
everything is fine.
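To make the truncation above concrete: pci_sbdf_t packs segment, bus,
device and function into 32 bits, so a software-assigned segment of
0x10000 wraps to 0. A minimal sketch of the effect (the layout below is
simplified; the real definition in xen/include/xen/pci.h is a nest of
unions and bitfields, but the seg field is a uint16_t either way, and
the e0:06.0 device is taken from the log above):

#include <stdint.h>
#include <stdio.h>

/* Simplified sketch of Xen's pci_sbdf_t; assumes a little-endian
 * host so that the struct overlays sbdf as (seg << 16) | bdf. */
typedef union {
    uint32_t sbdf;
    struct {
        uint16_t bdf;   /* bus (8) : device (5) . function (3) */
        uint16_t seg;   /* PCI segment group - only 16 bits */
    };
} pci_sbdf_t;

int main(void)
{
    uint32_t linux_seg = 0x10000;        /* what the VMD driver reports */
    pci_sbdf_t sbdf = { .bdf = 0xe030 }; /* e0:06.0 */

    sbdf.seg = linux_seg;                /* silently truncated to 0 */
    printf("%04x:%02x:%02x.%x\n", sbdf.seg, sbdf.bdf >> 8,
           (sbdf.bdf >> 3) & 0x1f, sbdf.bdf & 7);
    /* prints "0000:e0:06.0" - matching the truncated log lines above */
    return 0;
}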
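The MSI arrangement Roger describes can be sketched roughly as follows;
the names and table size are illustrative only, inferred from the
behaviour described in the thread (as Jan notes, there is no public
spec to check against). Only the VMD root owns real vectors; a device
behind it merely selects one of them by index:

#include <stdint.h>

struct msix_entry {
    uint64_t addr;   /* real MSI address, programmed by the OS */
    uint32_t data;   /* real MSI data */
};

/* Only the VMD root - an ordinary device on the PCI bus - has a real
 * MSI-X table; it multiplexes interrupts for everything behind it. */
struct vmd_root {
    struct msix_entry table[64];   /* size is device-specific */
};

/* A write to a child device's MSI(-X) entry conceptually stores an
 * index into the root's table, not an address/data pair: */
static const struct msix_entry *
vmd_vector_for_child(const struct vmd_root *root, unsigned int idx)
{
    return &root->table[idx % 64];
}

This is also why pass-through of a single child device would require
intercepting those index writes, as noted above.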
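The isolation limit likewise follows from how the IOMMU selects a
translation context: the lookup is keyed on nothing but the Requestor
ID of the DMA transaction. A rough VT-d-style sketch (structure names
are illustrative, not Xen's actual ones):

#include <stdint.h>

struct context_entry {
    uint64_t lo;   /* present bit, page-table root pointer, ... */
    uint64_t hi;   /* domain id, address width, ... */
};

struct root_entry {
    struct context_entry *ctx;   /* 256 entries, indexed by devfn */
};

static struct context_entry *
lookup_context(struct root_entry root_table[256], uint16_t rid)
{
    /* The hardware has only the Requestor ID to go on. */
    uint8_t bus   = rid >> 8;
    uint8_t devfn = rid & 0xff;

    return &root_table[bus].ctx[devfn];
}

Every device behind the VMD root emits the root's Requestor ID, so all
of them resolve to the same context entry and hence the same page
tables; splitting them across domains is not possible, which matches
Roger's conclusion.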
-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab