Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen
Hi,

On 25/10/17 09:22, Manish Jaggi wrote:
>
> On 10/23/2017 7:27 PM, Andre Przywara wrote:
>> Hi Manish,
>>
>> On 12/10/17 22:03, Manish Jaggi wrote:
>>> ACPI/IORT Support in Xen.
>>> --------------------------------------
>>>
>>> I had sent out a patch series [0] to hide the SMMU from the Dom0
>>> IORT. Extending the scope to include all that is required to support
>>> ACPI/IORT in Xen, I am presenting for review the first _draft_ of the
>>> design of ACPI/IORT support in Xen. It is not complete yet.
>>>
>>> Discussed below are the parsing and generation of the IORT table for
>>> Dom0 and DomUs. It is proposed that the IORT be parsed once and the
>>> information saved into a Xen data structure, say host_iort_struct,
>>> which is then reused by all Xen subsystems like the ITS / SMMU etc.
>>>
>>> Since this is a first draft it is open to technical comments,
>>> modifications and suggestions. Please be open and feel free to add
>>> any missing points / additions.
>>>
>>> 1. What is IORT? What are its components?
>>> 2. Current Support in Xen
>>> 3. IORT for Dom0
>>> 4. IORT for DomU
>>> 5. Parsing of IORT in Xen
>>> 6. Generation of IORT
>>> 7. Future Work and TODOs
>>>
>>> 1. What is IORT? What are its components?
>>> --------------------------------------------
>>> IORT refers to the I/O Remapping Table. It is essentially used to
>>> find information about the I/O topology (PCIRC-SMMU-ITS) and the
>>> relationships between devices.
>>>
>>> The general structure of the IORT is a set of nodes which carry
>>> information about PCI root complexes (PCIRC), SMMUs, ITSes and
>>> platform devices. Using the IORT table the RID -> StreamID ->
>>> DeviceID relationship can be obtained; more specifically, which
>>> device is behind which SMMU and which interrupt controller. This
>>> topology is described in the IORT table.
>>>
>>> The RID is a requester ID in the PCI context,
>>> the StreamID is the ID of the device in the SMMU context,
>>> the DeviceID is the ID programmed into the ITS.
>>>
>>> For a non-PCI device the RID could simply be an ID.
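To illustrate the chained RID -> StreamID -> DeviceID translation described above, here is a minimal sketch of one step of such an ID mapping. All types and names here are illustrative only, not from the Xen tree or the IORT spec's binary layout:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* One entry of an IORT-style ID mapping array: input IDs in
 * [input_base, input_base + count) map linearly onto
 * [output_base, output_base + count) at the referenced node. */
struct idmap_entry {
    uint32_t input_base;
    uint32_t output_base;
    uint32_t count;
};

/* Translate one step of the chain (e.g. RID -> StreamID at a PCIRC
 * node, or StreamID -> DeviceID at an SMMU node). Returns false if
 * no entry covers the input ID. */
static bool idmap_translate(const struct idmap_entry *map, size_t n,
                            uint32_t in, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (in >= map[i].input_base &&
            in - map[i].input_base < map[i].count) {
            *out = map[i].output_base + (in - map[i].input_base);
            return true;
        }
    }
    return false;
}
```

A full lookup would apply this twice: once over the PCIRC node's map array to get a StreamID, then over the referenced SMMU node's map array to get the ITS DeviceID.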
>>>
>>> Each IORT node contains an ID map array to translate from one ID
>>> into another:
>>> IDmap Entry {input_range, output_range, output_node_ref, id_count}
>>> This array is present in the PCIRC node, the SMMU node, the Named
>>> Component node etc. and can reference an SMMU or an ITS node.
>>>
>>> 2. Current Support of IORT
>>> ---------------------------
>>> Currently Xen passes the host IORT table to Dom0 without any
>>> modifications. For DomU no IORT table is passed.
>>>
>>> 3. IORT for Dom0
>>> -----------------
>>> The IORT for Dom0 is prepared by Xen and is fairly similar to the
>>> host IORT. However, a few nodes may be removed or modified. For
>>> instance:
>>> - host SMMU nodes should not be present
>>> - ITS group nodes are the same as in the host IORT, but no stage 2
>>>   mapping is done for them.
>> What do you mean with stage2 mapping?
> Please ignore this line. Copy-paste error. Read it as follows:
>
> - ITS group nodes are the same as in the host IORT.
> (though I would modify this in the next draft)
>
>>
>>> - platform nodes (named components) may be selectively present
>>>   depending on whether Xen is using some of them. This could be
>>>   controlled by the Xen command line.
>> Mmh, I am not so sure platform devices described in the IORT (those
>> which use MSIs!) are so much different from PCI devices here. My
>> understanding is those platform devices are network adapters, for
>> instance, for which Xen has no use.
> ok.
>> So I would translate "Named Components" or "platform devices" as
>> devices just not using the PCIe bus (so no config space and no
>> (S)BDF), but being otherwise the same from an ITS or SMMU point of
>> view.
> Correct.
>>> - More items: TODO
>> I think we agreed upon rewriting the IORT table instead of patching
>> it?
> yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was
> _rewriting_ most of the Dom0 IORT and not patching it.

I was just after the wording above: "IORT for Dom0 is prepared by xen
and it is fairly similar to the host iort.
However few nodes could be removed or modified." ... which sounds a bit
like you alter the h/w IORT. It would be good to clarify this by
explicitly mentioning the parsing/generation cycle, as this is a
fundamental design decision.

> We can have an IRC discussion on this.
>
> I think apart from rewriting, the other required tasks handled in this
> epic task are:
> - parse the IORT and save it in Xen-internal data structures
> - common code to generate the IORT for Dom0/DomU
> - all Xen code that currently parses the IORT multiple times now uses
>   the Xen-internal data structures

Yes, that sounds about right.

> (I have explained this in this mail below)
>> So to some degree your statements are true, but when we rewrite the
>> IORT table without SMMUs (and possibly without other components like
>> the PMUs), it would be kind of a stretch to call it "fairly similar
>> to the host IORT". I think "based on the host IORT" would be more
>> precise.
> Yes. "Based on the host IORT" is better, thanks.
>>
>>> 4. IORT for DomU
>>> -----------------
>>> The IORT for DomU is generated by the toolstack. The IORT topology
>>> is different when DomU supports device passthrough.
>> Can you elaborate on that? Different compared to what? My
>> understanding is that without device passthrough there would be no
>> IORT in the first place?
> I was exploring the possibility of having virtual devices for DomU.
> So if a virtual device is assigned to a guest there needs to be some
> mapping in the IORT as well.
> This virtual device can be on a PCI bus or a platform device.
>
> Device passthrough can be split into two parts:
> a. platform device passthrough (not on a PCI bus)
> b. PCI device passthrough

I understand that, but am still wondering how it would be "different".
We just start with creating our mapping data structure *from scratch*,
the same one we generate by *parsing* the host IORT. Whether this
points to a purely virtual device, a PCI PT or a platform PT device
should not matter for this purpose.
> => If we discount the possibility of a virtual device for DomU and
> platform device passthrough, then you are correct: no IORT is
> required.

I believe we need an IORT once we have devices which use MSIs.

> When PCI device passthrough is supported, the PCIRC is itself virtual
> (emulated by Xen).
> One can have any number of virtual PCIRCs and maybe virtual SMMUs.
> Hence the topology can vary.

I think I don't disagree, my initial comment was just about the
confusion that this "IORT topology is *different* from" term created.

> Now read the below lines.
>>> At a minimum the DomU IORT should include a single PCIRC and ITS
>>> group. A matching PCIRC can be added in the DSDT.
>>> An additional node can be added if a platform device is assigned to
>>> the DomU. No extra node should be required for PCI device
>>> pass-through.
>> Again I don't fully understand this last sentence.
> The last line is a continuation of the first line "At a minimum..."

OK, but still I don't get how we would end up with an IORT without
(passed-through) PCI devices in the first place?

>>> It is proposed that the ID ranges of the PCIRC and ITS group be
>>> constant for DomUs.
>> "constant" is a bit confusing here. Maybe "arbitrary", "from scratch"
>> or "independent from the actual h/w"?
> ok. that is implementation defined.
>>
>>> In case of PCI PT, using a domctl the toolstack can communicate
>>> physical RID : virtual RID, deviceID : virtual deviceID to Xen.
>>>
>>> It is assumed that DomU PCI config accesses would be trapped in Xen.
>>> The RID at which the assigned device is enumerated would be the one
>>> provided by the domctl, domctl_set_deviceid_mapping.
>>>
>>> TODO: device assign domctl i/f.
>>> Note: This should suffice for the virtual deviceID support pointed
>>> out by Andre. [4]
>> Well, there's more to it. First thing: while I tried to allow virtual
>> ITS deviceIDs to be different from physical ones, at the moment they
>> are fixed to being mapped 1:1 in the code.
> oh
>> So the first step would be to go over the ITS code and identify where
>> "devid" refers to a virtual deviceID and where to a physical one
>> (probably renaming them accordingly). Then we would need a function
>> to translate between the two. At the moment this would be a dummy
>> function (just returning the input value). Later we would loop in the
>> actual table.
> Some thoughts here..
> Wouldn't it be better to call a helper function to translate the devid
> coming from the guest? The helper function would look at the table
> created by handling successive domctls (the one mentioned here).

Exactly.

>>> We might not need this domctl if the assign_device hypercall is
>>> extended to provide this information.
>> Do we actually need a new interface or even extend the existing one?
>> If I got Julien correctly, the existing interface is just fine?
> Could you explain which existing interface can be used to translate a
> guest deviceID to a host deviceID when an ITS command gets trapped in
> Xen? Maybe I am missing something here.

I haven't looked in detail, but will do.

>>> 5. Parsing of IORT in Xen
>>> --------------------------
>>> IORT nodes can be saved in structures so that IORT table parsing is
>>> done once and the result is reused by all Xen subsystems like the
>>> ITS / SMMU etc. and domain creation.
>>> Proposed are the following structures to hold IORT information, very
>>> similar to the ACPI structures:
>>>
>>> struct iort_id_map {
>>>     range_t input_range;
>>>     range_t output_range;
>>>     void *output_reference;
>>>     ...
>>> };
>> I guess you would need a "struct list_head list" here to chain the
>> ranges?
> yes :). That was in ...
>>
>>> => output_reference points to an object of type iort_node.
>>>
>>> struct iort_node {
>>>     struct list_head id_map;
>>>     void *context;
>>>     struct list_head list;
>>> };
>>> => context could be a reference to an acpi_iort_node.
>>>
>>> struct iort_table_struct {
>>>     struct list_head pci_rc_nodes;
>>>     struct list_head smmu_nodes;
>>>     struct list_head plat_devices;
>>>     struct list_head its_group;
>>> };
>> Quickly brainstorming with Julien, I was wondering if we could
>> actually simplify this significantly:
>> From Xen's point of view all we need to know is the mapping between
>> PCI requester IDs (or some platform device IDs) and the physical ITS
>> deviceID, and between requester IDs and the SMMU stream ID.
>> That would be just *two* lookup tables, not connected to each other
>> aside from possibly having the same input ranges. At this point we
>> could also have *one* table, containing both the ITS deviceID and the
>> SMMU stream ID:
>>
>> struct iort_id_map {
>>     range_t input_range;
>>     uint32_t its_devid_base;
>>     uint32_t smmu_streamid_base;
>>     struct list_head list;
>> };
> This is just a simpler case of what the spec supports.

Sure, and without the need for a *chained* mapping (RC->SMMU->ITS) this
is actually all we need (RC->ITS; RC->SMMU). I have the feeling that
this simplifies the code a lot.

> Each PCIRC node can have an array of idmaps, and each idmap entry can
> have multiple idmap entries in the output-referenced SMMU idmap.
>
> I had a similar discussion on the v1 version of my IORT SMMU hide
> patch with Julien.
>
> Moreover I don't quite understand where iort_id_map would fit.
> As you can see in the reply below, we have other things to take care
> of as well.
>>
>> So parsing the IORT would create and fill a list of those structures.
>> For a lookup we would just iterate over that list, find a matching
>> entry and:
>>     return (input_id - match->input_range.base) + match->its_devid_base;
>>
>> Ideally we abstract this via some functions, so that we can later
>> swap this for more efficient data structures should the need arise.
>>
>>> This structure is created at the point the IORT table is parsed, say
>>> from acpi_iort_init.
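The flat lookup sketched above (one combined iort_id_map list, iterated linearly) could look like the following minimal sketch. The range_t definition and all names are assumptions for illustration, and an array stands in for the kernel-style list:

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct { uint32_t base; uint32_t size; } range_t;

/* One combined mapping entry: requester IDs in input_range map
 * linearly onto both an ITS deviceID range and an SMMU streamID
 * range. In Xen this would be chained on a list; an array keeps
 * the sketch short. */
struct iort_id_map {
    range_t input_range;
    uint32_t its_devid_base;
    uint32_t smmu_streamid_base;
};

/* Look up the ITS deviceID for a requester ID, as in the snippet
 * above: (input_id - base) + its_devid_base. Returns false if no
 * entry matches. */
static bool iort_rid_to_devid(const struct iort_id_map *maps, size_t n,
                              uint32_t input_id, uint32_t *devid)
{
    for (size_t i = 0; i < n; i++) {
        const struct iort_id_map *m = &maps[i];
        if (input_id >= m->input_range.base &&
            input_id - m->input_range.base < m->input_range.size) {
            *devid = (input_id - m->input_range.base) + m->its_devid_base;
            return true;
        }
    }
    return false;
}
```

A twin helper returning smmu_streamid_base + offset would cover the RC->SMMU direction, so both lookups share one table as suggested.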
>>> It is proposed to use this structure's information in
>>> iort_init_platform_devices.
>>> [2] [RFC v2 4/7] ACPI: arm: Support for IORT
> I guess you missed this part. Without this the context is missed.
>
> The main purpose of this whole task is split into two parts:
> a. IORT parsing should be done once in Xen, and later, whenever IORT
>    parsing is required, the Xen-internal data structures
>    (iort_table_struct) are used instead.
>
>    - this would be helpful for (b) below
>    - and for SMMU / platform device initialization.
>      If you look at [2], it again parses the IORT.
>      So the approach here is that [2] and (b) should use the same
>      Xen-internal IORT data structures.

I think I get this, I was just wondering why we would need a more or
less exact replication of the IORT, with the pointering complicating
things. I believe what we need is:
1) a mapping from a PCI-RC or PT-NC to stream IDs, for programming the
   SMMU in Xen
2) a mapping from a PCI-RC or PT-NC to ITS devIDs, for programming the
   ITS when being asked for by a guest (incl. Dom0)
I think that the IORT is a streamlined and optimized representation of
those mappings, which we don't necessarily need to replicate 1:1 in an
in-memory data structure. But admittedly I haven't looked in too much
detail into this, so if you convince me that we need this graph
structure, then so be it.

> => For that reason [2]/[5] might need to be rebased on this task's
> patches. <=
> [5] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123080.html
>
> b. Generate the IORT for Doms without patching the host IORT, rather
>    regenerate it from the Xen-internal data structures.
>
> Based on this rationale, I think the data structures mentioned would
> be required.
>
>>>
>>> 6. IORT Generation
>>> -------------------
>>> There would be common code to generate the IORT table from
>>> iort_table_struct.
>> That sounds useful, but we would need to be careful with sharing code
>> between Xen and the tool stack. Has this actually been done before?
> I added the code sharing part here, but I am not hopeful that this
> would work, as it would require a lot of code change in the toolstack.
> A simple difference is that the ACPI header structures have different
> member variables. The same goes for other structures.
> So we might have to create a lot of defines in common code for
> sharing, with the possibility of errors.
>
> See: struct acpi_header in acpi2_0.h (tools/libacpi)
> and struct acpi_table_header in actbl.h (xen/include/acpi)
>
> That is why I preferred a domctl, so Xen could prepare the IORT for
> DomU.

I don't think it's justified to move a simple table generation task
into Xen, just to allow code sharing. After all this does not require
any Xen-internal knowledge. So it should definitely be done in the
toolstack. I think we should follow Julien's suggestion of looking at
xen/common/libelf.

Cheers,
Andre.

> If not code sharing then code duplication might also work (in that
> case no domctl is required).
> We can discuss this more...
>>
>>> a. For Dom0
>>>    the structure (iort_table_struct) would be modified to remove
>>>    SMMU nodes and update the ID mappings:
>>>    PCIRC idmap -> output reference to ITS group
>>>    (RID -> DeviceID).
>>>
>>> TODO: Describe the algorithm in the update_id_mapping function used
>>> to map RID -> DeviceID in my earlier patch [3]
>> If the above approach works, this would become a simple list
>> iteration, creating PCIRC nodes with the appropriate pointer to the
>> ITS nodes.
> Yes, it works. see [3]
>>
>>> b. For DomU
>>>    - iort_table_struct would have a minimum of 2 nodes (1 PCIRC and
>>>      1 ITS group)
>>>    - populate a basic IORT in a buffer passed by the toolstack
>>>      (using a domctl: domctl_prepare_dom_iort)
>> I think we should reduce this to iterating the same data structure as
>> for Dom0. Each passed-through PCI device would possibly create one
>> struct instance, and later on we do the same iteration as we do for
>> Dom0.
>> If that proves to be simple enough, we might even live with the code
>> duplication between Xen and the toolstack.
> Yes, that's the idea. For DomU and Dom0 the IORT generation code would
> take iort_table_struct as input.
>
>>
>> Cheers,
>> Andre.
>>
>>> - The DSDT for the DomU is updated by the toolstack to include a
>>>   PCIRC.
>>> - If a named component is added to a DomU, that information is
>>>   passed in the same/an additional domctl.
>>> - <TODO: domctl_prepare_dom_iort i/f >
>>> Note: Julien, I have tried to incorporate your suggestion for code
>>> reuse.
>>>
>>> 7. References:
>>> -------------
>>> [0] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121667.html
>>> [1] ARM DEN0049C:
>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0049c/DEN0049C_IO_Remapping_Table.pdf
>>> [2] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123082.html
>>> [3] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg121669.html:
>>> update_id_mapping function.
>>> [4] https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg123434.html
>>>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel