
Re: Proposal for virtual IOMMU binding b/w vIOMMU and passthrough devices


  • To: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Julien Grall <julien@xxxxxxx>
  • From: Michal Orzel <michal.orzel@xxxxxxx>
  • Date: Fri, 28 Oct 2022 18:54:13 +0200
  • Cc: Rahul Singh <Rahul.Singh@xxxxxxx>, Xen developer discussion <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Michal Orzel <Michal.Orzel@xxxxxxx>, "Oleksandr Tyshchenko" <Oleksandr_Tyshchenko@xxxxxxxx>, Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>
  • Delivery-date: Fri, 28 Oct 2022 16:54:39 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28/10/2022 17:45, Bertrand Marquis wrote:
> 
> 
> Hi Julien,
> 
>> On 28 Oct 2022, at 16:01, Julien Grall <julien@xxxxxxx> wrote:
>>
>>
>>
>> On 28/10/2022 15:37, Bertrand Marquis wrote:
>>> Hi Julien,
>>
>> Hi Bertrand,
>>
>>>> On 28 Oct 2022, at 14:27, Julien Grall <julien@xxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>> On 28/10/2022 14:13, Bertrand Marquis wrote:
>>>>> Hi Julien,
>>>>
>>>> Hi Bertrand,
>>>>
>>>>>> On 28 Oct 2022, at 14:06, Julien Grall <julien@xxxxxxx> wrote:
>>>>>>
>>>>>> Hi Rahul,
>>>>>>
>>>>>> On 28/10/2022 13:54, Rahul Singh wrote:
>>>>>>>>>>>> For ACPI, I would have expected the information to be found in the 
>>>>>>>>>>>> IOREQ.
>>>>>>>>>>>>
>>>>>>>>>>>> So can you add more context why this is necessary for everyone?
>>>>>>>>>>> We have information for the IOMMU and Master-ID, but we don't have 
>>>>>>>>>>> the information linking vMaster-ID to pMaster-ID.
>>>>>>>>>>
>>>>>>>>>> I am confused. Below, you are making the virtual master ID optional. 
>>>>>>>>>> So shouldn't this be mandatory if you really need the mapping with 
>>>>>>>>>> the virtual ID?
>>>>>>>>> vMasterID is optional if the user knows the pMasterID is unique on 
>>>>>>>>> the system. But if the pMasterID is not unique, then the user needs 
>>>>>>>>> to provide a vMasterID.
>>>>>>>>
>>>>>>>> So the expectation is that the user will be able to know that the 
>>>>>>>> pMasterID is unique. This may be easy with a couple of SMMUs, but not 
>>>>>>>> with 50+ (as suggested above). This will become a pain on larger 
>>>>>>>> systems.
>>>>>>>>
>>>>>>>> IMHO, it would be much better if we could detect that in libxl (see 
>>>>>>>> below).
>>>>>>> We can make the vMasterID compulsory to avoid the complexity in libxl 
>>>>>>> of solving this.
>>>>>>
>>>>>> In general, complexity in libxl is not too much of a problem.
>>>>> I am a bit unsure about this strategy.
>>>>> Currently xl has one configuration file where you put all Xen parameters. 
>>>>> The device tree is only needed by some guests to have a description of 
>>>>> the system they run on.
>>>>> If we change the model and say that Xen configuration parameters live 
>>>>> both in the configuration file and the device tree, we somehow enforce 
>>>>> having a device tree even though some guests do not need it at all (for 
>>>>> example Zephyr).
>>>>
>>>> I think my approach was misunderstood because there is no change in the 
>>>> existing model.
>>>>
>>>> What I am suggesting is to not introduce iommu_devid_map but instead let 
>>>> libxl allocate the virtual Master-ID and create the mapping with the 
>>>> physical Master-ID.
>>>>
>>>> Libxl would then update the property "iommus" in the device-tree with the 
>>>> allocated virtual Master-ID.
>>> Ok I understand now.
>>>>
>>>> Each node in the partial device-tree would need to have a property
>>>> to refer to the physical device just so we know how to update the 
>>>> "iommus". The list of device passthrough will still be specified in the 
>>>> configuration file. IOW, the partial device-tree is not directly involved 
>>>> in the configuration of the guest.
>>> But we will generate it. How would something like a Zephyr guest work? 
>>> Zephyr is not using the device tree we pass; it has an embedded one.
>>
>> In general, guests that don't use the device-tree/ACPI tables to detect the 
>> layout are already in a bad situation, because we don't guarantee that the 
>> layout (memory, interrupts...) will be stable across Xen versions. Although 
>> there is an implicit agreement that the layout will not change within a 
>> minor release (e.g. 4.14.x).
> 
> Well, right now we have no ACPI support.
> But I still think that a non-DTB guest is definitely a use case we need to 
> keep in mind for embedded and safety, as most proprietary RTOSes do not use 
> a device tree.
> 
>>
>> But see below for some suggestions how this could be handled.
>>
>>>>
>>>> So far, I don't see a particular issue with this approach because the 
>>>> vMaster ID allocation algorithm should be generic. But please let me know 
>>>> if you think there are bits I am missing.
>>> I am a bit afraid of things that are “automatic”.
>>> For everything else we leave the user in control (IPA for mappings, virtual 
>>> interrupt numbers), and in this case we would switch to a model where we 
>>> automatically generate a vMaster ID.
>>
>> We only let the user control where the device is mapped. But this is quite 
>> fragile... I think this should be generated at runtime.
>>
>>> With this model, guests not using the device tree will have to guess the 
>>> vMaster ID, or somehow know how the tools generate it, in order to use the 
>>> right one.
>>
>> To be honest, this is already the case today because the layout exposed to 
>> the guest is technically not fixed. Yes, so far, we haven't changed it too 
>> much. But sooner or later, this is going to bite because we made it clear 
>> that the layout is not stable.
>>
>> Now, if those projects are willing to rebuild for each version, then we 
>> could use the following approach:
>>  1) Write the xl.cfg
>>  2) Ask libxl to generate the device-tree
>>  3) Build Zephyr
>>  4) Create the domain
>>
>> The expectation is that, for a given Xen version (and compatible ones), 
>> libxl will always generate the same device tree.
> 
> This is a good idea yes :-)

Zephyr still uses a device tree, but in a static way - everything must be 
defined in a .dts before building it.
The steps mentioned by Julien are already followed when building Zephyr to run 
as a Xen VM.
You can take a look at the "Updating configuration" section at the bottom of 
the following site:
https://docs.zephyrproject.org/latest/boards/arm64/xenvm/doc/index.html

So, as we tend to use Zephyr as a de facto RTOS for Xen, it is already aware of 
possible changes to the layout.

> 
> Cheers
> Bertrand
> 
>>
>> Cheers,
>>
>> --
>> Julien Grall
> 

~Michal



 

