[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/2] IOMMU/x86: disallow device assignment to PoD guests


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 3 Feb 2022 11:23:51 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MzP9lXHShiSL2H6N8/+SSVhd7ck4bHF0hXzEA+fyXck=; b=AlTlKWcf9k5LpX+Ge56N6FPWDgk71+cwysLhOVbG8Hanl3jwzRCBLu1ZmQhD7svq+DWMvfmZpeChBfs/88ekMX65bzD70VPx6os2TsphU7zO1OTOUGn5PilaZEN96oBnZLrk04LcuBffPqkOlOUGhh33zI4KpRewGoTmZWC4/mCrxJijfXVkugis1njZvZ3f1TLgw741LWjSdsoDAYcptES3lejLfkpjAWHoU95JXZtNFkRv7TzYbZra7m1lrUj61nGavzuKVKXX3QjEOGIjjjMB5Misgs/AoCmEAKjqlR9gFHWV9QVszDSrwKyWl6CfqaPF8vOFuQIAkzCHRlcYtQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TYAQa/VHkR9SbhHtXOJqPP3Q+IbaVs/R+/Bp/N0XAa5RnNAIG69mBn3q7828zDlFbdyEOnMzlSWidL+jMTPV4y9i4mt5dHpPUc1ZPlQeYctp/9SFj1Oc0aMJ1Mg+b+JV7obPrYb7A02g1tWtoj1oryBz/j/o+XiegutZZf3GbY127ELPcQ2Gt6JMkSQrAFNufuuohMOhofYvn7FnjM1u4j08BmlQTgYmXXqOPMK5QBK8rVrk2e6X51BebKvD31QFMPFpfXWTk+/Jqd60v+cSs6OBkU0cUVkySYFCk/vehbfnyrD0LsHbuUbB9vjkEoTSTAnfdQ52WHHNmKjldlk13w==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>, Petre Pircalabu <ppircalabu@xxxxxxxxxxxxxxx>, Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 03 Feb 2022 10:24:15 +0000
  • Ironport-data: A9a23:/t4S2q2Q/VlUaZMhFvbD5W12kn2cJEfYwER7XKvMYLTBsI5bp2dTn TAfXTrVPqmJNGHzLd1wbo21oBsDvZbdyNFmGlFppC1hF35El5HIVI+TRqvS04J+DSFhoGZPt Zh2hgzodZhsJpPkS5PE3oHJ9RGQ74nRLlbHILOCanAZqTNMEn9700o5w7Vh2+aEvPDia++zk YKqyyHgEAfNNw5cagr4PIra9XuDFNyr0N8plgRWicJj5TcypFFMZH4rHomjLmOQf2VhNrXSq 9Avbl2O1jixEx8FUrtJm1tgG6EAaua60QOm0hK6V0U+6/TrS+NbPqsTbZIhhUlrZzqhpIB97 NFRrZ6JdF1zH63FwcoSYjZROnQrVUFG0OevzXmXtMWSywvNcmf2wuUoB0YzVWEa0r8pWycUr 6VecW1TKEDY7w616OvTpu1EnMMsIdOtJIoCknph0SvYHbAtRpWrr6Diu4UIgm9t250m8fD2P 5A7TANLdR35QTZQZlkFU5FixOyKiSyqG9FfgA3M/vdmi4TJ9yRt2b3kK/LJediHQ8pEk0Ler WXDl0zhGhAAP9WbwDCY2nitmuPCky79VI8IUra/85ZCiVyIz20XATUcVEe3rPe0jEKzQZRUL El80jInsKwa5EGtCN7nUHWQonSJoxodUNp4CPAh5UeGza+8yxaUAC0IQyBMbPQitdQqXno62 1mRhdTrCDdz9rqPRhq16bO8vT60fy8PIgc/iTQsFFVfpYO5+cdq00yJHo0L/LOJYsPdImH85 zSWtCoHuu9JrM4Cjou0zH/Oqmf5znTWdTId6gLSV2Ojywp2Yo+5eoClgWTmAeZ8wJWxFQfY4 iVd8ySKxKVXVMzWynTRKAkYNOzxv5643CvgbUmD9nXL3xCk4DadcI9Z+1mSz285Y59fKVcFj KI+0D69BaO/3lP3NcebgKrrUqzGKJQM8/y/DZg4ifIVOvBMmPevpn0GWKJp9zmFfLIQua8+I 4yHVs2nEGwXD69qpBLvGbtGgeR2mH1vnjyILXwe8/hB+eDFDJJyYexdWGZik8hjtP/UyOkr2 4o32zS2J+V3D7SlP3i/HX87JlEWN3krba0aWOQMHtNv1jFOQTl7Y9eImOtJU9U8w8x9y7mUl lngBB4w4Aev1BXvdFTRAlg+OeyHYHqKhS9hVcDaFQz2iyFLjEfGxPp3SqbbipF8pbE6kK4pE 6JYEyhCa9wWIgn6F/0mRcCVhKRpdQixhBLIOCygYTMleIVnSRCP8djhFjYDPgFXZsZuncdh8 bCmyC3BRp8PG1ZrAMrMMar9xFKtp3kN3ul1WhKQcNVUfUzt9qlsKjDw0aBrc51dd02by2vIz RuSDDcZufLJ/90//u7WiP3WtIyuCeZ/QBZXRjGJ8basOCDG1WO/2oscAv2QdDXQWTqsqqWvb OlY1d/mN/gDkAoYuoZwCe8zn6k/+8Hut/lRyQE9RCfHaFGiC7VBJHia3JYQ6v0Rl+EB4QbvA xCB4NhXP7mNKfjJKl9JKVp3dPmH2NEVhiLWsaY/LnLl6XIl57GAS0hTYUWB0XQPMLtvPYo56 u49o8pKuRengx8nP9va3CBZ82OAci4JX6k978xIBYbqjkwgy01YYIyaASjzucndZ9JJO0gsA zmVmKud2OgMmhucKyI+RSrXwO5QpZUSoxQbnlYNKmOAlsfBmvJqjgZa9i46T1gNwxhKuw6p1 rOH66GhyX2ywgpV
  • Ironport-hdrordr: A9a23:UgCF5KEiF8xQVFqQpLqFcpHXdLJyesId70hD6qkvc3Jom52j+P xGws526faVslYssHFJo6HnBEClewKgyXcT2/hsAV7CZnidhILMFuBfBOTZsljd8kHFh4pgPO JbAtdD4b7LfChHZKTBkXGF+r8bqbHtms3Y5pa9854ud3AQV0gJ1XYJNu/xKDwOeOApP+tfKH LKjfA32QZINE5nJPiTNz0gZazuttfLnJXpbVovAAMm0hCHiXeN5KThGxaV8x8CW3cXqI1SvV Ttokjc3OGOovu7whjT2yv66IlXosLozp9mCNaXgsYYBz3wgkKDZZhnWZeFoDcpydvfpWoCoZ 3pmVMNLs5z43TeciWcpgbs4RDp1HIU53rr2Taj8DLeiP28YAh/J9tKhIpffBecwVEnpstA3K VC2H/cn4ZLDDvb9R6NpuTgZlVPrA6ZsHAimekcgzh0So0FcoJcqoQZ4Qd8DIoAJiTn84oqed MeQv003MwmMm9yUkqp/FWGmLeXLzEO91a9Mwc/U/WuonhrdCsT9Tpd+CQd9k1wgq7VBaM0oN gsCZ4Y5o2mePVmGp6VNN1xMvdfNVa9NC4kEFjiaWgPR5t3cE4klfbMkcEIDaeRCdo18Kc=
  • Ironport-sdr: SySO0xn3Xnu8hcIaDr2Ic8XOnXMOVIbAWQ3qAzx1am13FuFKgPZfc+ixvrZ57C1oLQ8eCIp5+b Synfk3zGBagOVyZezOhfpvURHluLoVX9FkeQds23AlfT63H6nstmFioILQGg9sbmFl0MSoMHmz ef5/h+HXhTbdHehbfdFApXpnCJosbL8AxP82flvQOuo5F0mmT6f/wTEVJ9eRhL6olVs6El1JYa A4Fy53rmXUFzKxUBTWmVM81w7RMDfRuPjeLyPYxGC7GqjwJlckGGsDBV+5e6gNRYLjwY1yiB4H ty/jYFBvQ7eTeYWmvV6UuO5G
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Feb 03, 2022 at 11:20:15AM +0100, Jan Beulich wrote:
> On 03.02.2022 10:52, Roger Pau Monné wrote:
> > On Thu, Feb 03, 2022 at 10:21:54AM +0100, Jan Beulich wrote:
> >> On 03.02.2022 10:04, Roger Pau Monné wrote:
> >>> On Thu, Feb 03, 2022 at 09:31:03AM +0100, Jan Beulich wrote:
> >>>> On 02.02.2022 17:13, Roger Pau Monné wrote:
> >>>>> On Tue, Jan 04, 2022 at 10:41:32AM +0100, Jan Beulich wrote:
> >>>>>> @@ -359,7 +360,10 @@ p2m_pod_set_mem_target(struct domain *d,
> >>>>>>  
> >>>>>>      ASSERT( pod_target >= p2m->pod.count );
> >>>>>>  
> >>>>>> -    ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
> >>>>>> +    if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
> >>>>>
> >>>>> Is it possible to have cache flush allowed without any PCI device
> >>>>> assigned? AFAICT the iomem/ioport_caps would only get setup when there
> >>>>> are device passed through?
> >>>>
> >>>> One can assign MMIO or ports to a guest the raw way. That's not
> >>>> secure, but functionally explicitly permitted.
> >>>>
> >>>>> TBH I would be fine if we just say that PoD cannot be used in
> >>>>> conjunction with an IOMMU, and just check for is_iommu_enable(d) here.
> >>>>>
> >>>>> I understand it's technically possible for PoD to be used together
> >>>>> with a domain that will later get a device passed through once PoD is
> >>>>> no longer in use, but I doubt there's much value in supporting that
> >>>>> use case, and I fear we might be introducing corner cases that could
> >>>>> create issues in the future. Overall I think it would be safer to just
> >>>>> disable PoD in conjunction with an IOMMU.
> >>>>
> >>>> I consider it wrong to put in place such a restriction, but I could
> >>>> perhaps accept you and Andrew thinking this way if this was the only
> >>>> aspect playing into here. However, this would then want an equivalent
> >>>> tools side check, and while hunting down where to make the change as
> >>>> done here, I wasn't able to figure out where that alternative
> >>>> adjustment would need doing. Hence I would possibly(!) buy into this
> >>>> only if someone else took care of doing so properly in the tool stack
> >>>> (including the emission of a sensible error message).
> >>>
> >>> What about the (completely untested) chunk below:
> >>>
> >>> diff --git a/tools/libs/light/libxl_create.c 
> >>> b/tools/libs/light/libxl_create.c
> >>> index d7a40d7550..e585ef4c5c 100644
> >>> --- a/tools/libs/light/libxl_create.c
> >>> +++ b/tools/libs/light/libxl_create.c
> >>> @@ -1160,17 +1160,16 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
> >>>      pod_enabled = (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV) &&
> >>>          (d_config->b_info.target_memkb < d_config->b_info.max_memkb);
> >>>  
> >>> -    /* We cannot have PoD and PCI device assignment at the same time
> >>> +    /* We cannot have PoD and an active IOMMU at the same time
> >>>       * for HVM guest. It was reported that IOMMU cannot work with PoD
> >>>       * enabled because it needs to populated entire page table for
> >>> -     * guest. To stay on the safe side, we disable PCI device
> >>> -     * assignment when PoD is enabled.
> >>> +     * guest.
> >>>       */
> >>>      if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
> >>> -        d_config->num_pcidevs && pod_enabled) {
> >>> +        d_config->c_info.passthrough != LIBXL_PASSTHROUGH_DISABLED &&
> >>> +        pod_enabled) {
> >>>          ret = ERROR_INVAL;
> >>> -        LOGD(ERROR, domid,
> >>> -             "PCI device assignment for HVM guest failed due to PoD 
> >>> enabled");
> >>> +        LOGD(ERROR, domid, "IOMMU cannot be enabled together with PoD");
> >>>          goto error_out;
> >>>      }
> >>
> >> Perhaps. Seeing this I actually recall coming across this check during
> >> my investigation. Not changing it along the lines of what you do was
> >> then really more because of me not being convinced of the extra
> >> restriction; I clearly misremembered when writing the earlier reply.
> >> If we were to do what you suggest, I'd like to ask that the comment be
> >> changed differently, though: "We cannot ..." then isn't really true
> >> anymore. We choose not to permit this mode; "cannot" only applies to
> >> actual device assignment (and of course only as long as there aren't
> >> restartable IOMMU faults).
> > 
> > I'm fine with an adjusted wording here. This was mostly a placement
> > suggestion, but I didn't gave much thought to the error message.
> 
> FTAOD: Are you going to transform this into a proper patch then? While
> I wouldn't object to such a behavioral change, I also wouldn't want to
> put my name under it. But if it went in, I think I might be able to
> then drop the libxl adjustment from my patch.

Oh, I somewhat assumed you would integrate this check into the patch.
I can send a standalone patch myself if that's your preference. Let me
do that now.

Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.