
Re: [PATCH 2/2] tools/light: Revoke permissions when a PCI detach for HVM domain



Hi Anthony,

On 18/08/2023 18:04, Anthony PERARD wrote:
On Wed, Aug 09, 2023 at 11:33:05AM +0100, Julien Grall wrote:
From: Julien Grall <jgrall@xxxxxxxxxx>

Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
a PCI device is attached (see pci_add_dm_done()) for all domain types. However,
the permissions are only revoked for non-HVM domains (see do_pci_remove()).

This means that HVM domains will be left with extra permissions. While
this looks bad on paper, the IRQ permissions should be revoked
when the Device Model calls xc_physdev_unmap_pirq(), and such domains
cannot directly map I/O port and IOMEM regions. Instead, this has to
be done by a Device Model.

The Device Model can only run in dom0 or PV stubdomain (upstream libxl
doesn't have support for HVM/PVH stubdomain).

For PV/PVH stubdomain, the permissions are properly revoked, so there is
no security concern.

This leaves dom0. There are two cases:
   1) Privileged: Anyone gaining access to the Device Model would already
      have extensive control over the host.
   2) Deprivileged: PCI passthrough requires PHYSDEV operations which
      are not accessible when the Device Model is restricted.

So overall, it is believed that the extra permissions cannot be exploited.

Rework the code so the permissions are all removed for HVM domains.
This needs to happen after QEMU has detached the device. So
the revocation is now moved into a separate function which is called
from pci_remove_detached().

Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>

---

TODO: I am getting a bit confused with the async work in libxl. I am
not entirely sure whether pci_remove_detached() is the correct place
to revoke.

Whenever an async task in libxl takes more than one function to
complete, the next function (or callback) that is going to be executed
is further down in the current source file (usually). This is to try to
avoid too much confusion when reading through a set of async calls. So
pci_remove_detached() is after all the DM stuff is done, and it's
before we deal with the stubdom case, which will go through these steps again,
so it seems appropriate.
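
As a rough illustration of the ordering convention described above, here is a minimal, self-contained sketch. It is not libxl code: remove_state, remove_start(), remove_dm_done(), remove_detached() and remove_done() are hypothetical names, and the callbacks are invoked directly instead of through an event loop. The point is only that each step appears in the file after the step that schedules it, and that the revocation sits in the step that runs once the device model work has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state carried across the steps of one async operation. */
typedef struct remove_state remove_state;
struct remove_state {
    int rc;          /* final result of the whole operation */
    int steps_run;   /* number of steps executed so far */
    int revoked;     /* set once permissions have been revoked */
    void (*callback)(remove_state *rs, int rc); /* next step to run */
};

/* Forward declarations, listed in the order the steps execute. */
static void remove_dm_done(remove_state *rs, int rc);
static void remove_detached(remove_state *rs, int rc);
static void remove_done(remove_state *rs, int rc);

/* Step 1: entry point. It ends by handing over to the next step. */
void remove_start(remove_state *rs)
{
    assert(rs != NULL);
    rs->steps_run++;
    rs->callback = remove_dm_done;
    /* Real async code would return and be called back later;
     * this sketch calls the next step directly. */
    rs->callback(rs, 0);
}

/* Step 2: the device model has answered. On error, skip to the end. */
static void remove_dm_done(remove_state *rs, int rc)
{
    rs->steps_run++;
    rs->callback = rc ? remove_done : remove_detached;
    rs->callback(rs, rc);
}

/* Step 3: the device is detached, so revoking is safe from here on. */
static void remove_detached(remove_state *rs, int rc)
{
    rs->steps_run++;
    rs->revoked = 1;
    rs->callback = remove_done;
    rs->callback(rs, rc);
}

/* Step 4: record the result; nothing further is scheduled. */
static void remove_done(remove_state *rs, int rc)
{
    rs->steps_run++;
    rs->rc = rc;
    rs->callback = NULL;
}
```

Reading the file top to bottom then follows the order in which the steps run, which is the property the convention is meant to preserve.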

Ah, I didn't realize there was logic to the ordering. This will help me understand the code in the future.


So, this new pci_revoke_permissions() function being placed before
do_pci_remove() will make it harder to follow what do_pci_remove() does.
Does it need to be a separate function? Can't you inline it in
pci_remove_detached() ?

I decided to go with an inline function to avoid increasing the size of pci_remove_detached() and also to separate the logic from cleaning up QMP and resetting the PCI device.


If it does need to be a separate function, a better way to lay it out
would be to replace calls to pci_remove_detached() with
pci_revoke_permissions() as appropriate, and rename it with the prefix
"pci_remove_", that is pci_remove_revoke_permissions().

I don't understand this suggestion. pci_revoke_permissions() is called right in the middle of pci_remove_detached(). So it is not clear how it can be called ahead.

Also, if I replace pci_remove_detached() with pci_revoke_permissions(), does this mean you are expecting the latter to call the former?


TODO: For HVM, we are now getting the following error on detach:
libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument

This is because the IRQ was unmapped by QEMU. It doesn't feel
right to skip the call. So maybe we can ignore the error?

The error is already ignored. But I guess you just want to skip writing
an error message. But I think we should still write something, at least
a DEBUG message. Also add a comment that QEMU also unmaps it, so errors
are expected.
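
One possible shape for this suggestion is sketched below. It is an illustration under assumptions, not the libxl implementation: unmap_fn stands in for the real unmap call, unmap_pirq_best_effort() and the fake_* helpers are hypothetical, logging is reduced to fprintf(), and treating only EINVAL ("Invalid argument", as in the message quoted above) as the expected case is itself an assumption.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the unmap call: returns 0 on success,
 * or -1 with errno set on failure. */
typedef int (*unmap_fn)(int domid, int irq);

enum log_level { LOG_NONE, LOG_DEBUG, LOG_ERROR };

/* Try to unmap the IRQ and report at which level the outcome was logged.
 * The failure is never propagated: the caller carries on either way. */
enum log_level unmap_pirq_best_effort(unmap_fn unmap, int domid, int irq)
{
    int err;

    if (unmap(domid, irq) == 0)
        return LOG_NONE;

    err = errno; /* save it before any call that might change errno */

    if (err == EINVAL) {
        /* Assumed to be expected: the device model may have unmapped
         * the IRQ already, so only a debug message is written. */
        fprintf(stderr, "debug: domain %d: irq=%d was already unmapped\n",
                domid, irq);
        return LOG_DEBUG;
    }

    fprintf(stderr, "error: domain %d: unmap irq=%d failed: %s\n",
            domid, irq, strerror(err));
    return LOG_ERROR;
}

/* Fake unmap implementations, used only to exercise the sketch. */
int fake_unmap_ok(int domid, int irq)
{
    (void)domid; (void)irq;
    return 0;
}

int fake_unmap_einval(int domid, int irq)
{
    (void)domid; (void)irq;
    errno = EINVAL;
    return -1;
}

int fake_unmap_eperm(int domid, int irq)
{
    (void)domid; (void)irq;
    errno = EPERM;
    return -1;
}
```

With this split, the case that is expected after QEMU has done its own unmap stays visible at debug level, while any other failure is still reported as an error.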

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 7f5f170e6eb0..f5a4b88eb2c0 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1980,75 +2052,19 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
              prs->xswait.timeout_ms = LIBXL_DEVICE_MODEL_START_TIMEOUT * 1000;
              prs->xswait.callback = pci_remove_qemu_trad_watch_state_cb;
              rc = libxl__xswait_start(gc, &prs->xswait);
-            if (rc) goto out_fail;
-            return;
+            if (!rc) return;

This is confusing, we usually check for the error condition in libxl, not
the success condition. So the code as currently written is better.

+            break;
          case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
              pci_remove_qmp_device_del(egc, prs); /* must be last */
              return;
          default:
              rc = ERROR_INVAL;
-            goto out_fail;
+            break;

You can keep the goto here, this is the usual way to deal with errors
(except a label named "out" would be more appropriate, but out_fail is
fine).

          }
      } else {
+        rc = 0;

You don't need to set rc in the else block; just set it after the if.

This was done this way because the code after could be called from an error path. So setting 'rc = 0' would be wrong.

Anyway, above, you said 'goto' is the way to deal with errors in libxl. So it would be possible to use 'rc = 0' below.
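
For reference, a minimal sketch of the shape being discussed: error conditions are tested and jump to a common label, so the code after the if/else is only reached on success and a single 'rc = 0' there is correct. Everything in it is hypothetical (remove_device(), start_wait(), enum dm_version, SKETCH_ERROR_INVAL and the cleaned_up flag); it is not the code from the patch.

```c
#include <stddef.h>

/* Arbitrary error value for this sketch, not taken from libxl. */
#define SKETCH_ERROR_INVAL (-6)

/* Hypothetical device model kinds. */
enum dm_version { DM_TRADITIONAL, DM_UPSTREAM, DM_UNKNOWN };

/* Hypothetical step that may fail; returns 0 on success. */
static int start_wait(int should_fail)
{
    return should_fail ? -1 : 0;
}

int remove_device(int is_hvm, enum dm_version version, int wait_fails,
                  int *cleaned_up)
{
    int rc;

    if (is_hvm) {
        switch (version) {
        case DM_TRADITIONAL:
            rc = start_wait(wait_fails);
            if (rc) goto out;   /* test the error condition, not success */
            break;
        case DM_UPSTREAM:
            break;
        default:
            rc = SKETCH_ERROR_INVAL;
            goto out;
        }
    }

    /* Every error path has already jumped to "out", so this line is
     * only reached on success and one assignment is enough. */
    rc = 0;

out:
    *cleaned_up = 1;            /* shared exit path for both outcomes */
    return rc;
}
```

Because the failures leave through the label, 'rc = 0' does not need to live inside the else branch.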

Cheers,

--
Julien Grall



 

