[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 4/6] xen/x86: Allow stubdom access to irq created for msi.



On Wed, Jul 17, 2019 at 03:00:42AM +0200, Marek Marczykowski-Górecki wrote:
> Stubdomains need to be given sufficient privilege over the guest which it
> provides emulation for in order for PCI passthrough to work correctly.
> When a HVM domain try to enable MSI, QEMU in stubdomain calls
> PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> part of xc_domain_update_msi_irq. Allow for that as part of
> PHYSDEVOP_map_pirq.
> 
> This is not needed for PCI INTx, because IRQ in that case is known
> beforehand and the stubdomain is given permissions over this IRQ by
> libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> 
> create_irq() already grant IRQ access to hardware_domain, with
> assumption the device model (something managing this IRQ) lives there.
> Modify create_irq() to take additional parameter pointing at device
> model domain - which may be dom0 or stubdomain.  Save ID of the domain
> given permission, to revoke it in destroy_irq() - easier and cleaner
> than replaying logic of create_irq() parameter. Use domid instead of
> actual reference to the domain, because it might get destroyed before
> destroying IRQ (stubdomain is destroyed before its target domain). And
> it is not an issue, because IRQ permissions live within domain
> structure, so destroying a domain also implicitly revoke the permission.
> Potential domid reuse is detected by by checking if that domain does
> have permission over the IRQ being destroyed.
> 
> Then, adjust all callers to provide the parameter. In case of calls not
> related to stubdomain-initiated allocations, give it either
> hardware_domain (so the behavior is unchanged there), or NULL for
> interrupts used by Xen internally.
> 
> Inspired by 
> https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch
>  by Eric Chanudet <chanudete@xxxxxxxxxxxx>.
> 
> Signed-off-by: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> ---
> Changes in v3:
>  - extend commit message
> Changes in v4:
>  - add missing destroy_irq on error path
> Changes in v5:
>  - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
>    basically make it a different patch
>  - add get_dm_domain() helper
>  - do not give hardware_domain permission over IRQs used in Xen
>    internally
>  - rename create_irq argument to just 'd', to avoid confusion
>    when it's called by hardware domain
>  - verify that device is de-assigned before pci_remove_device call
>  - save ID of domain given permission in create_irq(), to revoke it in
>  destroy_irq()
>  - drop domain parameter from destroy_irq() and msi_free_irq()
>  - do not give hardware domain permission over IRQ created in
>  iommu_set_interrupt()
> ---
>  xen/arch/x86/hpet.c                      |  3 +-
>  xen/arch/x86/irq.c                       | 51 ++++++++++++++++++-------
>  xen/common/irq.c                         |  1 +-
>  xen/drivers/char/ns16550.c               |  2 +-
>  xen/drivers/passthrough/amd/iommu_init.c |  2 +-
>  xen/drivers/passthrough/pci.c            |  3 +-
>  xen/drivers/passthrough/vtd/iommu.c      |  3 +-
>  xen/include/asm-x86/irq.h                |  2 +-
>  xen/include/xen/irq.h                    |  1 +-
>  9 files changed, 50 insertions(+), 18 deletions(-)
> 
> diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
> index 4b08488..b4854ff 100644
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -11,6 +11,7 @@
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/numa.h>
> +#include <xen/sched.h>
>  #include <asm/fixmap.h>
>  #include <asm/div64.h>
>  #include <asm/hpet.h>
> @@ -368,7 +369,7 @@ static int __init hpet_assign_irq(struct 
> hpet_event_channel *ch)
>  {
>      int irq;
>  
> -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )

Shouldn't this be NULL? I don't think the hardware domain should be
allowed to play with the HPET IRQs?

>          return irq;
>  
>      ch->msi.irq = irq;
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index 2cac28a..dc5d302 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -164,10 +164,21 @@ int __init bind_irq_vector(int irq, int vector, const 
> cpumask_t *cpu_mask)
>      return ret;
>  }
>  
> +static struct domain *get_dm_domain(struct domain *d)
                                       ^ const
> +{
> +    return current->domain->target == d ? current->domain : hardware_domain;
> +}
> +
>  /*
>   * Dynamic irq allocate and deallocation for MSI
>   */
> -int create_irq(nodeid_t node)
> +
> +/*
> + * create_irq - allocate irq for MSI
> + * @d domain that will get permission over the allocated irq; this permission
> + * will automatically be revoked on destroy_irq
> + */
> +int create_irq(nodeid_t node, struct domain *d)
>  {
>      int irq, ret;
>      struct irq_desc *desc;
> @@ -200,18 +211,24 @@ int create_irq(nodeid_t node)
>          desc->arch.used = IRQ_UNUSED;
>          irq = ret;
>      }

I would assert that desc->creator_domid == DOMID_INVALID here, since
in the failure case creator_domid is not overwritten.

> -    else if ( hardware_domain )
> +    else if ( d )
>      {
> -        ret = irq_permit_access(hardware_domain, irq);
> +        ASSERT(d == current->domain);
> +        ret = irq_permit_access(d, irq);
>          if ( ret )
>              printk(XENLOG_G_ERR
> -                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
> -                   irq, ret);
> +                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
> +                   d->domain_id, irq, ret);
> +        else
> +            desc->creator_domid = d->domain_id;
>      }
>  
>      return irq;
>  }
>  
> +/*
> + * destroy_irq - deallocate irq for MSI
> + */
>  void destroy_irq(unsigned int irq)
>  {
>      struct irq_desc *desc = irq_to_desc(irq);
> @@ -220,14 +237,22 @@ void destroy_irq(unsigned int irq)
>  
>      BUG_ON(!MSI_IRQ(irq));
>  
> -    if ( hardware_domain )
> +    if ( desc->creator_domid != DOMID_INVALID )
>      {
> -        int err = irq_deny_access(hardware_domain, irq);
> +        struct domain *d = get_domain_by_id(desc->creator_domid);
>  
> -        if ( err )
> -            printk(XENLOG_G_ERR
> -                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> -                   irq, err);
> +        if ( d && irq_access_permitted(d, irq) ) {
> +            int err;
> +
> +            err = irq_deny_access(d, irq);
> +            if ( err )
> +                printk(XENLOG_G_ERR
> +                       "Could not revoke Dom%u access to IRQ%u (error %d)\n",
> +                       d->domain_id, irq, err);
> +        }
> +
> +        if ( d )
> +            put_domain(d);

Don't you need to set creator_domid = DOMID_INVALID in destroy_irq at
some point?

Or else a failure in create_irq could leak the irq to it's previous
owner. Note that init_one_irq_desc would only init the fields the
first time the IRQ is used, but not for subsequent usages AFAICT.

>      }
>  
>      spin_lock_irqsave(&desc->lock, flags);
> @@ -2058,7 +2083,7 @@ int map_domain_pirq(
>              spin_unlock_irqrestore(&desc->lock, flags);
>  
>              info = NULL;
> -            irq = create_irq(NUMA_NO_NODE);
> +            irq = create_irq(NUMA_NO_NODE, get_dm_domain(d));

Isn't it fine to just use current->domain here directly?

It's always going to be the current domain the one that calls
map_domain_pirq in order to get a PIRQ mapped for it's target
domain I think.

>              ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, 
> &info)
>                             : irq;
>              if ( ret < 0 )
> @@ -2691,7 +2716,7 @@ int allocate_and_map_msi_pirq(struct domain *d, int 
> index, int *pirq_p,
>          if ( irq == -1 )
>          {
>      case MAP_PIRQ_TYPE_MULTI_MSI:
> -            irq = create_irq(NUMA_NO_NODE);
> +            irq = create_irq(NUMA_NO_NODE, get_dm_domain(d));
>          }
>  
>          if ( irq < nr_irqs_gsi || irq >= nr_irqs )
> diff --git a/xen/common/irq.c b/xen/common/irq.c
> index f42512d..42b27a9 100644
> --- a/xen/common/irq.c
> +++ b/xen/common/irq.c
> @@ -16,6 +16,7 @@ int init_one_irq_desc(struct irq_desc *desc)
>      spin_lock_init(&desc->lock);
>      cpumask_setall(desc->affinity);
>      INIT_LIST_HEAD(&desc->rl_link);
> +    desc->creator_domid = DOMID_INVALID;
>  
>      err = arch_init_one_irq_desc(desc);
>      if ( err )
> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> index 189e121..ccc8b04 100644
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -719,7 +719,7 @@ static void __init ns16550_init_irq(struct serial_port 
> *port)
>      struct ns16550 *uart = port->uart;
>  
>      if ( uart->msi )
> -        uart->irq = create_irq(0);
> +        uart->irq = create_irq(0, NULL);
>  #endif
>  }
>  
> diff --git a/xen/drivers/passthrough/amd/iommu_init.c 
> b/xen/drivers/passthrough/amd/iommu_init.c
> index 4e76b26..50785e0 100644
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -781,7 +781,7 @@ static bool_t __init set_iommu_interrupt_handler(struct 
> amd_iommu *iommu)
>      hw_irq_controller *handler;
>      u16 control;
>  
> -    irq = create_irq(NUMA_NO_NODE);
> +    irq = create_irq(NUMA_NO_NODE, NULL);
>      if ( irq <= 0 )
>      {
>          dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index e886894..507b3d1 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -845,6 +845,9 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
>          if ( pdev->bus == bus && pdev->devfn == devfn )
>          {
> +            ret = -EBUSY;
> +            if ( pdev->domain && pdev->domain != hardware_domain )
> +                break;

This seems like an unlrelated fix?

ie: preventing device removal while in use by a domain different than
dom0?

Note that you don't need the pdev->domain != NULL check, just doing
pdev->domain != hardware_domain seems enough, since you don't
dereference the pdev->domain pointer in the expression (unless I'm
missing other usages below).

>              ret = iommu_remove_device(pdev);
>              if ( pdev->domain )
>                  list_del(&pdev->domain_list);
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index 8b27d7e..79f9682 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct 
> acpi_drhd_unit *drhd)
>      struct irq_desc *desc;
>  
>      irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
> -                          : NUMA_NO_NODE);
> +                          : NUMA_NO_NODE,
> +                     NULL);
>      if ( irq <= 0 )
>      {
>          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
> diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
> index c0c6e7c..5b24428 100644
> --- a/xen/include/asm-x86/irq.h
> +++ b/xen/include/asm-x86/irq.h
> @@ -155,7 +155,7 @@ int  init_irq_data(void);
>  void clear_irq_vector(int irq);
>  
>  int irq_to_vector(int irq);
> -int create_irq(nodeid_t node);
> +int create_irq(nodeid_t node, struct domain *d);
>  void destroy_irq(unsigned int irq);
>  int assign_irq_vector(int irq, const cpumask_t *);
>  
> diff --git a/xen/include/xen/irq.h b/xen/include/xen/irq.h
> index 586b783..c7a6a83 100644
> --- a/xen/include/xen/irq.h
> +++ b/xen/include/xen/irq.h
> @@ -91,6 +91,7 @@ typedef struct irq_desc {
>      spinlock_t lock;
>      struct arch_irq_desc arch;
>      cpumask_var_t affinity;
> +    domid_t creator_domid; /* weak reference to domain handling this IRQ */

I feel like handling is too vague here, but I'm not a native speaker
so I'm not sure. I would maybe write:

... domain having permissions over this IRQ (which can be different
from the domain actually having the IRQ assigned) */

Which I think is less ambiguous.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.