
Re: [Xen-devel] [PATCH v3] interrupts: allow guest to set/clear MSI-X mask bit



On 14/08/13 17:30, Andrew Cooper wrote:
> On 14/08/13 17:18, Joby Poriyath wrote:
>> A guest needs the ability to enable and disable MSI-X interrupts
>> for a passed-through device by setting the MSI-X mask bit.
>> The guest is allowed to write the MSI-X mask bit only if Xen *thinks*
>> that the mask is clear (interrupts enabled). If the mask is set by
>> Xen (interrupts disabled), writes to the mask bit by the guest are
>> ignored.
>>
>> Currently, a write to MSI-X mask bit by the guest is silently
>> ignored.
>>
>> A likely scenario is an 82599 SR-IOV NIC passed through to a
>> guest. If you run the following from the guest:
>>
>>   ifconfig <ETH_DEV> down
>>   ifconfig <ETH_DEV> up
>>
>> the interrupts remain masked. On VF reset, the controller sets the
>> mask bit, but Xen is not aware that the mask is now set. The VF
>> driver then tries to re-enable interrupts by clearing the mask bit,
>> writing directly to the BAR3 region that contains the MSI-X table.
>>
>> From dom0, we can verify that
>> interrupts are being masked using 'xl debug-keys M'.
>>
>> Initially, the guest was allowed to modify the MSI-X mask bit.
>> This behaviour was later changed.
>> See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.
>>
>> Signed-off-by: Joby Poriyath <joby.poriyath@xxxxxxxxxx>
>> ---
>>  xen/arch/x86/hvm/vmsi.c |   47 +++++++++++++++++++++++++++++++++--------------
>>  1 file changed, 33 insertions(+), 14 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
>> index 36de312..21421cc 100644
>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -169,6 +169,7 @@ struct msixtbl_entry
>>          uint32_t msi_ad[3]; /* Shadow of address low, high and data */
>>      } gentries[MAX_MSIX_ACC_ENTRIES];
>>      struct rcu_head rcu;
>> +    struct pirq *pirq;
>>  };
>>  
>>  static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
>> @@ -254,6 +255,9 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>      void *virt;
>>      unsigned int nr_entry, index;
>>      int r = X86EMUL_UNHANDLEABLE;
>> +    unsigned long flags;
>> +    struct irq_desc *desc;
>> +    unsigned long orig;
> unsigned long flags, orig;
>
> To be more compact.
>
>>  
>>      if ( len != 4 || (address & 3) )
>>          return r;
>> @@ -283,22 +287,35 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>      if ( !virt )
>>          goto out;
>>  
>> -    /* Do not allow the mask bit to be changed. */
>> -#if 0 /* XXX
>> -       * As the mask bit is the only defined bit in the word, and as the
>> -       * host MSI-X code doesn't preserve the other bits anyway, doing
>> -       * this is pointless. So for now just discard the write (also
>> -       * saving us from having to determine the matching irq_desc).
>> -       */
>> -    spin_lock_irqsave(&desc->lock, flags);
>> +    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
>> +    if ( !desc )
>> +        goto out;
>> +
>> +    if ( !desc->msi_desc )
>> +        goto unlock;
>> +
>> +    /* Do not allow guest to modify MSIX control bit if it is masked 
>> +     * by Xen. We'll only handle the case where Xen thinks that
>> +     * bit is unmasked, but hardware has silently masked the bit
>> +     * (in case of SR-IOV VF reset, etc).
>> +     */
>> +    if ( desc->msi_desc->msi_attrib.masked )
>> +        goto unlock;
> If either Xen or the guest wants the MSI masked, then you must set
> the mask bit; otherwise you must clear it.
>
> The root cause of this whole issue is that Xen doesn't actually know
> what state the mask bit is in; it only knows its intention.
>
> Therefore, goto unlock is incorrect here.  By this point, we must write
> the bit one way or another.
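
To make the point above concrete, here is a minimal sketch of how the value to write could be derived. It is not code from the patch: the helper name vector_ctrl_to_write() and the VECTOR_CTRL_MASKBIT macro are hypothetical, and the macro assumes the mask bit is bit 0 of the vector control word (what Xen calls PCI_MSIX_VECTOR_BITMASK). The mask bit is set if either Xen or the guest wants the vector masked and cleared otherwise, and every other bit is taken from the value read back from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 of the MSI-X vector control word is the mask bit.
 * Redefined locally so the sketch is self-contained. */
#define VECTOR_CTRL_MASKBIT 0x1u

/*
 * Hypothetical helper, not part of the patch.
 *
 * orig        - vector control word as read back from the device
 * guest_val   - value the guest tried to write
 * host_masked - true if Xen itself wants the vector masked
 *
 * Returns the word to write to hardware: the mask bit reflects
 * (host_masked OR guest wants masked); reserved bits come from orig.
 */
static uint32_t vector_ctrl_to_write(uint32_t orig, uint32_t guest_val,
                                     bool host_masked)
{
    bool guest_masked = (guest_val & VECTOR_CTRL_MASKBIT) != 0;
    uint32_t val = orig & ~VECTOR_CTRL_MASKBIT; /* preserve reserved bits */

    if ( host_masked || guest_masked )
        val |= VECTOR_CTRL_MASKBIT;

    return val;
}
```

With this shape there is no early exit when Xen believes the vector is masked: the write always happens, so a mask bit that the hardware set behind Xen's back (for example on VF reset) is brought back in line with the intended state.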
>
>> +
>> +    /* The mask bit is the only defined bit in the word. But we 
>> +     * ought to preserve the reserved bits. Clearing the reserved 
>> +     * bits can result in undefined behaviour (see PCI Local Bus
>> +     * Specification revision 2.3).
>> +     */
>>      orig = readl(virt);
>> -    val &= ~PCI_MSIX_VECTOR_BITMASK;
>> -    val |= orig & PCI_MSIX_VECTOR_BITMASK;
>> +    val &= PCI_MSIX_VECTOR_BITMASK;
>> +    val |= ( orig & ~PCI_MSIX_VECTOR_BITMASK );
>>      writel(val, virt);
>> -    spin_unlock_irqrestore(&desc->lock, flags);
>> -#endif
>>  
>> +unlock:
>> +    spin_unlock_irqrestore(&desc->lock, flags);
>>      r = X86EMUL_OKAY;
>> +
>>  out:
>>      rcu_read_unlock(&msixtbl_rcu_lock);
>>      return r;
>> @@ -328,7 +345,8 @@ const struct hvm_mmio_handler msixtbl_mmio_handler = {
>>  static void add_msixtbl_entry(struct domain *d,
>>                                struct pci_dev *pdev,
>>                                uint64_t gtable,
>> -                              struct msixtbl_entry *entry)
>> +                              struct msixtbl_entry *entry,
>> +                              struct pirq *pirq)
> I would advocate const-correctness here, so "const struct pirq *pirq".

Sorry - please ignore this.  I was being an idiot.

The other points still stand.

~Andrew

>
> ~Andrew
>
>>  {
>>      u32 len;
>>  
>> @@ -342,6 +360,7 @@ static void add_msixtbl_entry(struct domain *d,
>>      entry->table_len = len;
>>      entry->pdev = pdev;
>>      entry->gtable = (unsigned long) gtable;
>> +    entry->pirq = pirq;
>>  
>>      list_add_rcu(&entry->list, &d->arch.hvm_domain.msixtbl_list);
>>  }
>> @@ -404,7 +423,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
>>  
>>      entry = new_entry;
>>      new_entry = NULL;
>> -    add_msixtbl_entry(d, pdev, gtable, entry);
>> +    add_msixtbl_entry(d, pdev, gtable, entry, pirq);
>>  
>>  found:
>>      atomic_inc(&entry->refcnt);
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel

