[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 5/8] VT-d: Introduce a new function update_irte_for_msi_common



>>> On 02.03.17 at 08:14, <chao.gao@xxxxxxxxx> wrote:
> On Thu, Mar 02, 2017 at 01:58:14AM -0700, Jan Beulich wrote:
>>>>> On 27.02.17 at 02:45, <chao.gao@xxxxxxxxx> wrote:
>>> +    if ( (!pi_desc && gvec) || (pi_desc && !gvec) )
>>
>>gvec == 0 alone is never a valid check: Either all vectors are valid,
>>or a whole range (e.g. 0x00...0x0f) is invalid. Furthermore I think
>>such checks are easier to read as either
> 
> How about only use pi_desc is NULL or not to decide the format of the IRTE?

That makes sense, I think.

>>> @@ -592,38 +693,33 @@ static int msi_msg_to_remap_entry(
>>>          return -EFAULT;
>>>      }
>>>  
>>> +    /* Get the IRTE's bind relationship with guest from the live IRTE. */
>>>      GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
>>>                       iremap_entries, iremap_entry);
>>> -
>>> -    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
>>> -
>>> -    /* Set interrupt remapping table entry */
>>> -    new_ire.remap.fpd = 0;
>>> -    new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
>>> -    new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
>>> -    new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
>>> -    /* Hardware require RH = 1 for LPR delivery mode */
>>> -    new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
>>> -    new_ire.remap.avail = 0;
>>> -    new_ire.remap.res_1 = 0;
>>> -    new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
>>> -                            MSI_DATA_VECTOR_MASK;
>>> -    new_ire.remap.res_2 = 0;
>>> -    if ( x2apic_enabled )
>>> -        new_ire.remap.dst = msg->dest32;
>>> +    if ( !iremap_entry->remap.im )
>>> +    {
>>> +        gvec = 0;
>>> +        pi_desc = NULL;
>>> +    }
>>>      else
>>> -        new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
>>> -                             & 0xff) << 8;
>>> +    {
>>> +        gvec = iremap_entry->post.vector;
>>> +        pi_desc = (void *)((((u64)iremap_entry->post.pda_h) << PDA_LOW_BIT 
>>> )
>>> +                           + iremap_entry->post.pda_l);
>>> +    }
>>> +    unmap_vtd_domain_page(iremap_entries);
>>
>>I don't follow: Why does it matter what the entry currently holds?
>>As I've pointed out more than once before (mainly to Feng), the
>>goal ought to be to produce the new entry solely based on what
>>the intended new state is, i.e. function input and global data.
>>
> 
> I think the function introduced by this patch is to produce the new
> entry solely based on input.

Which contradicts you reading the table entry.

> If someone wants to produce the new entry,
> it can call it directly. 
> 
> I want to explain why we read the entry.
> 
> msi_msg_to_remap_entry() can be called before a msi gets bound to a guest
> interrupt or after that. If we call the function without realizing the msi
> has been binded to a guest interrupt, the IRTE would be updated to a 
> remapped format breaking the binding (at least breaking the intention to use
> VT-d PI). I think this is a possible case in the current code. This patch 
> avoids
> this case and provides a new function to the callers who are intended to 
> replace
> a posted format IRTE with a remapped format IRTE. Reading this entry is to get
> the binding information and use it to update IRTE ( as comments in code, when
> the IRTE is in posted format, we can suppress the update since the content 
> of IRTE will not change for the binding information hasn't change. and Also 
> if the binding information changed, we should call pi_update_irte ).

Just to repeat what I've said a number of times before: Preferably
the new entry should be created from scratch, and all data needed
to do so should be passed in (or be derivable from passed in data,
e.g. by looking up global data structures _other than the IRT_ ).

> At this moment, we don't recognize any existing caller of
> msi_msg_to_remap_entry() needs to update a posted IRTE to a remapped IRTE.
> If the need emerges, we can expose the common function to the callers.

This restriction would go away with the function being fully self
contained, as re-explained above. The second best option (if
the preferred one is overly complicated to implement) is to make
the function actively refuse anything it can't currently handle. I
don't think that's the case yet with this version of the patch.

>>> @@ -996,31 +1046,11 @@ int pi_update_irte(const struct vcpu *v, const 
>>> struct pirq *pirq,
>>>      if ( !ir_ctrl )
>>>          return -ENODEV;
>>>  
>>> -    spin_lock_irq(&ir_ctrl->iremap_lock);
>>> -
>>> -    GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, remap_index, iremap_entries, 
>>> p);
>>> -
>>> -    old_ire = *p;
>>> -
>>> -    /* Setup/Update interrupt remapping table entry. */
>>> -    setup_posted_irte(&new_ire, &old_ire, pi_desc, gvec);
>>> -    ret = cmpxchg16b(p, &old_ire, &new_ire);
>>> -
>>> -    /*
>>> -     * In the above, we use cmpxchg16 to atomically update the 128-bit 
>>> IRTE,
>>> -     * and the hardware cannot update the IRTE behind us, so the return 
>>> value
>>> -     * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
>>> -     */
>>> -    ASSERT(ret == old_ire.val);
>>> -
>>> -    iommu_flush_cache_entry(p, sizeof(*p));
>>> -    iommu_flush_iec_index(iommu, 0, remap_index);
>>> -
>>> -    unmap_vtd_domain_page(iremap_entries);
>>> -
>>> -    spin_unlock_irq(&ir_ctrl->iremap_lock);
>>> -
>>> -    return 0;
>>> +    spin_lock_irqsave(&ir_ctrl->iremap_lock, flags);
>>> +    rc = update_irte_for_msi_common(iommu, pci_dev, msi_desc, NULL, 
>>> pi_desc,
>>> +                                    gvec);
>>> +    spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
>>> +    return rc;
>>
>>Considering the old code use spin_lock_irq() (and there's such left
>>also earlier in the function), why do you use the irqsave function
>>here?
> 
> Yes, it should be spin_lock_irq(). I saw it used in msi_msg_to_remap_entry() 
> so
> I followed it without digging deep into the difference. The spin_lock_irq()
> makes an assumption to the current context, so it's better.  

It's not necessarily better (or worse), but it's consistent with
neighboring code.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.