[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v8 4/7] VT-d: Use one function to update both remapped and posted IRTE
On Wed, Feb 22, 2017 at 02:10:25AM -0700, Jan Beulich wrote: >>>> On 22.02.17 at 02:53, <chao.gao@xxxxxxxxx> wrote: >> On Tue, Nov 22, 2016 at 05:58:56PM +0800, Jan Beulich wrote: >>>> @@ -637,7 +657,23 @@ static int msi_msg_to_remap_entry( >>>> remap_rte->address_hi = 0; >>>> remap_rte->data = index - i; >>>> >>>> - memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry)); >>>> + if ( !pi_desc ) >>>> + memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry)); >>>> + else >>>> + { >>>> + __uint128_t ret; >>>> + >>>> + old_ire = *iremap_entry; >>>> + ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire); >>>> + >>>> + /* >>>> + * In the above, we use cmpxchg16 to atomically update the >>>> 128-bit IRTE, >>>> + * and the hardware cannot update the IRTE behind us, so the >>>> return value >>>> + * of cmpxchg16 should be the same as old_ire. This ASSERT >>>> validate it. >>>> + */ >>>> + ASSERT(ret == old_ire.val); >>>> + } >>> >>>Could you remind me again please why posted format updates need >>>to use cmpxchg16b, but remapped format ones don't? (As a aside, >>>with the code structure as you have it you should move the old_irte >>>declaration here, or omit it altogether, as you could as well pass >>>*iremap_entry directly afaict.) >> >> Before feng left, I have asked him about this question. He told me that >> the PDA field of posted format IRTE comprises of two parts: >> Posted Descritor Address High[127:96] and Low [63:38]. If we want to update >> PDA field, do it atomically or disable-update-enable. He also said, it had >> been confirmed that cmpxchg16b was supported on all intel platform with VT-d >> PI. >> If we assume that updates to remapped format IRTE only is to update either >> 64 bit or high 64 bit (except initialition), two 64bit memory write >> operations >> is enough. > >Two 64-bit memory write operations? Where do you see them? I >only see memcpy(), which for the purposes of the code here is >supposed to be a black box. Ok. I made a mistake here. In ioapic case, before update IRTE, the according IOAPIC RTE is masked. So, using a memcpy() is safe. In msi case, there is no mask operation. I think only using a memcpy() is unsafe. Do you think so? > >>>> @@ -668,7 +704,8 @@ int msi_msg_write_remap_rte( >>>> >>>> drhd = pdev ? acpi_find_matched_drhd_unit(pdev) >>>> : hpet_to_drhd(msi_desc->hpet_id); >>>> - return drhd ? msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg) >>>> + return drhd ? msi_msg_to_remap_entry(drhd->iommu, pdev, >>>> + msi_desc, msg, NULL, 0) >>> >>>Is this unconditional passing of NULL here really correct? >> >> Since two parameters are added to this function, we should think about what >> the function does again. the last 2 parameters are optional. >> >> If they are not present, just means a physical device driver changes its msi >> message. So it notifies iommu to do some changes to IRTE accordingly (the >> driver doesn't >> know the format of the live IRTE). This is the case above. >> >> If they are present, it means the msi should be delivered to the vcpu with >> the >> vector num. To achieve that, the function replaces the old IRTE with a new >> posted format IRTE. > >I don't see how this answers my question. In fact it feels like you, >just like Feng, are making assumptions on the conditions under >which the function here is being called _at present_. You should, >however, make the function work correctly for all possible uses, >or add ASSERT()s to clearly expose issues with potential new, >future callers. Ok. Your suggestion is very good. Every caller tells the function to construct a centain format IRTE. Return to your question, I think it is defintely wrong to pass NULL unconditionally. We should pass NULL before the msi is binded to a guest interrupt and pass the destination vcpu and vector num after that. At this moment, I can't come up with a way to check whether the msi is binded to a guest interrupt and get the destination vcpu and vector num only through the struct pci_dev and struct msi_desc. Could you give me some suggestion on that or recommend a structure, you think, in which we can add some fields to record these information? Thanks, Chao > >>>> @@ -992,35 +986,11 @@ int pi_update_irte(const struct vcpu *v, const >>>> struct pirq *pirq, >>>> return -ENODEV; >>>> >>>> iommu = drhd->iommu; >>>> - ir_ctrl = iommu_ir_ctrl(iommu); >>>> - if ( !ir_ctrl ) >>>> + if ( !iommu_ir_ctrl(iommu) ) >>>> return -ENODEV; >>>> >>>> - spin_lock_irq(&ir_ctrl->iremap_lock); >>>> - >>>> - GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, remap_index, iremap_entries, >>>> p); >>>> - >>>> - old_ire = *p; >>>> - >>>> - /* Setup/Update interrupt remapping table entry. */ >>>> - setup_posted_irte(&new_ire, &old_ire, pi_desc, gvec); >>>> - ret = cmpxchg16b(p, &old_ire, &new_ire); >>>> - >>>> - /* >>>> - * In the above, we use cmpxchg16 to atomically update the 128-bit >>>> IRTE, >>>> - * and the hardware cannot update the IRTE behind us, so the return >>>> value >>>> - * of cmpxchg16 should be the same as old_ire. This ASSERT validate >>>> it. >>>> - */ >>>> - ASSERT(ret == old_ire.val); >>>> - >>>> - iommu_flush_cache_entry(p, sizeof(*p)); >>>> - iommu_flush_iec_index(iommu, 0, remap_index); >>>> - >>>> - unmap_vtd_domain_page(iremap_entries); >>>> - >>>> - spin_unlock_irq(&ir_ctrl->iremap_lock); >>>> - >>>> - return 0; >>>> + return msi_msg_to_remap_entry(iommu, pci_dev, msi_desc, >>>> &msi_desc->msg, >>>> + pi_desc, gvec); >>> >>>There are few changes here which appear to have the potential of >>>affecting behavior: Previously you didn't alter msi_desc or the MSI >>>message contained therein (as documented by the pointer having >>>been const). Is this possible updating of message and remap index >>>really benign? In any event any such changes should be reasoned >>>about in the commit message. >> >> I agree that we can't update message and remap index in this pi_update_irte. >> but msi_msg_to_remap_entry will change msi_desc when msi_desc->remap_index < >> 0. >> How about splitting part of msi_msg_to_remap_entry to a new function which >> consumes a const >> msi_desc parameter and pi_update_irte will call the new function? > >Well, I can't easily say yes or no here without seeing what the >result would be. Give it a try, and we'll look at the result in v9. > >Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |