
Re: [Xen-devel] [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings



>>> On 12.09.18 at 10:02, <Paul.Durrant@xxxxxxxxxx> wrote:
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: 12 September 2018 08:04
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@xxxxxxxxxx> wrote:
>> > +static int iommuop_map(struct xen_iommu_op_map *op)
>> > +{
>> > +    struct domain *d, *currd = current->domain;
>> > +    struct domain_iommu *iommu = dom_iommu(currd);
>> > +    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
>> > +    bfn_t bfn = _bfn(op->bfn);
>> > +    struct page_info *page;
>> > +    unsigned int prot;
>> > +    int rc, ignore;
>> > +
>> > +    if ( op->pad ||
>> > +         (op->flags & ~(XEN_IOMMUOP_map_all |
>> > +                        XEN_IOMMUOP_map_readonly)) )
>> > +        return -EINVAL;
>> > +
>> > +    if ( !iommu->iommu_op_ranges )
>> > +        return -EOPNOTSUPP;
>> > +
>> > +    /* Per-device mapping not yet supported */
>> > +    if ( !(op->flags & XEN_IOMMUOP_map_all) )
>> > +        return -EINVAL;
>> > +
>> > +    /* Check whether the specified BFN falls in a reserved region */
>> > +    if ( rangeset_contains_singleton(iommu->reserved_ranges, bfn_x(bfn)) )
>> > +        return -EINVAL;
>> > +
>> > +    d = rcu_lock_domain_by_any_id(op->domid);
>> > +    if ( !d )
>> > +        return -ESRCH;
>> > +
>> > +    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
>> > +    if ( rc )
>> > +        goto unlock;
>> > +
>> > +    rc = -EINVAL;
>> > +    if ( !readonly && !get_page_type(page, PGT_writable_page) )
>> > +    {
>> > +        put_page(page);
>> > +        goto unlock;
>> > +    }
>> > +
>> > +    prot = IOMMUF_readable;
>> > +    if ( !readonly )
>> > +        prot |= IOMMUF_writable;
>> > +
>> > +    rc = -EIO;
>> > +    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
>> > +        goto release;
>> > +
>> > +    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
>> > +    if ( rc )
>> > +        goto unmap;
>> 
>> When a mapping is requested for the same BFN that a prior mapping
>> was already established for, the page refs of that prior mapping get
>> leaked here. I don't think you want to require an intermediate unmap,
>> so checking the rangeset first is not an option. Hence I think you
>> need to look up the translation anyway, which may mean that the
>> rangeset's usefulness is quite limited (relevant with the additional
>> context of my question regarding it perhaps requiring a pretty much
>> unbounded amount of memory).
>> 
> 
> Yes, that's a good point. I could do a lookup to check whether the B[D]FN is 
> already there, though. Agreed that the memory is unbounded unless the number 
> of ranges is limited, for which there is already a facility. It is not ideal 
> though.
> 
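
FWIW, an (entirely untested) sketch of such a lookup-before-map, assuming
a helper along the lines of

    int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
                          unsigned int *flags);

(which doesn't exist in this form today), might be:

    /* Drop the refs held by a previous mapping of the same BFN. */
    if ( rangeset_contains_singleton(iommu->iommu_op_ranges, bfn_x(bfn)) )
    {
        mfn_t old_mfn;
        unsigned int old_flags;

        if ( !iommu_lookup_page(currd, bfn, &old_mfn, &old_flags) )
        {
            struct page_info *old_page = mfn_to_page(old_mfn);

            /* Type ref taken for a writable mapping ... */
            if ( old_flags & IOMMUF_writable )
                put_page_type(old_page);
            /* ... plus the general ref taken by get_paged_gfn(). */
            put_page(old_page);
        }
    }
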
>> In order to avoid shooting down all pre-existing RAM mappings - is
>> there no way the page table entries could be marked to identify
>> their origin?
>> 
> 
> I don't know whether that is possible; I'll have to find specs for Intel and 
> AMD IOMMUs and see if they have PTE bits available for such a use.

I seem to vaguely recall the AMD side lacking any suitable bits.
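
If an ignored/software-available bit did exist, the idea would amount to
something like the below (purely illustrative, reusing the hypothetical
iommu_lookup_page() from above plus a made-up IOMMUF_op_created flag the
driver would keep in such a bit), so that cleanup only touches entries
the hypercall itself created:

    if ( !iommu_lookup_page(currd, bfn, &mfn, &flags) &&
         (flags & IOMMUF_op_created) )
        /* Created via the hypercall: safe to shoot down. */
        iommu_unmap_page(currd, bfn);
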

>> I also have another more general concern: Allowing the guest to
>> manipulate its IOMMU page tables means that it can deliberately
>> shatter large pages, growing the overall memory footprint of the
>> domain. I'm hesitant to say this, but I'm afraid that resource
>> tracking of such "behind the scenes" allocations might be a
>> necessary prereq for the PV IOMMU work.
> 
> Remember that PV-IOMMU is only available for dom0 as it stands (and that is 
> the only use-case that XenServer currently has) so I think that, whilst the 
> concern is valid, there is no real danger in putting in the code without such 
> tracking. Such work can be deferred until PV-IOMMU is made available to 
> de-privileged guests... if that facility is needed.

Good point, but perhaps worth a prominent fixme note at a suitable
place in the code, like (IIRC) Roger has done in a number of places
for his vPCI work (which eventually we will want to extend to support
DomU too).
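
E.g. (wording purely illustrative) something along these lines at the
top of iommuop_map():

    /*
     * FIXME: hypercall-initiated mappings can shatter superpages and
     * hence grow the domain's memory footprint behind the scenes.
     * There is no resource tracking for such allocations yet; this
     * wants addressing before PV-IOMMU can be offered to
     * de-privileged DomUs.
     */
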

Jan

