[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.15 4/4] xen/iommu: x86: Don't leak the IOMMU page-tables



Hi Jan,

On 15/01/2021 11:24, Jan Beulich wrote:
On 14.01.2021 19:53, Julien Grall wrote:
On 23/12/2020 16:35, Jan Beulich wrote:
On 23.12.2020 17:19, Julien Grall wrote:
On 23/12/2020 16:15, Jan Beulich wrote:
On 23.12.2020 17:07, Julien Grall wrote:
I think I prefer the small race introduced (the pages will still be
freed) over this solution.

The "will still be freed" is because of the 2nd round of freeing
you add in an earlier patch? I'd prefer to avoid the race to in
turn avoid that 2nd round of freeing.

The "2nd round" of freeing is necessary at least for the domain creation
failure case (it would be the 1rst round).

If we can avoid IOMMU page-table allocation in the domain creation path,
then yes I agree that we want to avoid that "2nd round". Otherwise, I
think it is best to take advantage of it.

Well, I'm not really certain either way here. If it's really just
VMX'es APIC access page that's the problem here, custom cleanup
of this "fallout" from VMX code would certainly be an option.

  From my testing, it looks like the VMX APIC page is the only entry
added while the domain is created.

Furthermore I consider it wrong to insert this page in the IOMMU
page tables in the first place. This page overlaps with the MSI
special address range, and hence accesses will be dealt with by
interrupt remapping machinery (if enabled). If interrupt
remapping is disabled, this page should be no different for I/O
purposes than all other pages in this window - they shouldn't
be mapped at all.

That's a fair point. I will have a look to see if I can avoid the IOMMU
mapping for the VMX APIC page.

Perhaps, along the lines of epte_get_entry_emt(), ept_set_entry()
should gain an is_special_page() check to prevent propagation to
the IOMMU for such pages (we can't do anything about them in
shared-page-table setups)? See also the (PV related) comment in
cleanup_page_mappings(), a few lines ahead of a respective use of
is_special_page(),

There seems to be a mismatch between what the comment says and the code
adding the page in the IOMMU for PV domain.

AFAICT, the IOMMU mapping will be added via _get_page_type():

          /* Special pages should not be accessible from devices. */
          struct domain *d = page_get_owner(page);

          if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
          {
              mfn_t mfn = page_to_mfn(page);

              if ( (x & PGT_type_mask) == PGT_writable_page )
                  rc = iommu_legacy_unmap(d, _dfn(mfn_x(mfn)),
                                          1ul << PAGE_ORDER_4K);
              else
                  rc = iommu_legacy_map(d, _dfn(mfn_x(mfn)), mfn,
                                        1ul << PAGE_ORDER_4K,
                                        IOMMUF_readable | IOMMUF_writable);
          }

In this snippet, "special page" is interpreted as a page with no owner.
However is_special_page() will return true when PGC_extra is set and the
flag implies there is an owner (see assign_pages()).

So it looks like to me, any pages with PGC_extra would end up to have a
mapping in the IOMMU pages-tables if they are writable.

This statement also seems to apply for xenheap pages shared with a
domain (e.g grant-table).

Did I miss anything?

First let me point out that what you quote is not what I had
pointed you at - you refer to _get_page_type() when I suggested
to look at cleanup_page_mappings().
I know and that's why I pointed out the mismatch between the comments (in cleanup_page_mappings()) and the code adding the mapping in the IOMMU (_get_page_type()).

I only looked at _get_page_type() because I wanted to understand how the IOMMU mapping works for PV.

The important aspect for
special pages (here I mean ones having been subject to
share_xen_page_with_guest()) is that their type gets "pinned",
i.e. they'll never be subject to _get_page_type()'s
transitioning of types. As you likely will have noticed,
cleanup_page_mappings() also only gets called when the last
_general_ ref of a page got dropped, i.e. entirely unrelated
to type references.

Ah, that makes sense. I added some debugging in the code and couldn't really figure out why the transition wasn't happening.


If PGC_extra pages have a similar requirement, they may need
similar pinning of their types. (Maybe they do; didn't check.)

I have only looked at the VMX APIC page so far. It effectively pin the types.

Thanks for the explanation!

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.