[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.15 3/4] [RFC] xen/iommu: x86: Clear the root page-table before freeing the page-tables



Hi,

On 23/12/2020 16:11, Jan Beulich wrote:
On 23.12.2020 16:16, Julien Grall wrote:
On 23/12/2020 15:00, Jan Beulich wrote:
On 23.12.2020 15:56, Julien Grall wrote:
On 23/12/2020 14:12, Jan Beulich wrote:
On 22.12.2020 16:43, Julien Grall wrote:
This is an RFC because it would break AMD IOMMU driver. One option would
be to move the call to the teardown callback earlier on. Any opinions?

We already have

static void amd_iommu_domain_destroy(struct domain *d)
{
       dom_iommu(d)->arch.amd.root_table = NULL;
}

and this function is AMD's teardown handler. Hence I suppose
doing the same for VT-d would be quite reasonable. And really
VT-d's iommu_domain_teardown() also already has

       hd->arch.vtd.pgd_maddr = 0;

Let me have a look if that works.


I guess what's missing is prevention of the root table
getting re-setup.

This is taken care in the follow-up patch by forbidding page-table
allocation. I can mention it in the commit message.

My expectation is that with that subsequent change the change here
(or any variant of it) would become unnecessary.

I am not be sure. iommu_unmap() would still get called from put_page().
Are you suggesting to gate the code if d->is_dying as well?

Unmap shouldn't be allocating any memory right now, as in
non-shared-page-table mode we don't install any superpages
(unless I misremember).

It doesn't allocate memory, but it will try to access the IOMMU page-tables (see more below).


Even if this patch is deemed to be unecessary to fix the issue.
This issue was quite hard to chase/reproduce.

I think it would still be good to harden the code by zeroing
hd->arch.vtd.pgd_maddr to avoid anyone else wasting 2 days because the
pointer was still "valid".

But my point was that this zeroing already happens. What I
suspect is that it gets re-populated after it was zeroed,
because of page table manipulation that shouldn't be
occurring anymore for a dying domain.

AFAICT, the zeroing is happening in ->teardown() helper.

It is only called when the domain is fully destroyed (see call in arch_domain_destroy()). This will happen much after relinquishing the code.

Could you clarify why you think it is already zeroed and by who?

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.