[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.15 3/4] [RFC] xen/iommu: x86: Clear the root page-table before freeing the page-tables





On 23/12/2020 16:24, Jan Beulich wrote:
On 23.12.2020 17:16, Julien Grall wrote:
On 23/12/2020 16:11, Jan Beulich wrote:
On 23.12.2020 16:16, Julien Grall wrote:
On 23/12/2020 15:00, Jan Beulich wrote:
On 23.12.2020 15:56, Julien Grall wrote:
On 23/12/2020 14:12, Jan Beulich wrote:
On 22.12.2020 16:43, Julien Grall wrote:
This is an RFC because it would break AMD IOMMU driver. One option would
be to move the call to the teardown callback earlier on. Any opinions?

Please note this (in your original submission). I simply ...

We already have

static void amd_iommu_domain_destroy(struct domain *d)
{
        dom_iommu(d)->arch.amd.root_table = NULL;
}

and this function is AMD's teardown handler. Hence I suppose
doing the same for VT-d would be quite reasonable. And really
VT-d's iommu_domain_teardown() also already has

        hd->arch.vtd.pgd_maddr = 0;

Let me have a look if that works.


I guess what's missing is prevention of the root table
getting re-setup.

This is taken care in the follow-up patch by forbidding page-table
allocation. I can mention it in the commit message.

My expectation is that with that subsequent change the change here
(or any variant of it) would become unnecessary.

I am not be sure. iommu_unmap() would still get called from put_page().
Are you suggesting to gate the code if d->is_dying as well?

Unmap shouldn't be allocating any memory right now, as in
non-shared-page-table mode we don't install any superpages
(unless I misremember).

It doesn't allocate memory, but it will try to access the IOMMU
page-tables (see more below).


Even if this patch is deemed to be unecessary to fix the issue.
This issue was quite hard to chase/reproduce.

I think it would still be good to harden the code by zeroing
hd->arch.vtd.pgd_maddr to avoid anyone else wasting 2 days because the
pointer was still "valid".

But my point was that this zeroing already happens.
What I
suspect is that it gets re-populated after it was zeroed,
because of page table manipulation that shouldn't be
occurring anymore for a dying domain.

AFAICT, the zeroing is happening in ->teardown() helper.

It is only called when the domain is fully destroyed (see call in
arch_domain_destroy()). This will happen much after relinquishing the code.

Could you clarify why you think it is already zeroed and by who?

... trusted you on what you stated there. But perhaps I somehow
misunderstood that sentence to mean you want to put your addition
into the teardown functions, when apparently you meant to invoke
them earlier in the process. Without clearly identifying why this
would be a safe thing to do, I couldn't imagine that's what you
suggest as alternative.

This was a wording issue. I meant moving ->teardown() before (or calling from) iommu_free_pgtables().

Shall I introduce a new callback then?

This is because the interdependencies of
the IOMMU code are pretty hard to follow at times, and hence any
such re-ordering has a fair risk of breaking something elsewhere.

Right, this is another reason to try to get most of my fix self-contained rather relying on top-layer call to protect against a domain dying.

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.