[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [for-4.15][PATCH v2 3/5] xen/iommu: iommu_map: Don't crash the domain if it is dying



On 09.02.2021 22:14, Julien Grall wrote:
> On Tue, 9 Feb 2021 at 20:28, Paul Durrant <xadimgnik@xxxxxxxxx> wrote:
>>> From: Julien Grall <julien@xxxxxxx>
>>> Sent: 09 February 2021 15:28
>>>
>>> It is a bit pointless to crash a domain that is already dying. This will
>>> become more an annoyance with a follow-up change where page-table
>>> allocation will be forbidden when the domain is dying.
>>>
>>> Security wise, there is no change as the devices would still have access
>>> to the IOMMU page-tables even if the domain has crashed until Xen
>>> start to relinquish the resources.
>>>
>>> For x86, we rely on dom_iommu(d)->arch.mapping.lock to ensure
>>> d->is_dying is correctly observed (a follow-up patch will held it in the
>>> relinquish path).

Am I to understand this to mean that at this point of the series
things aren't really correct yet in this regard? If so, wouldn't
it be better to re-order?

>>> For Arm, there is still a small race possible. But there is so far no
>>> failure specific to a domain dying.
>>>
>>> Signed-off-by: Julien Grall <jgrall@xxxxxxxxxx>
>>>
>>> ---
>>>
>>> This was spotted when trying to destroy IOREQ servers while the domain
>>> is dying. The code will try to add the entry back in the P2M and
>>> therefore update the P2M (see arch_ioreq_server_disable() ->
>>> hvm_add_ioreq_gfn()).
>>>
>>> It should be possible to skip the mappin in hvm_add_ioreq_gfn(), however
>>> I didn't try a patch yet because checking d->is_dying can be racy (I
>>> can't find a proper lock).

I understand the concern. I find it odd though that we permit
iommu_map() to do anything at all when the domain is already
dying. So irrespective of the remark below, how about bailing
from iommu_map() earlier when the domain is dying?

>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -272,7 +272,7 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
>>>                              flush_flags) )
>>>                  continue;
>>>
>>> -        if ( !is_hardware_domain(d) )
>>> +        if ( !is_hardware_domain(d) && !d->is_dying )
>>>              domain_crash(d);
>>
>> Would it make more sense to check is_dying inside domain_crash() (and turn 
>> it into a no-op in that case)?
> 
> Jan also suggested moving the check in domain_crash(). However, I felt
> it is potentially a too risky change for 4.15 as there are quite a few
> callers.

This is a fair point. However, in such a case I'd prefer symmetry
at least throughout this one source file (there are three more
places), unless there are strong reasons against doing so.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.