
Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of quarantined devices optional



On 10.03.2020 17:05, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: 10 March 2020 15:44
>> To: paul@xxxxxxx
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Tian, Kevin' <kevin.tian@xxxxxxxxx>;
>> 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>
>> Subject: Re: [Xen-devel] [PATCH v3] IOMMU: make DMA containment of 
>> quarantined devices optional
>>
>> On 10.03.2020 16:13, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Sent: 10 March 2020 15:05
>>>> To: paul@xxxxxxx
>>>> Cc: 'Tian, Kevin' <kevin.tian@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx;
>>>> 'Andrew Cooper' <andrew.cooper3@xxxxxxxxxx>
>>>> Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined devices 
>>>> optional
>>>>
>>>> On 10.03.2020 13:30, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>> Sent: 10 March 2020 10:27
>>>>>> To: Tian, Kevin <kevin.tian@xxxxxxxxx>; Paul Durrant <paul@xxxxxxx>
>>>>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Andrew Cooper 
>>>>>> <andrew.cooper3@xxxxxxxxxx>
>>>>>> Subject: Re: [PATCH v3] IOMMU: make DMA containment of quarantined 
>>>>>> devices optional
>>>>>>
>>>>>> On 10.03.2020 04:43, Tian, Kevin wrote:
>>>>>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>>>>>> Sent: Monday, March 9, 2020 7:09 PM
>>>>>>>>
>>>>>>>> I'm happy to take better suggestions to replace the "full" command line
>>>>>>>> option and Kconfig prompt tokens. I don't think though that "fault" and
>>>>>>>> "write-fault" are really suitable there.
>>>>>>>
>>>>>>> I think we may just allow both r/w access to the scratch page for
>>>>>>> such a bogus device, which may make 'full' more reasonable since we
>>>>>>> now fully contain in-flight DMAs. I'm not sure about the value of
>>>>>>> keeping write-fault alone for such devices (just because one observed
>>>>>>> that his specific device only has a problem with read-fault).
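
As an illustration only, a minimal sketch of that "full" containment idea
in code terms (names invented here, not taken from the actual patch): the
quarantine leaf entry would grant both read and write permission on a
dedicated scratch page, so the device's stray DMA is absorbed instead of
faulting.

    /* Minimal sketch, not Xen's real code - all names here are invented. */
    #include <stdint.h>
    #include <stdbool.h>

    #define PTE_READ   (1ULL << 0)    /* leaf-entry read permission */
    #define PTE_WRITE  (1ULL << 1)    /* leaf-entry write permission */

    /* Build a quarantine leaf entry pointing at a 4k scratch page.  With
     * "full" containment, writes are absorbed as well instead of faulting. */
    static uint64_t make_quarantine_pte(uint64_t scratch_mfn, bool full)
    {
        uint64_t pte = (scratch_mfn << 12) | PTE_READ;

        if ( full )
            pte |= PTE_WRITE;

        return pte;
    }
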
>>>>>>
>>>>>> Well, a fundamental problem I have here is that I still don't know
>>>>>> the _exact_ conditions for the observed hangs. I consider it unlikely
>>>>>> for IOMMU read faults to cause hangs, but for write faults to be
>>>>>> "fine".
>>>>>
>>>>> AFAIK it's because the writes are posted and so any faults are just
>>>>> ignored, whereas a read fault being synchronous causes the device's
>>>>> state machine to lock up. It really is observed behaviour.
>>>>>
>>>>>> It would seem more likely to me that e.g. a non-present
>>>>>> context entry might cause issues. If that was the case, we wouldn't
>>>>>> need to handle reads and writes differently; we could instead install
>>>>>> an all zero top level page table. And we'd still get all faults that
>>>>>> are supposed to surface. But perhaps Paul did try this back then, and
>>>>>> it turned out to not be an option.
>>>>>>
>>>>>
>>>>> The only info I had was that faults on DMA reads had to be avoided
>>>>> completely. I did not have access to the h/w in question at the
>>>>> time. I may be able to get it now.
>>>>
>>>> I see. The implication then is, as Kevin said, that we mustn't run
>>>> guests with _any_ IOMMU PTEs having their "read" bits clear.
>>>> Anything that's "not present" now would need directing to a scratch
>>>> page. I then further wonder what effect reads to addresses beyond
>>>> AGAW would have. It may be impossible to arrange for sufficiently
>>>> secure pass-through with such a device, at which point - again as
>>>> said by Kevin - there may be little point in the scratch page
>>>> based quarantining.
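
To make the implication concrete, a minimal sketch under the same caveat
(helper names invented, table depth and AGAW handling omitted): every slot
of the quarantine leaf tables would need populating with the scratch-page
entry, so a device that cannot tolerate read faults never hits a
non-present entry.

    /* Sketch only - invented names; real table depth/AGAW handling omitted. */
    #include <stdint.h>

    #define PTES_PER_TABLE 512      /* 4k table of 8-byte entries */

    /* Leave no entry non-present: point every slot at the scratch page,
     * so a device that cannot tolerate read faults never triggers one. */
    static void fill_quarantine_table(uint64_t *table, uint64_t scratch_pte)
    {
        unsigned int i;

        for ( i = 0; i < PTES_PER_TABLE; i++ )
            table[i] = scratch_pte;
    }
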
>>>>
>>>
>>> Well, I can't say there's little point in it as it does fix a host lock-up.
>>>
>>> You say "as Kevin said, that we mustn't run guests with _any_ IOMMU
>>> PTEs having their "read" bits clear"... I can't find that. I did
>>> find where he said "In concept any IOMMU page table (dom0, dom_io
>>> or domU) for such bogus device should not include invalid entry",
>>> but that's a different thing.
>>
>> In which way?
> 
> In that the PTE would still be a valid entry? It would have read
> perm clear, yes, but that doesn't make the PTE invalid.

It was my understanding that Kevin meant "invalid" to represent
both "read" and "write" clear.

>>> However, is he really saying that things will break if any of the
>>> PTEs have their present bit clear?
>>
>> Well, you said that read faults are fatal (to the host). Reads will,
>> for any address with an unpopulated PTE, result in a fault and hence
>> by implication be fatal.
> 
> Oh I see. I thought there was an implication that the IOMMU could
> not cope with non-present PTEs in some way.

Well, that's what you were telling me. Or what I understood of what
you were saying.

> Agreed that, when the device is assigned to the guest, it can arrange
> (via ballooning) for a non-present entry to be hit by a read
> transaction, resulting in a lock-up.

Not just ballooning, but simply by programming a bogus address into
whatever controls the device's DMA operation.

> But dealing with a
> malicious guest was not the issue at hand... dealing with a buggy
> device that still tried to DMA after reset and whilst in quarantine
> was the problem.

Sure, but assigning such a buggy device to a guest is suspicious in
the first place. Granted, if you trust your guest (including it being
bug-free), then all you want is for the system to remain stable after
the guest has died.

>> (As an aside, other than in x86's CPU page tables, IOMMU page tables
>> in both their AMD and Intel incarnations don't have "present" bits -
>> they have "read" and "write" ones only.)
>>
> 
> Oh, ok. I must have misunderstood the purpose of the 'pr' bit in
> the AMD IOMMU PTE then.

Oh, no - I overlooked it, applying too much VT-d to that code. I'm
sorry.
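
For reference, the permission bits being discussed, in simplified form
(bit positions as given in the public VT-d and AMD IOMMU specifications;
the authoritative definitions live in the specs and in Xen's headers):

    /* Simplified illustration of the permission bits under discussion. */
    #include <stdint.h>

    /* Intel VT-d second-level PTE: separate read/write bits, no present
     * bit - an entry with both clear is unusable for any access. */
    #define VTD_PTE_R   (1ULL << 0)
    #define VTD_PTE_W   (1ULL << 1)

    /* AMD IOMMU PTE: does have a present (PR) bit in addition to the
     * IR/IW permission bits - the detail corrected above. */
    #define AMD_PTE_PR  (1ULL << 0)
    #define AMD_PTE_IR  (1ULL << 61)
    #define AMD_PTE_IW  (1ULL << 62)
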

Jan


 

