[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for-4.12] passthrough/vtd: Drop the "workaround_bios_bug" logic entirely


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 27 Mar 2019 14:38:57 +0000
  • Autocrypt: addr=andrew.cooper3@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFLhNn8BEADVhE+Hb8i0GV6mihnnr/uiQQdPF8kUoFzCOPXkf7jQ5sLYeJa0cQi6Penp VtiFYznTairnVsN5J+ujSTIb+OlMSJUWV4opS7WVNnxHbFTPYZVQ3erv7NKc2iVizCRZ2Kxn srM1oPXWRic8BIAdYOKOloF2300SL/bIpeD+x7h3w9B/qez7nOin5NzkxgFoaUeIal12pXSR Q354FKFoy6Vh96gc4VRqte3jw8mPuJQpfws+Pb+swvSf/i1q1+1I4jsRQQh2m6OTADHIqg2E ofTYAEh7R5HfPx0EXoEDMdRjOeKn8+vvkAwhviWXTHlG3R1QkbE5M/oywnZ83udJmi+lxjJ5 YhQ5IzomvJ16H0Bq+TLyVLO/VRksp1VR9HxCzItLNCS8PdpYYz5TC204ViycobYU65WMpzWe LFAGn8jSS25XIpqv0Y9k87dLbctKKA14Ifw2kq5OIVu2FuX+3i446JOa2vpCI9GcjCzi3oHV e00bzYiHMIl0FICrNJU0Kjho8pdo0m2uxkn6SYEpogAy9pnatUlO+erL4LqFUO7GXSdBRbw5 gNt25XTLdSFuZtMxkY3tq8MFss5QnjhehCVPEpE6y9ZjI4XB8ad1G4oBHVGK5LMsvg22PfMJ ISWFSHoF/B5+lHkCKWkFxZ0gZn33ju5n6/FOdEx4B8cMJt+cWwARAQABtClBbmRyZXcgQ29v cGVyIDxhbmRyZXcuY29vcGVyM0BjaXRyaXguY29tPokCOgQTAQgAJAIbAwULCQgHAwUVCgkI CwUWAgMBAAIeAQIXgAUCWKD95wIZAQAKCRBlw/kGpdefoHbdD/9AIoR3k6fKl+RFiFpyAhvO 59ttDFI7nIAnlYngev2XUR3acFElJATHSDO0ju+hqWqAb8kVijXLops0gOfqt3VPZq9cuHlh IMDquatGLzAadfFx2eQYIYT+FYuMoPZy/aTUazmJIDVxP7L383grjIkn+7tAv+qeDfE+txL4 SAm1UHNvmdfgL2/lcmL3xRh7sub3nJilM93RWX1Pe5LBSDXO45uzCGEdst6uSlzYR/MEr+5Z JQQ32JV64zwvf/aKaagSQSQMYNX9JFgfZ3TKWC1KJQbX5ssoX/5hNLqxMcZV3TN7kU8I3kjK mPec9+1nECOjjJSO/h4P0sBZyIUGfguwzhEeGf4sMCuSEM4xjCnwiBwftR17sr0spYcOpqET ZGcAmyYcNjy6CYadNCnfR40vhhWuCfNCBzWnUW0lFoo12wb0YnzoOLjvfD6OL3JjIUJNOmJy RCsJ5IA/Iz33RhSVRmROu+TztwuThClw63g7+hoyewv7BemKyuU6FTVhjjW+XUWmS/FzknSi dAG+insr0746cTPpSkGl3KAXeWDGJzve7/SBBfyznWCMGaf8E2P1oOdIZRxHgWj0zNr1+ooF /PzgLPiCI4OMUttTlEKChgbUTQ+5o0P080JojqfXwbPAyumbaYcQNiH1/xYbJdOFSiBv9rpt TQTBLzDKXok86LkCDQRS4TZ/ARAAkgqudHsp+hd82UVkvgnlqZjzz2vyrYfz7bkPtXaGb9H4 Rfo7mQsEQavEBdWWjbga6eMnDqtu+FC+qeTGYebToxEyp2lKDSoAsvt8w82tIlP/EbmRbDVn 7bhjBlfRcFjVYw8uVDPptT0TV47vpoCVkTwcyb6OltJrvg/QzV9f07DJswuda1JH3/qvYu0p vjPnYvCq4NsqY2XSdAJ02HrdYPFtNyPEntu1n1KK+gJrstjtw7KsZ4ygXYrsm/oCBiVW/OgU g/XIlGErkrxe4vQvJyVwg6YH653YTX5hLLUEL1NS4TCo47RP+wi6y+TnuAL36UtK/uFyEuPy wwrDVcC4cIFhYSfsO0BumEI65yu7a8aHbGfq2lW251UcoU48Z27ZUUZd2Dr6O/n8poQHbaTd 6bJJSjzGGHZVbRP9UQ3lkmkmc0+XCHmj5WhwNNYjgbbmML7y0fsJT5RgvefAIFfHBg7fTY/i kBEimoUsTEQz+N4hbKwo1hULfVxDJStE4sbPhjbsPCrlXf6W9CxSyQ0qmZ2bXsLQYRj2xqd1 bpA+1o1j2N4/au1R/uSiUFjewJdT/LX1EklKDcQwpk06Af/N7VZtSfEJeRV04unbsKVXWZAk uAJyDDKN99ziC0Wz5kcPyVD1HNf8bgaqGDzrv3TfYjwqayRFcMf7xJaL9xXedMcAEQEAAYkC HwQYAQgACQUCUuE2fwIbDAAKCRBlw/kGpdefoG4XEACD1Qf/er8EA7g23HMxYWd3FXHThrVQ HgiGdk5Yh632vjOm9L4sd/GCEACVQKjsu98e8o3ysitFlznEns5EAAXEbITrgKWXDDUWGYxd pnjj2u+GkVdsOAGk0kxczX6s+VRBhpbBI2PWnOsRJgU2n10PZ3mZD4Xu9kU2IXYmuW+e5KCA vTArRUdCrAtIa1k01sPipPPw6dfxx2e5asy21YOytzxuWFfJTGnVxZZSCyLUO83sh6OZhJkk b9rxL9wPmpN/t2IPaEKoAc0FTQZS36wAMOXkBh24PQ9gaLJvfPKpNzGD8XWR5HHF0NLIJhgg 4ZlEXQ2fVp3XrtocHqhu4UZR4koCijgB8sB7Tb0GCpwK+C4UePdFLfhKyRdSXuvY3AHJd4CP 4JzW0Bzq/WXY3XMOzUTYApGQpnUpdOmuQSfpV9MQO+/jo7r6yPbxT7CwRS5dcQPzUiuHLK9i nvjREdh84qycnx0/6dDroYhp0DFv4udxuAvt1h4wGwTPRQZerSm4xaYegEFusyhbZrI0U9tJ B8WrhBLXDiYlyJT6zOV2yZFuW47VrLsjYnHwn27hmxTC/7tvG3euCklmkn9Sl9IAKFu29RSo d5bD8kMSCYsTqtTfT6W4A3qHGvIDta3ptLYpIAOD2sY3GYq2nf3Bbzx81wZK14JdDDHUX2Rs 6+ahAA==
  • Cc: Juergen Gross <jgross@xxxxxxxx>, Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Lars Kurth <lars.kurth@xxxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Paul Durrant <paul.durrant@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 27 Mar 2019 14:56:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 26/03/2019 13:39, Jan Beulich wrote:
>>>> On 26.03.19 at 13:43, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 26/03/2019 09:08, Jan Beulich wrote:
>>>>>> Leave the warning which identifies the problematic devices, but drop the
>>>>>> remaining logic.  This leaves the system in better overall state, and 
>>>>>> working
>>>>>> in the same way that it did in previous releases.
>>>>> I wonder whether you've taken the time to look at the description
>>>>> of the commit first introducing this logic (a8059ffced "VT-d: improve
>>>>> RMRR validity checking"). I find it worrying in particular to
>>>>> effectively revert a change which claims 'to avoid any security
>>>>> vulnerability with malicious s/s re-enabling "supposed disabled"
>>>>> devices' without any discussion of why that may have been a
>>>>> wrong perspective to take.
>>>> I had, and as a maintainer, I'd reject a patch like that were it
>>>> presented today.
>>> Understood. But whether you'd accept it with a better description
>>> is unknown, I assume.
>> I severely doubt I'd accept it at all, because it is entirely
>> unreasonable behaviour.
>>
>> At best, it is the equivalent of throwing your hands up in the air and
>> saying "I give up", and that is not good enough behaviour for Xen.
>>
>>>> There is a nebulous claim of security, but it is exactly that -
>>>> nebulous.  There isn't enough information to work out what the concern
>>>> was, and even if the concern was valid, disabling VT-d across the system
>>>> isn't an appropriate action to take.
>>> This heavily depends on the position the system's admin takes:
>>> Enabling VT-d in an incomplete fashion may as well be considered
>>> worse than not enabling it at all.
>> No - that's simply not true, or a reasonable position to take. 
> As is every way of thinking differently than you do?

No, but I do expect common sense to be used in the judgement of what is
appropriate and/or reasonable end user behaviour.

> I'm sorry to
> be putting it this way, but you continue to make claims about
> how people ought to think without giving any reason why that's
> the only valid way. I can't see anything wrong with someone
> putting themselves on the position that if they see an enabled
> IOMMU, they assume that pass-through is as safe as it can
> (currently) be.

Once again, we get back to a un-justified (and disputed) claim of
"security".

*What* is unsafe about having non-active devices behind an IOMMU?

How does this scenario differ from one of PCI hotplug where the device
really doesn't exist at boot time, and comes into existence later?

> Just to then be caught by surprise that there is
> a device not actually handled by any IOMMU? After all a non-
> existent device listed in a table may as well be a hint that it's
> just its SBDF which the firmware got wrong.

and where does Xen currently check this?  (this is a rhetorical question
- Xen cannot check this.)

There is absolutely nothing *at all* which guarantees that just because
a number of devices are identified to be behind specific IOMMUs, that
DMA wont start appearing from elsewhere in the system.

Security of the system when it comes to IOMMUs *is and always will be* a
mutually cooperative and trusting relationship between Xen and the firmware.

The notion of "I'm safe because there were no inconsistencies in a piece
of information I have to trust fully" is security theatre, not security.

>
>> Disabling the IOMMU prevents the system from booting with a PVH dom0.
> But doing what you did is not the only way of getting around this.
> Defaulting to workaround_bios_bug=1 in the PVH case would be
> another, as would be a mode in which the IOMMU exists for Dom0's
> purposes only (i.e. still disallowing any pass-through to DomU-s).

A discussion along these lines might be appropriate in the middle of a
dev cycle, and might even be valid for a discussion of future improvements.

It is not appropriate for resolving an issue identified as a 4.12
blocker by the RM, on a timescale which needs to fit into the 4.12
release plans.

>> I am not aware of a credible case where partially enabled VT-d is less
>> secure than no VT-d, and there is one headline case now where disabled
>> VT-d causes a failure to boot.
>>
>>> Furthermore, as much as the security related claim there is
>>> nebulous, your description - I'm sorry to say that - isn't much
>>> better, as you don't clarify why there's _no_ security aspect
>>> there. Stating that "this leaves the system in better overall
>>> state" without making clear why that is _for everyone_ is not
>>> helpful at all.
>> The nebulous security claim is not relevant to this patch.
>>
>> This code was not run previously.  An unexpected consequence of a change
>> in 4.12 caused it to run, and break booting on some (sadly rather
>> common) systems.
>>
>> This is a regression in 4.12 and needs resolving.  The choice is between
>> reverting dcf41790 or removing this code, and reverting dcf41790 is
>> obviously not a valid thing to do.
> As explained before, there was an earlier regression, which - if it
> had been noticed in time - would have made all versions from 4.2
> to 4.11 behave like 4.12 without your change.

And if you'd not broken the behaviour back in 4.2, this class of system
would have been discovered the first time someone tried booting Xen on
it, not at the point someone is trying to upgrade 4.11 to 4.12.

I also wouldn't be pushing this specific fix in this specific
situation.  (In reality, I'd have already pushed this fix when the bug
was originally reported, because its unreasonable behaviour no matter
the situation which calls for it to be modified.)

> This behavior was
> intended by the original author. Ripping the code out by convincing
> people to bypass normal review flow is, well, not very nice to put it
> mildly.

That again was explicitly called out, with an explanation of why I was
deviating from usual process, including an agreement from The Rest that
the exceptional circumstances were warranted.

>
>> Beyond that, I really don't care what the exact behaviour of 4.11 was. 
>> If there is a real security issue then it still needs fixing on all
>> versions of Xen, and this change doesn't alter that property.
>>
>> However, until someone can work out what the alleged issue is, we can't
>> really progress this argument, and we mustn't keep broken code simply
>> because it purports to "fix" an unspecified issue.
> You seem to forget that your change is to deal with one form of
> broken firmware. It is simply impossible to enumerate all ways in
> which firmware _might_ be broken. The original code allegedly
> tried to deal with some other form of firmware flaw.

There is no such thing as a non-buggy firmware, just like there is no
such thing as a bug-free Xen.

What matters is that Xen copes in a way which is not detrimental to a
user being able to reconfigure their system.

>
> Just like in the EFI case, where there's so much breakage, I do
> think that default behavior of software ought to be to assume
> sane firmware behavior, allowing for workarounds where
> needed. Unless positively identified to be needed on a system,
> and unless needed virually everywhere, such workarounds
> should not be enabled by default. That is, in the given case a
> DMI quirk could have been added enabling workaround_bios_bug
> by default for the R740.

I am not going to start this argument again.

Disagreements with all of these points have been raised by multiple
parties on xen-devel, and I find myself in a position where I don't have
anything civil to add.

>
>>>> I'm not sure what more you are looking for, but this is very clear cut
>>>> and safe from my point of view.
>>> Well, your claim regarding "4.11 and earlier" is clearly wrong
>> I have made a statement, backed up with specific reference to the code
>> which, to the best of my ability, demonstrates it to be true.
>>
>> If you believe contrary then clearly identify the fault in my reasoning.
> I did, by pointing out the earlier regression, which you elected to
> ignore altogether in your reply.

You identified why Xen 4.11 didn't behave in the way you expected.

In doing so, you also demonstrated why the commit message was, in fact,
correct.


Like other parts of this thread, it was deviating sufficiently
off-topic/relevance that I chose to trim it.  I will continue doing so
in an effort to reduce the amount of my time that this thread is
wasting.  I don't know for certain, but I expect you've also got better
things to do with your time than arguing over off-topic aspects of this
thread.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.