[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17 5/6] pci: do not disable memory decoding for devices


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 24 Oct 2022 17:56:25 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=KxL015ye25JaB+u+T4WKhG34ZmwUdObg3fYJa5grYxI=; b=VxW2/hYPHVtzyKkpY7fZ0+wCNEtcq5EXQPDzKENdRKaFKdek+WNwxNuPizMMCK4Ye6iKLPbUKHMnWqR2ghyRL/4CSCl6W2Og+QQZZBu7PJaFl/xwytCwKICcPUGChMau/TCwToohWRlOn4c79fFRYCF80bt867kUvCLSpCYPlBy63cnMsHvYFk/m+w86MVgqJHn9UrkJKXyuAqnmRQj53n7H4nVw7oMn/A7DrwV9fRZXKrpq5QCZpJDz+25G4Hcl62UufcUMWfD8SUA7M+vWYWhw+FCBsbX8Ic3pZH7o9sfqyrf1hbfNajuZgSF491EOuY6rH/n6uP/BhpJEPgBAuA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JVHdPBrUIYPqGe1pZ20jPUymUfe+9VNlCNEeavyb2JcLCQGzwLjET33fPO+Uc8HqslTThvmfN/FDgUAOJsYZ+IYrSY5mPET/bqTtFBPqMXFlUi1xGM3gHOQ4yXseXAY81UAkkz5tTk2xrnNijJU4UlHUgXyZpiIytniDWzjhpMDsFsn6a05mnnsVbvTJ6fNXbBEkoMwbKKJ5hQULZV1OVEgSHv6Ka9CT1jvqPKEqS2Lxe/djcKJu+l1KwkOBmH1trhV5AEImdr8728pKateAV9AhPYYuqjkX2oBbZ/Bih2yW12BGPfk5b6UE8g/AK+oCnTqFyrDCQYsqveKstAJ3Ug==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Mon, 24 Oct 2022 15:56:39 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 24.10.2022 17:45, Roger Pau Monné wrote:
> On Mon, Oct 24, 2022 at 03:59:18PM +0200, Jan Beulich wrote:
>> On 24.10.2022 14:45, Roger Pau Monné wrote:
>>> On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote:
>>>> On 20.10.2022 11:46, Roger Pau Monne wrote:
>>>>> Commit 75cc460a1b added checks to ensure the position of the BARs from
>>>>> PCI devices don't overlap with regions defined on the memory map.
>>>>> When there's a collision memory decoding is left disabled for the
>>>>> device, assuming that dom0 will reposition the BAR if necessary and
>>>>> enable memory decoding.
>>>>>
>>>>> While this would be the case for devices being used by dom0, devices
>>>>> being used by the firmware itself that have no driver would usually be
>>>>> left with memory decoding disabled by dom0 if that's the state dom0
>>>>> found them in, and thus firmware trying to make use of them will not
>>>>> function correctly.
>>>>>
>>>>> The initial intent of 75cc460a1b was to prevent vPCI from creating
>>>>> MMIO mappings on the dom0 p2m over regions that would otherwise
>>>>> already have mappings established.  It's my view now that we likely
>>>>> went too far with 75cc460a1b, and Xen disabling memory decoding of
>>>>> devices (as buggy as they might be) is harmful, and reduces the set of
>>>>> hardware on which Xen works.
>>>>>
>>>>> This commits reverts most of 75cc460a1b, and instead adds checks to
>>>>> vPCI in order to prevent misplaced BARs from being added to the
>>>>> hardware domain p2m.
>>>>
>>>> Which makes me wonder: How do things work then? Dom0 then still can't
>>>> access the BAR address range, can it?
>>>
>>> It does allow access on some situations where the previous arrangement
>>> didn't work because it wholesale disabled memory decoding for the
>>> device.
>>>
>>> So if it's only one BAR that's misplaced the rest will still get added
>>> to the dom0 p2m and be accessible, because memory decoding won't be
>>> turned off for the device.
>>
>> Right - without a per-BAR disable there can only be all or nothing. In
>> the end if things work with this adjustment, the problem BAR cannot
>> really be in use aiui. I wonder what you would propose we do if on
>> another system such a BAR is actually in use.
> 
> dom0 would have to change the position of the BAR to a suitable place
> and then use it.  Linux dom0 does already reposition bogus BARs of
> devices.

Yet that still can't realistically work if the firmware expects the
BAR at the address recorded in the EFI memory map entry.

>>>> Plus with this adjustment, is
>>>> ...
>>>>
>>>>>  Signaling on whether BARs are mapped is tracked
>>>>> in the vpci structure, so that misplaced BARs are not mapped, and thus
>>>>> Xen won't attempt to unmap them when memory decoding is disabled.
>>>>>
>>>>> This restores the behavior of Xen for PV dom0 to the state it was
>>>>> previous to 75cc460a1b, while also introducing a more contained fix
>>>>> for the vPCI BAR mapping issues.
>>>>
>>>> ... this (in particular "restores the behavior") a valid description
>>>> of this change?
>>>
>>> Yes, it restores the previous behavior for PV dom0, as memory decoding
>>> is no longer turned off for any devices regardless of where the BARs
>>> are positioned.
>>
>> It restores one aspect of behavior but then puts in place another
>> restriction.
> 
> I assume the other restriction is about moving the check to vPCI code
> rather than disabling memory decoding?
> 
> It restores the behavior for PV dom0, and keeps a more 'contained' fix
> for PVH dom0.

Ouch, I'm sorry: I didn't pay attention to the "restore" applying to PV
(the similarity of the acronyms made me read "PVH" despite it being "PV").

>>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably 
>>>>> positioned')
>>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>> ---
>>>>> AT Citrix we have a system with a device with the following BARs:
>>>>>
>>>>> BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region
>>>>> BAR [0, 0x1fff] -> not positioned, outside host bridge window
>>>>>
>>>>> And memory decoding enabled by the firmware.  With the current code
>>>>> (or any of the previous fix proposals), Xen would still disable memory
>>>>> decoding for the device, and the system will freeze when attempting to
>>>>> set EFI vars.
>>>>
>>>> Isn't the latter (BAR at address 0) yet another problem?
>>>
>>> It's a BAR that hasn't been positioned by the firmware AFAICT.  Which
>>> is a bug in the firmware but shouldn't prevent Xen from booting.
>>>
>>> In the above system address 0 is outside of the PCI host bridge
>>> window, so even if we mapped the BAR and memory decoding for the
>>> device was enabled accessing such BAR wouldn't work.
>>
>> It's mere luck I would say that in this case the BAR is outside the
>> bridge's window. What if this was a device integrated in the root
>> complex?
> 
> I would expect dom0 to reposition the BAR, but doesn't a root complex
> also have a set of windows in decodes accesses from (as listed in ACPI
> _CRS method for the device), and hence still need BARs to be
> positioned at certain ranges in order to be accessible?

Possibly; I guess I haven't learned enough of how this works at the
root complex. Yet still an unassigned BAR might end up overlapping a
valid window.

>>>> I have to admit
>>>> that I'm uncertain in how far it is a good idea to try to make Xen look
>>>> to work on such a system ...
>>>
>>> PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I
>>> would consider 75cc460a1b to be a regression for PV dom0 setups.
>>
>> Agreed, in a way it is a regression. In another way it is deliberate
>> behavior to not accept bogus configurations. The difficulty is to
>> find a reasonable balance between allowing Xen to work in such cases
>> and guarding Xen from suffering follow-on issues resulting from such
>> misconfiguration. After all if this system later was impacted by the
>> bad BAR(s), connecting the misbehavior to the root cause might end
>> up quite a bit more difficult.
> 
> IMO we should strive to boot (almost?) everywhere Linux (or your
> chosen dom0 OS) also boots, since that's what users expect.
> 
> I would assume if the system was impacted by the bad BARs, it would
> also affect the dom0 OS when booting natively on such platform.
> 
> What we do right now with memory decoding already leads to a very
> difficult to diagnose issue, as on the above example calling an UEFI
> runtime method completely freezes the box (no debug keys, no watchdog
> worked).
> 
> So I think leaving the system PCI devices as-is and letting dom0 deal
> with the conflicts is likely a better option than playing with the
> memory decoding bits.

Maybe. None of the workarounds really feel very good.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.