
Re: [PATCH for-4.17 5/6] pci: do not disable memory decoding for devices


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 24 Oct 2022 18:24:15 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Mon, 24 Oct 2022 16:24:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Oct 24, 2022 at 05:56:25PM +0200, Jan Beulich wrote:
> On 24.10.2022 17:45, Roger Pau Monné wrote:
> > On Mon, Oct 24, 2022 at 03:59:18PM +0200, Jan Beulich wrote:
> >> On 24.10.2022 14:45, Roger Pau Monné wrote:
> >>> On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote:
> >>>> On 20.10.2022 11:46, Roger Pau Monne wrote:
> >>>>> Commit 75cc460a1b added checks to ensure the positions of the BARs of
> >>>>> PCI devices don't overlap with regions defined in the memory map.
> >>>>> When there's a collision, memory decoding is left disabled for the
> >>>>> device, assuming that dom0 will reposition the BAR if necessary and
> >>>>> enable memory decoding.
> >>>>>
> >>>>> While this would be the case for devices being used by dom0, devices
> >>>>> being used by the firmware itself that have no driver would usually be
> >>>>> left with memory decoding disabled by dom0 if that's the state dom0
> >>>>> found them in, and thus firmware trying to make use of them will not
> >>>>> function correctly.
> >>>>>
> >>>>> The initial intent of 75cc460a1b was to prevent vPCI from creating
> >>>>> MMIO mappings on the dom0 p2m over regions that would otherwise
> >>>>> already have mappings established.  It's my view now that we likely
> >>>>> went too far with 75cc460a1b, and Xen disabling memory decoding of
> >>>>> devices (as buggy as they might be) is harmful, and reduces the set of
> >>>>> hardware on which Xen works.
> >>>>>
> >>>>> This commit reverts most of 75cc460a1b, and instead adds checks to
> >>>>> vPCI in order to prevent misplaced BARs from being added to the
> >>>>> hardware domain p2m.
> >>>>
> >>>> Which makes me wonder: How do things work then? Dom0 then still can't
> >>>> access the BAR address range, can it?
> >>>
> >>> It does allow access in some situations where the previous arrangement
> >>> didn't work because it wholesale disabled memory decoding for the
> >>> device.
> >>>
> >>> So if it's only one BAR that's misplaced, the rest will still get added
> >>> to the dom0 p2m and be accessible, because memory decoding won't be
> >>> turned off for the device.
> >>
> >> Right - without a per-BAR disable there can only be all or nothing. In
> >> the end if things work with this adjustment, the problem BAR cannot
> >> really be in use aiui. I wonder what you would propose we do if on
> >> another system such a BAR is actually in use.
> > 
> > dom0 would have to change the position of the BAR to a suitable place
> > and then use it.  Linux dom0 does already reposition bogus BARs of
> > devices.
> 
> Yet that still can't realistically work if the firmware expects the
> BAR at the address recorded in the EFI memory map entry.

I was thinking about the BAR at address 0, rather than the BAR in the
EfiMemoryMappedIO region.

The dom0 OS would need to avoid moving it, but that constraint would also
apply when running natively on the platform.  The behavior when running as
dom0 won't change versus the native behavior, which is what we should aim
for.
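
To illustrate the kind of conflict check being discussed here (a minimal
sketch with made-up structures and a hypothetical helper name, not the
actual Xen code), the decision is essentially a range-overlap test between
each BAR and the regions of the host memory map:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a reserved region in the host memory map. */
struct mem_region {
    uint64_t start, end;    /* inclusive physical addresses */
};

/*
 * Return true if the BAR [bar_start, bar_end] overlaps any of the given
 * regions; such a BAR would be considered misplaced for mapping purposes.
 */
static bool bar_overlaps_memory_map(uint64_t bar_start, uint64_t bar_end,
                                    const struct mem_region *map,
                                    unsigned int nr_regions)
{
    for ( unsigned int i = 0; i < nr_regions; i++ )
        if ( bar_start <= map[i].end && bar_end >= map[i].start )
            return true;

    return false;
}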

> >>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned')
> >>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >>>>> ---
> >>>>> At Citrix we have a system with a device that has the following BARs:
> >>>>>
> >>>>> BAR [0xfe010, 0xfe010] -> in an EfiMemoryMappedIO region
> >>>>> BAR [0, 0x1fff] -> not positioned, outside host bridge window
> >>>>>
> >>>>> And memory decoding enabled by the firmware.  With the current code
> >>>>> (or any of the previous fix proposals), Xen would still disable memory
> >>>>> decoding for the device, and the system would freeze when attempting to
> >>>>> set EFI vars.
> >>>>
> >>>> Isn't the latter (BAR at address 0) yet another problem?
> >>>
> >>> It's a BAR that hasn't been positioned by the firmware AFAICT.  Which
> >>> is a bug in the firmware but shouldn't prevent Xen from booting.
> >>>
> >>> In the above system, address 0 is outside of the PCI host bridge
> >>> window, so even if we mapped the BAR and memory decoding was
> >>> enabled for the device, accessing such a BAR wouldn't work.
> >>
> >> It's mere luck I would say that in this case the BAR is outside the
> >> bridge's window. What if this was a device integrated in the root
> >> complex?
> > 
> > I would expect dom0 to reposition the BAR, but doesn't a root complex
> > also have a set of windows it decodes accesses from (as listed in the
> > ACPI _CRS method for the device), and hence still need BARs to be
> > positioned at certain ranges in order to be accessible?
> 
> Possibly; I guess I haven't learned enough of how this works at the
> root complex. Yet still an unassigned BAR might end up overlapping a
> valid window.

Right, but if the BAR overlaps a valid window it could be seen as
correctly positioned, and in any case that would be for dom0 to deal
with.

What we care about is BARs not overlapping regions on the memory map,
so that we can set up a valid p2m for dom0.
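
As a rough illustration of that per-BAR approach (hypothetical types and
helper names, not the actual vPCI code, reusing the
bar_overlaps_memory_map() helper sketched earlier in this mail): each BAR
gets mapped into the hardware domain p2m only when it doesn't conflict
with the memory map, and the device's memory decoding bit is left alone:

/* Hypothetical, simplified view of a device's BARs. */
struct fake_bar {
    uint64_t addr, size;
};

struct fake_dev {
    unsigned int nr_bars;
    struct fake_bar bars[6];
};

/* Placeholder for adding an MMIO range to the hardware domain p2m. */
void map_mmio_to_dom0_p2m(uint64_t addr, uint64_t size);

static void map_device_bars(const struct fake_dev *dev,
                            const struct mem_region *map,
                            unsigned int nr_regions)
{
    for ( unsigned int i = 0; i < dev->nr_bars; i++ )
    {
        const struct fake_bar *bar = &dev->bars[i];

        if ( !bar->size ||
             bar_overlaps_memory_map(bar->addr, bar->addr + bar->size - 1,
                                     map, nr_regions) )
            /* Skip only this BAR; dom0 can reposition it later. */
            continue;

        /* The remaining BARs stay accessible; decoding is never touched. */
        map_mmio_to_dom0_p2m(bar->addr, bar->size);
    }
}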

> >>>> I have to admit
> >>>> that I'm uncertain to what extent it is a good idea to try to make Xen
> >>>> appear to work on such a system ...
> >>>
> >>> PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I
> >>> would consider 75cc460a1b to be a regression for PV dom0 setups.
> >>
> >> Agreed, in a way it is a regression. In another way it is deliberate
> >> behavior to not accept bogus configurations. The difficulty is to
> >> find a reasonable balance between allowing Xen to work in such cases
> >> and guarding Xen from suffering follow-on issues resulting from such
> >> misconfiguration. After all if this system later was impacted by the
> >> bad BAR(s), connecting the misbehavior to the root cause might end
> >> up quite a bit more difficult.
> > 
> > IMO we should strive to boot (almost?) everywhere Linux (or your
> > chosen dom0 OS) also boots, since that's what users expect.
> > 
> > I would assume if the system was impacted by the bad BARs, it would
> > also affect the dom0 OS when booting natively on such platform.
> > 
> > What we do right now with memory decoding already leads to a very
> > difficult to diagnose issue, as in the above example calling a UEFI
> > runtime method completely freezes the box (no debug keys, and the
> > watchdog didn't work).
> > 
> > So I think leaving the system PCI devices as-is and letting dom0 deal
> > with the conflicts is likely a better option than playing with the
> > memory decoding bits.
> 
> Maybe. None of the workarounds really feel very good.

Hence this last suggestion, which limits the workarounds to PVH dom0
only, thus reducing the interaction of Xen with PCI devices as much as
possible.  I think it's an appropriate compromise between being able
to boot as PVH dom0 and not playing with the PCI device memory
decoding bits.
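
For completeness, the "memory decoding bits" referred to above are bit 1
(memory space enable) of the PCI command register at config space offset
0x04.  A purely illustrative toggle, with placeholder config accessors
rather than Xen's real interfaces, would look like:

#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND        0x04    /* command register offset in config space */
#define PCI_COMMAND_MEMORY 0x2     /* memory space (decoding) enable bit */

/* Placeholder config space accessors, not Xen's real interfaces. */
uint16_t cfg_read16(uint32_t sbdf, unsigned int reg);
void cfg_write16(uint32_t sbdf, unsigned int reg, uint16_t val);

static void set_memory_decoding(uint32_t sbdf, bool enable)
{
    uint16_t cmd = cfg_read16(sbdf, PCI_COMMAND);

    if ( enable )
        cmd |= PCI_COMMAND_MEMORY;
    else
        cmd &= ~PCI_COMMAND_MEMORY;

    cfg_write16(sbdf, PCI_COMMAND, cmd);
}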

Thanks, Roger.



 

