[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17] x86/efi: don't translate EFI memory map MMIO regions to e820 RESERVED


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 4 Oct 2022 12:40:10 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GUMUb4rDVIPGGhMJuXs1xXw3TYdDQJYRwxQy55Zh8MM=; b=Xt4qafiKxuhBvaa4TCH84HOP2kPulE0NDb/Wskoh0uVtd/pVDpo0A5njh82LLeVDWSc2MuaAnRwqWNpwSN34nI/BhWGZ67W67ASefsPGpJxPPa78c+bdsbx7D7Np0PUYmK2nFg7B8Wn47x/UTYh2crZRwkR6mumdSCp9IfU074Jon7KsXNxViPaKLw9b901XVxWYbtzl2f0+tLB6cU9hFUPzk0M3+LBzbnTZk70jZI4xRz8HsbfGUm2yuFYAqEWfZDAMZc8qLOLl1wpbTD4bkVDnVQInxYB7Ys7OGPuZMk0Ol9exHrg6bYrwWnI3vO2ITk4rqW72FaqeC8NAXsv2fA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=B6v1xTPhIOBE4+aHCinAvbNInNYfx5DhEz/IxlzakKYX9H89aG6xgRGztNfsMXIFtgY+NA7naJYWRbvMUYs3/9ZUQM0sE6CntksAiWsqemQlXgazWKarFqXNAu3x5UGZHgZOMf4qNBn5lgBQBP35J5P+X7F/dgdK9dqmsaeW7O8emu29ztKEtjh4Wh1HUBuIdPGxB09Sv0JFLcvhhIOG0LvGyzaiO7kxIUCSzISS2bv0fIv6svDg/Dr3Cx1GRtBa5FfrqYm0//2ROCXodO6/0O9DYUU6/oXyg89Adcz1bm36hn3VwsHBg0+UPZT3UTKJF/rGdEmoEa13SFPA0ndyZw==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 04 Oct 2022 10:40:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.10.2022 11:27, Roger Pau Monné wrote:
> On Tue, Oct 04, 2022 at 11:01:18AM +0200, Jan Beulich wrote:
>> On 30.09.2022 16:17, Roger Pau Monne wrote:
>>> The EFI memory map contains two memory types (EfiMemoryMappedIO and
>>> EfiMemoryMappedIOPortSpace) used to describe IO memory areas of
>>> devices used by EFI.
>>>
>>> The current parsing of the EFI memory map was translating
>>> EfiMemoryMappedIO and EfiMemoryMappedIOPortSpace to E820_RESERVED on
>>> x86.  This is an issue because device MMIO regions (BARs) should not
>>> be positioned on reserved regions.  Any BARs positioned on non-hole
>>> areas of the memory map will cause is_memory_hole() to return false,
>>> which would then cause memory decoding to be disabled for such device.
>>> This leads to EFI firmware malfunctions when using runtime services.
>>>
>>> The system under which this was observed has:
>>>
>>> EFI memory map:
>>> [...]
>>>  00000fd000000-00000fe7fffff type=11 attr=800000000000100d
>>> [...]
>>> 0000:00:1f.5 disabled: BAR [0xfe010, 0xfe010] overlaps with memory map
>>>
>>> The device behind this BAR is:
>>>
>>> 00:1f.5 Serial bus controller [0c80]: Intel Corporation Lewisburg SPI 
>>> Controller (rev 09)
>>>     Subsystem: Super Micro Computer Inc Device 091c
>>>     Flags: fast devsel
>>>     Memory at fe010000 (32-bit, non-prefetchable) [size=4K]well
>>>
>>> For the record, the symptom observed in that machine was a hard freeze
>>> when attempting to set an EFI variable (XEN_EFI_set_variable).
>>>
>>> Fix by not adding regions with type EfiMemoryMappedIO or
>>> EfiMemoryMappedIOPortSpace to the e820 memory map, that allows BARs to
>>> be positioned there.
>>>
>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned')
>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>
>> In the best case this is moving us from one way of being wrong to another:
>> So far we wrongly include BARs in E820_RESERVED (_if_ they can be
>> legitimately covered by a EfiMemoryMappedIO region in the first place,
>> which I'm not sure is actually permitted - iirc just like E820_RESERVED
>> may not be used for BARs, this memory type also may not be), whereas with
>> your change we would no longer report non-BAR MMIO space (chipset specific
>> ranges for example) as reserved. In fact I think the example you provide
>> is at least partly due to bogus firmware behavior: The BAR is put in space
>> normally used for firmware specific memory (MMIO) ranges. I think firmware
>> should either assign the BAR differently or exclude the range from the
>> memory map.
> 
> Hm, I'm not sure the example is bogus, how would firmware request a BAR
> to be mapped for run time services to access it otherwise if it's not
> using EfiMemoryMappedIO?
> 
> Not adding the BAR to the memory map in any way would mean the OS is
> free to not map it for runtime services to access.

My view is that BARs should not be marked for runtime services use. Doing
so requires awareness of the driver inside the OS, which I don't think
one can expect. If firmware needs to make use of a device in a system, it
ought to properly hide it from the OS. Note how the potential sharing of
an RTC requires special provisions in the spec, mandating driver awareness.

Having a BAR expressed in the memory map also contradicts the ability of
an OS to relocate all BARs of all devices, if necessary.

>> I guess instead we want to handle the example you give by a firmware quirk
>> workaround.
> 
> I'm unconvinced we need a quirk for this. AFAICT such usage of
> EfiMemoryMappedIO doesn't go against the UEFI spec, and hence we need
> to handle it without requiring specific firmware quirks.
> 
>>> ---
>>> I don't understand the definition of EfiMemoryMappedIOPortSpace:
>>>
>>> "System memory-mapped IO region that is used to translate memory
>>> cycles to IO cycles by the processor."
>>
>> That's something (only?) IA-64 used, where kind of as a "replacement" for
>> x86 I/O port accesses equivalents thereof were provided (iirc 4 ports
>> per page) via MMIO accesses. It is this compatibility MMIO space which is
>> marked this way. Such ranges should never be seen on (current) x86.
> 
> I've heard the Arm guys speak about something similar.
> 
> There's a clarification note in newer versions of the UEFI spec:
> 
> "Note: There is only one region of type EfiMemoryMappedIoPortSpace
> defined in the architecture for Itanium-based platforms. As a result,
> there should be one and only one region of type
> EfiMemoryMappedIoPortSpace in the EFI memory map of an Itanium-based
> platform."
> 
>>> But given its name I would assume it's also likely used to mark ranges
>>> in use by PCI device BARs.
>>>
>>> It would also be interesting to forward this information to dom0, so
>>> it doesn't attempt to move the BARs of this device(s) around, or else
>>> issues will arise.
>>
>> None of this is device specific. There's simply (typically) one MMIO
>> range covering the entire 64k or I/O port space.
> 
> So this translation region won't be in a BAR of a host bridge for
> example?

I have to admit that I don't recall at which layer the conversion happens.
I also didn't think (host) bridges would typically have BARs. Bridges (but
iirc not host bridges) have bridge windows, but that's different.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.