[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17] x86/efi: don't translate EFI memory map MMIO regions to e820 RESERVED


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 4 Oct 2022 15:58:40 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AgVmzK68D2Jyq4zTvV5t3uOOLG3wVAeXmYV9fky4Krc=; b=VuqHzJd84Dd8KFAj++25nCvfQwKAz6If2CM5i4Be428KDwpZ6C0aPVrZKLNz9e2Ofn7PsaMH4bG7r6cKq70eT3JpveMcWa2FqmwcXSirQuHFDs5hlFvRNk3uQJdBAfWlP88lysGgtazbWrUt1b7L8tRVh7tEQb4/3l4x3h3FL5LNYSRAlvhwAJT50KG9R6OYEsyWyEtinbxO72W1AnVsUl4ex5+kJ/bfu1LpzSC7vCfcANarsN01N/AFmPiJEw/KBvgPXZfE44cl7Ta4Czy8SnCK6heIR76xpiSVT1D9ghQEyOyu5EEBsld+WmvfV72DvVEmhh9QzND/nPkVZs+kaw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=e+YAbk6FF+1IX4pP01bEmLqlQmSJ6CjEJc0gov98Q7CKFPeMi0yoNMtudd1MsnNiyEYpX5fADzaBLVCztoNiYHJVFHQZEq+g9fnY4HMyQhcZZFtp+BoYTs73hKszwyaq6F8RV4PArOcGS0PDHt+Yok8uIZh1O91bB/9z9ItB6m7U11B1pbOT8gc0BWbBis4+U2nCc/2n+LXA6Q6wN7oevWOn7A7jLI7+kgNAlhalyKX1JXRtvNgcfUtoUDxa+nqWTM//RCsWk8r2ZA3hncasMIXhydjfDT+aKRLdjE3qMmHSfgDGxnEu5bTj9jebkQpVfAjUSfzl6VpnQF6zJ1MO1g==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 04 Oct 2022 13:58:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.10.2022 15:55, Roger Pau Monné wrote:
> On Tue, Oct 04, 2022 at 03:15:02PM +0200, Jan Beulich wrote:
>> On 04.10.2022 14:59, Roger Pau Monné wrote:
>>> On Tue, Oct 04, 2022 at 02:21:20PM +0200, Jan Beulich wrote:
>>>> On 04.10.2022 14:17, Roger Pau Monné wrote:
>>>>> On Tue, Oct 04, 2022 at 12:40:10PM +0200, Jan Beulich wrote:
>>>>>> On 04.10.2022 11:27, Roger Pau Monné wrote:
>>>>>>> On Tue, Oct 04, 2022 at 11:01:18AM +0200, Jan Beulich wrote:
>>>>>>>> On 30.09.2022 16:17, Roger Pau Monne wrote:
>>>>>>>>> The EFI memory map contains two memory types (EfiMemoryMappedIO and
>>>>>>>>> EfiMemoryMappedIOPortSpace) used to describe IO memory areas of
>>>>>>>>> devices used by EFI.
>>>>>>>>>
>>>>>>>>> The current parsing of the EFI memory map was translating
>>>>>>>>> EfiMemoryMappedIO and EfiMemoryMappedIOPortSpace to E820_RESERVED on
>>>>>>>>> x86.  This is an issue because device MMIO regions (BARs) should not
>>>>>>>>> be positioned on reserved regions.  Any BARs positioned on non-hole
>>>>>>>>> areas of the memory map will cause is_memory_hole() to return false,
>>>>>>>>> which would then cause memory decoding to be disabled for such device.
>>>>>>>>> This leads to EFI firmware malfunctions when using runtime services.
>>>>>>>>>
>>>>>>>>> The system under which this was observed has:
>>>>>>>>>
>>>>>>>>> EFI memory map:
>>>>>>>>> [...]
>>>>>>>>>  00000fd000000-00000fe7fffff type=11 attr=800000000000100d
>>>>>>>>> [...]
>>>>>>>>> 0000:00:1f.5 disabled: BAR [0xfe010, 0xfe010] overlaps with memory map
>>>>>>>>>
>>>>>>>>> The device behind this BAR is:
>>>>>>>>>
>>>>>>>>> 00:1f.5 Serial bus controller [0c80]: Intel Corporation Lewisburg SPI 
>>>>>>>>> Controller (rev 09)
>>>>>>>>>       Subsystem: Super Micro Computer Inc Device 091c
>>>>>>>>>       Flags: fast devsel
>>>>>>>>>       Memory at fe010000 (32-bit, non-prefetchable) [size=4K]well
>>>>>>>>>
>>>>>>>>> For the record, the symptom observed in that machine was a hard freeze
>>>>>>>>> when attempting to set an EFI variable (XEN_EFI_set_variable).
>>>>>>>>>
>>>>>>>>> Fix by not adding regions with type EfiMemoryMappedIO or
>>>>>>>>> EfiMemoryMappedIOPortSpace to the e820 memory map, that allows BARs to
>>>>>>>>> be positioned there.
>>>>>>>>>
>>>>>>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably 
>>>>>>>>> positioned')
>>>>>>>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>>>>>
>>>>>>>> In the best case this is moving us from one way of being wrong to 
>>>>>>>> another:
>>>>>>>> So far we wrongly include BARs in E820_RESERVED (_if_ they can be
>>>>>>>> legitimately covered by a EfiMemoryMappedIO region in the first place,
>>>>>>>> which I'm not sure is actually permitted - iirc just like E820_RESERVED
>>>>>>>> may not be used for BARs, this memory type also may not be), whereas 
>>>>>>>> with
>>>>>>>> your change we would no longer report non-BAR MMIO space (chipset 
>>>>>>>> specific
>>>>>>>> ranges for example) as reserved. In fact I think the example you 
>>>>>>>> provide
>>>>>>>> is at least partly due to bogus firmware behavior: The BAR is put in 
>>>>>>>> space
>>>>>>>> normally used for firmware specific memory (MMIO) ranges. I think 
>>>>>>>> firmware
>>>>>>>> should either assign the BAR differently or exclude the range from the
>>>>>>>> memory map.
>>>>>>>
>>>>>>> Hm, I'm not sure the example is bogus, how would firmware request a BAR
>>>>>>> to be mapped for run time services to access it otherwise if it's not
>>>>>>> using EfiMemoryMappedIO?
>>>>>>>
>>>>>>> Not adding the BAR to the memory map in any way would mean the OS is
>>>>>>> free to not map it for runtime services to access.
>>>>>>
>>>>>> My view is that BARs should not be marked for runtime services use. Doing
>>>>>> so requires awareness of the driver inside the OS, which I don't think
>>>>>> one can expect. If firmware needs to make use of a device in a system, it
>>>>>> ought to properly hide it from the OS. Note how the potential sharing of
>>>>>> an RTC requires special provisions in the spec, mandating driver 
>>>>>> awareness.
>>>>>>
>>>>>> Having a BAR expressed in the memory map also contradicts the ability of
>>>>>> an OS to relocate all BARs of all devices, if necessary.
>>>>>
>>>>> I've failed to figure out if there's a way in UEFI to report a device
>>>>> is in use by the firmware.  I've already looked before sending the
>>>>> patch (see also the post commit notes about for example not passing
>>>>> through the device to any guest for obvious reason).
>>>>>
>>>>> I've got no idea if Linux has any checks to avoid trying to move BARs
>>>>> residing in EfiMemoryMappedIO ranges, we have now observed this
>>>>> behavior in two systems already.
>>>>>
>>>>> Maybe we could do a special check for PCI devices and allow them
>>>>> having BARs in EfiMemoryMappedIO, together with printing a warning
>>>>> message.
>>>>
>>>> Right, that's one of the possible quirk workarounds I was thinking of.
>>>> At the risk of stating the obvious - the same would presumably apply to
>>>> E820_RESERVED on non-EFI systems then.
>>>
>>> One option would be to strictly limit to EfiMemoryMappedIO, by taking
>>> the EFI memory map into account also if present.
>>>
>>> Another maybe simpler option is to allow BARs to be placed in
>>> E820_RESERVED regions, and translate EfiMemoryMappedIO into
>>> E820_RESERVED like we have been doing.
>>>
>>> I will attempt the later if you are OK with the approach.
>>
>> I might be okay with the approach, but first of all I continue to be
>> worried of us violating specifications if we make this the default
>> behavior.
> 
> In any case it would be the firmware violating the specification by
> not hiding those PCI devices, Xen is just trying to cope.
> 
> Xen went from not checking the position of the BARs at all to
> enforcing them to not be placed over any regions on the memory map. I
> think we need to relax the checks a bit to match reality.

Sure, as a workaround. Yet we don't want to relax too much, or else a
primary purpose of the check will be lost.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.