[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/efi: Fix Grub2 boot on arm64
> On 3 Nov 2021, at 14:30, Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 03.11.2021 15:09, Luca Fancellu wrote: >>> On 3 Nov 2021, at 11:28, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>> On 03.11.2021 11:20, Luca Fancellu wrote: >>>>> On 3 Nov 2021, at 08:20, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>>>> On 02.11.2021 18:12, Luca Fancellu wrote: >>>>>>> On 2 Nov 2021, at 14:45, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>>>>>> On 02.11.2021 15:05, Luca Fancellu wrote: >>>>>>>> The code introduced by commit a1743fc3a9fe9b68c265c45264dddf214fd9b882 >>>>>>>> ("arm/efi: Use dom0less configuration when using EFI boot") is >>>>>>>> introducing a problem to boot Xen using Grub2 on ARM machine using >>>>>>>> EDK2. >>>>>>>> >>>>>>>> The problem comes from the function get_parent_handle(...) that inside >>>>>>>> uses the HandleProtocol on loaded_image->DeviceHandle, but the last >>>>>>>> is NULL, making Xen stop the UEFI boot. >>>>>>> >>>>>>> According to my reading the UEFI spec doesn't (explicitly) allow for >>>>>>> this to be NULL. Could you clarify why this is the case? What other >>>>>>> information may end up being invalid / absent? Is e.g. read_section() >>>>>>> safe to use? >>>>>> >>>>>> My test on an arm machine running Grub2 on top of EDK2 showed that >>>>>> when Xen is started, the get_parent_handle(…) call was failing and >>>>>> stopping >>>>>> the boot because the efi_bs->HandleProtocol(…) was called with the >>>>>> loaded_image->DeviceHandle argument NULL and the call was returning >>>>>> a EFI_INVALID_PARAMETER. >>>>>> So the parent handle can’t be requested and the filesystem can’t be used, >>>>>> but any other code that doesn’t use the handle provided by >>>>>> get_parent_handle(…) >>>>>> can be used without problem like read_section(...). >>>>> >>>>> I understand this. My question was for the reason of ->DeviceHandle >>>>> being NULL. IOW I'm wondering whether we're actually talking about a >>>>> firmware or GrUB bug, in which case your change is a workaround for >>>>> that rather than (primarily) a fix for the earlier Xen change. >>>> >>>> The issue was found only when using EDK2+Grub2, no issue when booting >>>> directly from EDK2. >>>> This is a fix for the regression, because without the EFI changes, Grub2 >>>> was >>>> booting successfully Xen. Using grub2 to boot Xen on arm is a very common >>>> solution so not supporting this anymore could lead to lots of people having >>>> issues if they update to Xen 4.16. >>> >>> I'm not objecting to addressing the issue. But the description needs >>> to make clear where the origin of the problem lies, and afaict that's >>> not the earlier Xen change. That one merely uncovered what, according >>> to your reply, might then be a GrUB bug. Unless, as said earlier, I >>> merely haven't been able to spot provisions in the spec for the field >>> in question to be NULL. >> >> Maybe I can rephrase to be more specific from: >> >> The problem comes from the function get_parent_handle(...) that inside >> uses the HandleProtocol on loaded_image->DeviceHandle, but the last >> is NULL, making Xen stop the UEFI boot. >> >> To: >> >> Despite UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle, >> that is used by efi_bs->HandleProtocol(…) inside get_parent_handle(…), >> causing Xen to stop the boot getting an EFI_INVALID_PARAMETER error. >> >> Do you think it can be ok like this? > > Much better, yes, but I wonder what "returning" refers to. You want to > describe the origin of the NULL handle as precisely as possible. And > considering this turns out as a workaround, in a suitable place you > will also want to add a code comment, such that a later reader won't > decide this is all dead code and can be done in a simpler way. Ok I can write the issue from the beginning to be sure: Despite UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle inside the interface given by the LOADED_IMAGE_PROTOCOL service, this handle is used later by efi_bs->HandleProtocol(…) inside get_parent_handle(…) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER error. Regarding the comment, I can rephrase this comment: /* * If DeviceHandle is NULL, we can't use the SIMPLE_FILE_SYSTEM_PROTOCOL * to have access to the filesystem. */ To be: /* * If DeviceHandle is NULL, the firmware offering the UEFI services might not be * compliant to the standard and we can't use the SIMPLE_FILE_SYSTEM_PROTOCOL * to have access to the filesystem. However the system can boot if and only if it doesn’t * require access to the filesystem. (e.g. Xen image has everything built in or the * bootloader did previously load every needed binary in memory) */ What do you think? Cheers, Luca > > Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |