[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 3/4] efi: Enable booting unified hypervisor/kernel/initrd images



On 14.09.2020 13:50, Trammell Hudson wrote:
> --- a/docs/misc/efi.pandoc
> +++ b/docs/misc/efi.pandoc
> @@ -116,3 +116,52 @@ Filenames must be specified relative to the location of 
> the EFI binary.
>  
>  Extra options to be passed to Xen can also be specified on the command line,
>  following a `--` separator option.
> +
> +## Unified Xen kernel image
> +
> +The "Unified" kernel image can be generated by adding additional
> +sections to the Xen EFI executable with objcopy, similar to how
> +[systemd-boot uses the stub to add them to the Linux 
> kernel](https://wiki.archlinux.org/index.php/systemd-boot#Preparing_a_unified_kernel_image)
> +
> +The sections for the xen configuration file, the dom0 kernel, dom0 initrd,
> +XSM and CPU microcode should be added after the Xen `.pad` section, the
> +ending address of which can be located with:
> +
> +```
> +objdump -h xen.efi \
> +     | perl -ane '/\.pad/ && printf "0x%016x\n", hex($F[2]) + hex($F[3])'
> +```
> +
> +For all the examples the `.pad` section ended at 0xffff82d041000000.
> +All the sections are optional (`.config`, `.kernel`, `.ramdisk`, `.xsm`,
> +`.ucode` (x86) and `.dtb` (ARM)) and the order does not matter.
> +The virtual addresses do not need to be contiguous, although they should not
> +be overlapping and should all be greater than the last virtual address of the
> +hypervisor components.

The .pad section is there really only for padding the image. Its space
could in principle be used for placing useful stuff (and hence to limit
overall in-memory image size). That said, there is a plan for a change
which may involve using the initial part of .pad, but that's not certain
yet. I'm pointing this out to clarify that there may be a valid reason
to avoid re-using the .pad space, at least for now.

> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -375,27 +375,36 @@ static void __init noreturn 
> efi_arch_post_exit_boot(void)
>      efi_xen_start(fdt, fdt_totalsize(fdt));
>  }
>  
> -static void __init efi_arch_cfg_file_early(EFI_FILE_HANDLE dir_handle, char 
> *section)
> +static void __init efi_arch_cfg_file_early(const EFI_LOADED_IMAGE *image,
> +                                           EFI_FILE_HANDLE dir_handle,
> +                                           char *section)

Could I talk you into constifying "section" at this occasion - afaics
there should be no fallout here or in the other three places where
the same would apply.

> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -272,14 +272,21 @@ static void __init noreturn 
> efi_arch_post_exit_boot(void)
>      unreachable();
>  }
>  
> -static void __init efi_arch_cfg_file_early(EFI_FILE_HANDLE dir_handle, char 
> *section)
> +static void __init efi_arch_cfg_file_early(const EFI_LOADED_IMAGE *image,
> +                                           EFI_FILE_HANDLE dir_handle,
> +                                           char *section)
>  {
>  }
>  
> -static void __init efi_arch_cfg_file_late(EFI_FILE_HANDLE dir_handle, char 
> *section)
> +static void __init efi_arch_cfg_file_late(const EFI_LOADED_IMAGE *image,
> +                                          EFI_FILE_HANDLE dir_handle,
> +                                          char *section)
>  {
>      union string name;
>  
> +    if ( read_section(image, ".ucode", &ucode, NULL) )
> +        return;
> +
>      name.s = get_value(&cfg, section, "ucode");

With the Arm change already in mind and with further similar
changes further down, may I suggest to consider passing
'section' into read_section(), thus guaranteeing consistent
naming between image section and config file items, not only now
but also going forward? read_section() would then check for the
leading dot followed by the specified name.

> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -121,6 +121,8 @@ static CHAR16 *s2w(union string *str);
>  static char *w2s(const union string *str);
>  static bool read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>                        struct file *file, char *options);
> +static bool read_section(const EFI_LOADED_IMAGE *image,
> +                         char *name, struct file *file, char *options);
>  static size_t wstrlen(const CHAR16 * s);
>  static int set_color(u32 mask, int bpp, u8 *pos, u8 *sz);
>  static bool match_guid(const EFI_GUID *guid1, const EFI_GUID *guid2);
> @@ -623,6 +625,27 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, 
> CHAR16 *name,
>      return true;
>  }
>  
> +static bool __init read_section(const EFI_LOADED_IMAGE *image,
> +                                char *const name, struct file *file,
> +                                char *options)
> +{
> +    /* skip the leading "." in the section name */
> +    union string name_string = { .s = name + 1 };
> +
> +    file->ptr = (void *)pe_find_section(image->ImageBase, image->ImageSize,
> +                                        name, &file->size);

This casting away of const-ness worries me. The sole reason why
the "ptr" member is non-const looks to be the two parsing functions
for the config file. How about we make "ptr" const void * and add a
new "char *str" field? While it won't _guarantee_ correct code to
be written, it at least allows doing so.

> @@ -1207,9 +1230,13 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
> *SystemTable)
>          /* Get the file system interface. */
>          dir_handle = get_parent_handle(loaded_image, &file_name);
>  
> -        /* Read and parse the config file. */
> -        if ( !cfg_file_name )
> +        if ( read_section(loaded_image, ".config", &cfg, NULL) )
> +        {
> +            PrintStr(L"Using unified config file\r\n");
> +        }

Please omit the braces here.

> @@ -1258,29 +1285,39 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
> *SystemTable)
>          if ( !name.s )
>              blexit(L"No Dom0 kernel image specified.");
>  
> -        efi_arch_cfg_file_early(dir_handle, section.s);
> +        efi_arch_cfg_file_early(loaded_image, dir_handle, section.s);
>  
>          option_str = split_string(name.s);
> -        read_file(dir_handle, s2w(&name), &kernel, option_str);
> -        efi_bs->FreePool(name.w);
> -
> -        if ( !EFI_ERROR(efi_bs->LocateProtocol(&shim_lock_guid, NULL,
> -                        (void **)&shim_lock)) &&
> -             (status = shim_lock->Verify(kernel.ptr, kernel.size)) != 
> EFI_SUCCESS )
> -            PrintErrMesg(L"Dom0 kernel image could not be verified", status);
>  
> -        name.s = get_value(&cfg, section.s, "ramdisk");
> -        if ( name.s )
> +        if ( !read_section(loaded_image, ".kernel", &kernel, option_str) )
>          {
> -            read_file(dir_handle, s2w(&name), &ramdisk, NULL);
> +            read_file(dir_handle, s2w(&name), &kernel, option_str);

As before, I disagree with the idea of taking pieces from disk and
pieces from the unified image. If you continue to think this is a
reasonable thing to do, may I ask that you add a rationale of this
model to the description? And btw, my worry looks to not be without
reason, since ...

>              efi_bs->FreePool(name.w);
> +
> +            if ( !EFI_ERROR(efi_bs->LocateProtocol(&shim_lock_guid, NULL,
> +                            (void **)&shim_lock)) &&
> +                 (status = shim_lock->Verify(kernel.ptr, kernel.size)) != 
> EFI_SUCCESS )
> +                PrintErrMesg(L"Dom0 kernel image could not be verified", 
> status);
>          }
>  
> -        name.s = get_value(&cfg, section.s, "xsm");
> -        if ( name.s )
> +        if ( !read_section(loaded_image, ".ramdisk", &ramdisk, NULL) )
>          {
> -            read_file(dir_handle, s2w(&name), &xsm, NULL);
> -            efi_bs->FreePool(name.w);
> +            name.s = get_value(&cfg, section.s, "ramdisk");
> +            if ( name.s )
> +            {
> +                read_file(dir_handle, s2w(&name), &ramdisk, NULL);
> +                efi_bs->FreePool(name.w);
> +            }
> +        }

... the RAM disk (just taken as example) is optional. How do you express
the unified image's explicit intention to have no RAM disk with this
fallback approach? FAOD, even if objcopy allows to add empty sections
(didn't check), I'd consider an empty section different from an absent
one, i.e. the former meaning "empty RAM disk" while the latter says "no
RAM disk".

> +        if ( !read_section(loaded_image, ".xsm", &xsm, NULL) )
> +        {
> +            name.s = get_value(&cfg, section.s, "xsm");
> +            if ( name.s )
> +            {
> +                read_file(dir_handle, s2w(&name), &xsm, NULL);
> +                efi_bs->FreePool(name.w);
> +            }
>          }

The fallback approach may even have security implications here, as
(afaik) an XSM policy can also be used to increase privileges. The
builder of unified image, in omitting an XSM policy, may certainly
mean "use built-in defaults" rather than intending to allow further
overriding.

> --- /dev/null
> +++ b/xen/common/efi/pe.c
> @@ -0,0 +1,137 @@
> +/*
> + * xen/common/efi/pe.c
> + *
> + * PE executable header parser.
> + *
> + * Derived from 
> https://github.com/systemd/systemd/blob/master/src/boot/efi/pe.c
> + * commit 07d5ed536ec0a76b08229c7a80b910cb9acaf6b1
> + *
> + * Copyright (C) 2015 Kay Sievers <kay@xxxxxxxx>
> + * Copyright (C) 2020 Trammell Hudson <hudson@xxxxxxxx>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License as published by
> + * the Free Software Foundation; either version 2.1 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + */
> +
> +
> +#include "efi.h"
> +
> +struct DosFileHeader {
> +    UINT8   Magic[2];
> +    UINT16  LastSize;
> +    UINT16  nBlocks;
> +    UINT16  nReloc;
> +    UINT16  HdrSize;
> +    UINT16  MinAlloc;
> +    UINT16  MaxAlloc;
> +    UINT16  ss;
> +    UINT16  sp;
> +    UINT16  Checksum;
> +    UINT16  ip;
> +    UINT16  cs;
> +    UINT16  RelocPos;
> +    UINT16  nOverlay;
> +    UINT16  reserved[4];
> +    UINT16  OEMId;
> +    UINT16  OEMInfo;
> +    UINT16  reserved2[10];
> +    UINT32  ExeHeader;
> +} __attribute__((packed));
> +
> +#define PE_HEADER_MACHINE_ARM64         0xaa64
> +#define PE_HEADER_MACHINE_X64           0x8664
> +#define PE_HEADER_MACHINE_I386          0x014c

This list isn't meant to be a complete one anyway, so please omit
the I386 item as it's not needed anywhere.

> +struct PeFileHeader {
> +    UINT16  Machine;
> +    UINT16  NumberOfSections;
> +    UINT32  TimeDateStamp;
> +    UINT32  PointerToSymbolTable;
> +    UINT32  NumberOfSymbols;
> +    UINT16  SizeOfOptionalHeader;
> +    UINT16  Characteristics;
> +} __attribute__((packed));
> +
> +struct PeHeader {
> +    UINT8   Magic[4];
> +    struct PeFileHeader FileHeader;
> +} __attribute__((packed));

At the example of these two (i.e. may extend to others): When the
packed attribute doesn't really have any impact on structure layout
(and will just adversely affect alignof() when applied to the struct
or any of the fields), please omit it. We had a number of such
pointless attributes in the tree, and we had to drop them for one of
the more recent gcc versions to actually still compile our code
without warnings (in fact errors, due to out use of -Werror).

> +struct PeSectionHeader {
> +    UINT8   Name[8];

Better char?

> +const void *__init pe_find_section(const CHAR8 *image, const UINTN 
> image_size,
> +                                   const char *section_name, UINTN *size_out)
> +{
> +    const struct DosFileHeader *dos = (const void*)image;

If the type of "image" was "const void *", there wouldn't be any cast
needed here (and again further down). And I don't think you actually
need "image" to be a pointer to a particular type? Of course ...

> +    const struct PeHeader *pe;
> +    const struct PeSectionHeader *sect;
> +    const UINTN name_len = strlen(section_name);
> +    UINTN offset = 0;

Unnecessary initializer, and please fold ...

> +    UINTN i;

... with this line.

> +    if ( name_len > sizeof(sect->Name) ||
> +         image_size < sizeof(*dos) ||
> +         memcmp(dos->Magic, "MZ", 2) != 0 )
> +        return NULL;
> +
> +    offset = dos->ExeHeader;
> +    pe = (const void *) &image[offset];

... this then needs to become "image + offset".

> +
> +    offset += sizeof(*pe);
> +    if ( image_size < offset ||
> +         memcmp(pe->Magic, "PE\0\0", 4) != 0 )
> +        return NULL;
> +
> +    /* PE32+ Subsystem type */
> +#if defined(__arm__) || defined (__aarch64__)
> +    if ( pe->FileHeader.Machine != PE_HEADER_MACHINE_ARM64 )
> +        return NULL;
> +#elif defined(__x86_64__)
> +    if ( pe->FileHeader.Machine != PE_HEADER_MACHINE_X64 )
> +        return NULL;
> +#else
> +    /* unknown architecture */
> +    return NULL;
> +#endif

Instead of this, further up please #define a single constant (e.g.
PE_HEADER_MACHINE) to check against without any #ifdef-ary here.
This then also should lead to a build error (instead of the
function returning NULL at runtime) when no enabling was done for
a possible future port.

> +    offset += pe->FileHeader.SizeOfOptionalHeader;
> +
> +    for ( i = 0; i < pe->FileHeader.NumberOfSections; i++ )
> +    {
> +        sect = (const void *)&image[offset];

Please limit the scope of sect to the body of this loop, at which
point this assignment can become the initializer.

> +        if ( image_size < offset + sizeof(*sect) )
> +            return NULL;
> +
> +        if ( memcmp(sect->Name, section_name, name_len) != 0 ||
> +             image_size < sect->VirtualSize + sect->VirtualAddress )

Wouldn't this latter part of the condition better be treated as an
error, rather than getting silently ignored? The more if the falling
back to on-disk files got retained?

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.