Xen project Mailing List

Re: [XEN PATCH 2/2] x86/Dom0: Use streaming decompression for ZSTD compressed kernels

To: Rafaël Kooi <rafael_andreas@xxxxxxxxxxx>

Date: Thu, 11 May 2023 08:08:21 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=rPeDCyyi69C/QPSdoXyljUJNrvj5LSTzHr/5LDbgjIE=; b=iEXbwCC9Otdy9101/TerJRvJnQjYAGSAn34C1PrsmUEXJFBEa7uN0ua7VBufksBD633bBSfoYiH+OLvqhqRylVQHFDoLJt3o4HHLi8WKZMA9Zgzf3/LVISN2pvhm1MGmpxmjAhTZ3eR3NtXgXJNC8eTfDWM62chn4yfVLaK/02tDSkosu3B0goLJG9o8BYDqW+pXhsmZJy0iAW56Z72vRN9diIBShMc+Pr0L7kTQ32VikuXUNilyxYGSLn1Gs7scbFciMitxhBRuci8YQ6CZm3BCk7lFlkDWnW8G1ucut0ckhBSnsyaeJHo9uRxwtnyRzhpJ89hO95pyVsTZzDG7JQ==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=QujaTB7cUuUw5xhipdJL0ncT3acgqqVFPRoF2STK+vZzqc0ZIqnrHNUYB9EzT3YXKcF5sWpMSEZkew9YKFaBXRyAiLyaPVbLRVKi3f691He77wvNLvawKNtYZNryqF6ufZ1gRSvsx/I4hizYlVqpSGcdIAZxUyjIEGGb2GKBpixCyIh97HfC5sXkuoAZiae8IamM2sRphkVTrR/U3+u8ER0+1Km2aykeJzDHAVeP6TzmmrP7M8STnTxhidqoHU8N5Ehs6ift4pmNNLBLilcZU0TmjkiZScqePPy9TRCHF44N4IVhLs1xHb6jrn50jECPgq+6DRTcTF4qW/CSi1n3oQ==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Thu, 11 May 2023 06:08:48 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10.05.2023 19:30, Rafaël Kooi wrote: > On 10/05/2023 11:48, Jan Beulich wrote: >> On 10.05.2023 10:51, Rafaël Kooi wrote: >>> On 10/05/2023 10:03, Jan Beulich wrote:> On 10.05.2023 02:18, Rafaël Kooi >>> wrote: >>>>> --- a/xen/common/decompress.c >>>>> +++ b/xen/common/decompress.c >>>>> @@ -3,11 +3,26 @@ >>>>> #include <xen/string.h> >>>>> #include <xen/decompress.h> >>>>> >>>>> +typedef struct _ZSTD_state >>>>> +{ >>>>> + void *write_buf; >>>>> + unsigned int write_pos; >>>>> +} ZSTD_state; >>>>> + >>>>> static void __init cf_check error(const char *msg) >>>>> { >>>>> printk("%s\n", msg); >>>>> } >>>>> >>>>> +static int __init cf_check ZSTD_flush(void *buf, unsigned int pos, >>>>> + void *userptr) >>>>> +{ >>>>> + ZSTD_state *state = (ZSTD_state*)userptr; >>>>> + memcpy(state->write_buf + state->write_pos, buf, pos); >>>>> + state->write_pos += pos; >>>>> + return pos; >>>>> +} >>>> >>>> This doesn't really belong here, but will (I expect) go away anyway once >>>> you drop the earlier patch. >>>> >>> >>> The ZSTD_flush will have to stay, as that is how the decompressor will >>> start streaming decompression. The difference will be that the book >>> keeping will be "global" (to the translation unit). >> >> But this bookkeeping should be entirely in zstd code (i.e. presumably >> unzstd.c). >> > > The implementation of the decompression functions seem to indicate > otherwise. Referring to unzstd.c:`unzstd`, the function will take the > streaming decompression path if either `fill` or `flush` have been > supplied. I cross checked with unlzma.c and unxz.c, and that seems to > have similar behavior in regards to flushing the output data. The > `flush` function is passed a buffer to a chunk of decompressed data with > `pos` being the size of the chunk. For the sake of consistency I don't > think it's a good idea to deviate from this behavior in just unzstd.c. Well, so far we don't use any flush functions, do we? The question could therefore also be put differently: In how far is the flush function you introduce zstd-specific? If it isn't be other than the fact that it is only passed to unzstd(), perhaps it shouldn't be named as if zstd-specific? >>>>> if ( len >= 4 && !memcmp(inbuf, "\x28\xb5\x2f\xfd", 4) ) >>>>> - return unzstd(inbuf, len, NULL, NULL, outbuf, NULL, error); >>>>> + { >>>>> + // NOTE (Rafaël): On Arch Linux the kernel is compressed in a way >>>>> + // that requires streaming ZSTD decompression. Otherwise >>>>> decompression >>>>> + // will fail when using a unified EFI binary. Somehow >>>>> decompression >>>>> + // works when not using a unified EFI binary, I suspect this is >>>>> the >>>>> + // kernel self decompressing. Or there is a code path that I am >>>>> not >>>>> + // aware of that takes care of the use case properly. >>>> >>>> Along the lines of what I've said for the description, this wants to avoid >>>> terms like "somehow" if at all possible. >>> >>> I've used the term "somehow" because I don't know why decompression >>> works when Xen loads the kernel from the EFI file system. I assume the >>> kernel still gets unpacked by Xen, right? Or does the kernel unpack >>> itself? >> >> The handling of Dom0 kernel decompression ought to be entirely independent >> of EFI vs legacy. Unless I'm wrong with that (mis-remembering), you >> mentioning EFI is potentially misleading. And yes, at least on x86 the >> kernel is decompressed by Xen (by peeking into the supplied bzImage). The >> difference between a plain bzImage and a "unified EFI binary" is what you >> will want to outline in the description (and at least mention in the >> comment). What I'm wondering is whether there simply is an issue with size >> determination when the kernel is taken from the .kernel section. >> > > Assuming you are talking about size determination of the compressed > bzImage, I notice a discrepancy in the size of the ZSTD stream and the > reported size by the vmlinuz-* header upon further investigation. When > using the streaming decompression code I made it output how many bytes > it reads from the extracted-but-still-compressed bzImage. The code > reads 12,327,560 bytes, but the size of the compressed bzImage in the > header is 12,327,564 bytes. In xen/arch/x86/bzImage.c `decompress` is > called with `orig_image_len`, when the function `output_length` > calculates the end address and removes 4 bytes from that address. If I > remove the last 4 bytes from the compressed bzImage then `unzstd` will > unpack it with `unzstd bzImage.zst -o bzImage`, otherwise it will > complain with `zstd: /*stdin*\: unknown header`. With this new > information I think the correct solution is to try calling `decompress` > a second time with with `orig_image_len - 4` if it fails. That's not very likely to be an appropriate solution. If the sizes diverge by 4, that difference needs explaining. Once understood, it'll hopefully become clear under what conditions (if any) to adjust the length right away, without any need to retry. Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.