
Re: [arm] Dom0 hangs after enabling KPROBE_EVENTS and/or UPROBE_EVENTS in kernel config


  • To: Julien Grall <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>
  • From: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>
  • Date: Thu, 22 Jul 2021 13:49:34 +0000
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrii Anisov <Andrii_Anisov@xxxxxxxx>
  • Delivery-date: Thu, 22 Jul 2021 13:49:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello Julien and Stefano,

> This is actually a good point. There are two other possible issues:
>     1) The kernel and the hypervisor may overlap each other.
>     2) The size of the kernel is not correctly provided.
>
> I remember hitting such issues in the past, and they lead to weird
> behavior.
>
> In fact, looking at the device-tree provided in the first e-mail, I see:
>
>                  module@0 {
>                          compatible = "xen,linux-zimage", "xen,multiboot-module";
>                          reg = <0x5 0x1000000 0x0 0x2000000>;
>                  };
>
> However, from the pastebin, U-Boot reports for the kernel:
>
> Bytes transferred = 37124608 (2367a00 hex)
>
> So, if I am not mistaken, the region in the DT is smaller than the
> kernel itself. The Image header doesn't provide the binary size, so Xen
> can't do any sanity check.
>
> In this case, we would copy a truncated kernel. Can you change the
> size in the DT and give it another try?
>
> If you don't have one yet, I would highly recommend having a script
> (either a U-Boot one or an external one) that generates the correct DT
> for a given kernel, Xen, and initramfs. We have some example scripts on
> the wiki for either solution.

Thank you very much for the suggestion. It appears to be the cause of the issue: the problem was fixed once I increased the region in the DT. I should have checked this at the very beginning.
The most interesting thing is that the kernel size is the same regardless of whether kprobe/uprobe events are on or off, but the error appears only if kprobe/uprobe events are on.
In any case, thank you very much for your help.

Best regards,
Oleksii


From: Julien Grall <julien@xxxxxxx>
Sent: Thursday, July 22, 2021 12:29 PM
To: Stefano Stabellini <sstabellini@xxxxxxxxxx>; Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx <xen-devel@xxxxxxxxxxxxxxxxxxxx>; Andrii Anisov <Andrii_Anisov@xxxxxxxx>
Subject: Re: [arm] Dom0 hangs after enable KROBE_EVENTS and/or UPROBE_EVENTS in kernel config
 
Hi Stefano and Oleksii,


On 22/07/2021 03:12, Stefano Stabellini wrote:
> On Wed, 21 Jul 2021, Oleksii Moisieiev wrote:
>> Please see my answers below.
>>
>> ___________________________________________________________________________________________________________________________________________
>> From: Julien Grall <julien@xxxxxxx>
>> Sent: Wednesday, July 21, 2021 7:39 PM
>> To: Oleksii Moisieiev <Oleksii_Moisieiev@xxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Andrii Anisov <Andrii_Anisov@xxxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>
>> Subject: Re: [arm] Dom0 hangs after enable KROBE_EVENTS and/or UPROBE_EVENTS in kernel config
>>        On 21/07/2021 15:40, Oleksii Moisieiev wrote:
>>        > Hello Julien,
>>
>>        Hello,
>>
>>        >>>
>>        >>> My setup:
>>        >>> Board: H3ULCB Kingfisher board
>>        >>> Xen: revision dba774896f7dd74773c14d537643b7d7477fefcd (stable-4.15)
>>        >>> https://github.com/xen-project/xen.git
>>        >>> Kernel: revision 09162bc32c880a791c6c0668ce0745cf7958f576 (v5.10-rc4)
>>        >
>>        >>Hmmm... 5.10 was released a few months ago and there are probably a few
>>        >>stable releases for that version. Can you try the latest 5.10 stable?
>>        >
>>        > Switched to tag v5.10 (rev 2c85ebc57b3e) of
>>        > https://github.com/torvalds/linux.git
>>        > and got the same problem: I see no output from the kernel. All tests
>>        > were done with the earlycon parameter set in the kernel cmdline.
>>        The tag v5.10 is the first official release. What I meant is using the
>>        stable branch from
>>        git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (branch
>>        linux-5.10.y).
>>
>> I need some time to download and build mainline kernel. I'll test this scenario and send you results tomorrow.
>
> I tried 5.10 with:
>
> CONFIG_KPROBE_EVENTS=y
> CONFIG_UPROBE_EVENTS=y
>
> and I could boot without issues on Xilinx ZynqMP.
>
>
>
>>        >>>
>>        >>> https://github.com/torvalds/linux.git
>>        >>>
>>        >>> kernel config: see attached;
>>        >>>
>>        >>> dtb: see attached;
>>        >
>>        >>Please avoid large attachments as they will be duplicated in every
>>        >>mailbox. Instead, in the future, please upload them somewhere (your own
>>        >>webserver, pastebin...) and provide a link in the e-mail.
>>        >
>>        > I'm sorry for that.
>>        >
>>        >>>
>>        >>>
>>        >>> If kprobe/uprobe events are enabled - I see no output after xen switched
>>        >>> input to Dom0, if disabled - system boots up successfully.
>>        >>The console subsystem tends to be enabled quite late in the boot
>>        >>process. So this may mean a panic during early boot.
>>        >
>>        >>If you haven't done so yet, I would suggest adding earlycon=xenboot to the
>>        >>dom0 command line. This will print some messages during early boot.
>>        >
>>        > All tests were done with earlycon parameter set in the kernel command
>>        > line (xen, dom0-bootargs).
>>        >
>>        >>>
>>        >>> Both configs work fine when I boot without xen.
>>        >>>
>>        >>>
>>        >>> Dom0 information from the Xen console shows that only one CPU works, and the PC
>>        >>> stays in the "__arch_counter_get_cntvct" function, on the read_sysreg call.
>>        >>>
>>        >>> I did further investigation and found that kernel 5.4 doesn't have this
>>        >>> kind of issue.
>>        >>> After bisecting the kernel between 5.4 and 5.10, I found that the output
>>        >>> disappeared on commit:
>>        >>>
>>        >>> 76085aff29f585139a37a10ea0a7daa63f70872c
>>        >
>>        >> From the information you provided so far, I am a bit confused how this
>>        >>could be the source of the problem. But given this is not the latest
>>        >>5.10, I will wait for you to confirm the bug is still present before
>>        >>providing more input.
>>        >
>>        > I was confused by this commit too. As I mentioned above, I've
>>        > checked with the latest stable 5.10 kernel and still got the same problem.
>>
>>        Thanks for the testing. I am not quite sure where this may fail.
>>        Maybe Stefano has an idea?
>
> Are you booting with bootefi? (I cannot see any issues with or without
> bootefi.)
>
> In any case, the fact that you need to revert
> 76085aff29f585139a37a10ea0a7daa63f70872c to see the printk output is
> very odd. It might point to an alignment problem or another memory
> issue. It is possible that the weirdness you are seeing below (e.g. "we
> get some 18446744073709551615 while expecting 0") is due to a memory
> corruption.
>
> Given that 76085aff29f585139a37a10ea0a7daa63f70872c is changing some
> section alignment from 4K to 64K, it increases the memory used to load
> the kernel. Is it possible that the size increase is causing you to go
> beyond the address range supposed to be used? E.g. U-Boot loading the
> kernel at invalid addresses.
>
> Things like CONFIG_KPROBE_EVENTS=y and CONFIG_UPROBE_EVENTS=y are
> relevant because they increase the size of the kernel, possibly pushing
> it to an invalid memory range?

This is actually a good point. There are two other possible issues:
    1) The kernel and the hypervisor may overlap each other.
    2) The size of the kernel is not correctly provided.

I remember hitting such issues in the past, and they lead to weird
behavior.
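
The first case can be tested with a simple interval check. This is only a minimal sketch: the Xen address and sizes below are hypothetical values chosen for illustration (the thread does not state where Xen was loaded), and `ranges_overlap` is a hypothetical helper name.

```python
def ranges_overlap(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a + size_a) and [start_b, start_b + size_b) share a byte."""
    return start_a < start_b + size_b and start_b < start_a + size_a

# Kernel placement taken from the thread (module base and U-Boot transfer size).
kernel_start, kernel_size = 0x501000000, 0x2367A00

# HYPOTHETICAL Xen placement: 1 MiB below the kernel, two candidate sizes.
xen_start = 0x500F00000
overlap_2m = ranges_overlap(xen_start, 0x200000, kernel_start, kernel_size)
overlap_1m = ranges_overlap(xen_start, 0x100000, kernel_start, kernel_size)

print("2 MiB Xen image overlaps kernel:", overlap_2m)
print("1 MiB Xen image overlaps kernel:", overlap_1m)
```

The ranges are treated as half-open, so an image that ends exactly where the next one starts is not reported as overlapping.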

In fact, looking at the device-tree provided in the first e-mail, I see:

                 module@0 {
                         compatible = "xen,linux-zimage", "xen,multiboot-module";
                         reg = <0x5 0x1000000 0x0 0x2000000>;
                 };

However, from the pastebin, U-Boot reports for the kernel:

Bytes transferred = 37124608 (2367a00 hex)

So, if I am not mistaken, the region in the DT is smaller than the
kernel itself. The Image header doesn't provide the binary size, so Xen
can't do any sanity check.
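
The mismatch can be confirmed with a few lines of arithmetic. The following is a minimal sketch, assuming `#address-cells = <2>` and `#size-cells = <2>` for the module node (which the four-cell `reg` above suggests); `decode_reg` is a hypothetical helper name.

```python
def decode_reg(cells):
    """Combine four 32-bit cells <addr_hi addr_lo size_hi size_lo> into (address, size)."""
    addr_hi, addr_lo, size_hi, size_lo = cells
    return (addr_hi << 32) | addr_lo, (size_hi << 32) | size_lo

# Values taken from the device tree and the U-Boot log quoted above.
base, region_size = decode_reg([0x5, 0x1000000, 0x0, 0x2000000])
kernel_size = 0x2367A00  # "Bytes transferred = 37124608 (2367a00 hex)"

# Positive shortfall means the DT region cannot hold the whole image.
shortfall = kernel_size - region_size

print(f"module base : {base:#x}")
print(f"region size : {region_size} bytes")
print(f"kernel size : {kernel_size} bytes")
print(f"shortfall   : {shortfall} bytes" if shortfall > 0 else "kernel fits")
```

With these numbers the region is 32 MiB while the image is roughly 3.4 MiB larger, so the tail of the Image would not be copied.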

In this case, we would copy a truncated kernel. Can you change the
size in the DT and give it another try?


If you don't have one yet, I would highly recommend having a script
(either a U-Boot one or an external one) that generates the correct DT
for a given kernel, Xen, and initramfs. We have some example scripts on
the wiki for either solution.
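
As an illustration of the idea only (this is not one of the wiki scripts), the sketch below derives the `reg` property from the image size instead of hard-coding it. The function names, the 4 KiB rounding, and the use of the load address 0x501000000 from the node above are assumptions; in practice the size would come from something like `os.path.getsize()` on the kernel Image.

```python
def split_cells(value):
    """Split a 64-bit value into two 32-bit device-tree cells (high, low)."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF

def round_up(value, align):
    """Round value up to the next multiple of align (align must be a power of two)."""
    return (value + align - 1) & ~(align - 1)

def kernel_module_node(load_addr, image_size, align=0x1000):
    """Return DT source text for a kernel module node that covers the whole image."""
    size = round_up(image_size, align)
    a_hi, a_lo = split_cells(load_addr)
    s_hi, s_lo = split_cells(size)
    return (
        "module@0 {\n"
        '        compatible = "xen,linux-zimage", "xen,multiboot-module";\n'
        f"        reg = <{a_hi:#x} {a_lo:#x} {s_hi:#x} {s_lo:#x}>;\n"
        "};\n"
    )

# Size reported by U-Boot in this thread: 37124608 bytes.
node = kernel_module_node(0x501000000, 37124608)
print(node)
```

Regenerating the node on every build keeps the region in step with the kernel, so a configuration change that grows the image cannot silently exceed it.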

>
> You can go and edit 76085aff29f585139a37a10ea0a7daa63f70872c to change
> from 4K to any multiple of 4K, e.g. 8K, 12K, 16K, 20K. They should all
> work the same.
>
> Looking at the boot logs on pastebin I noticed that Xen is not loaded at
> a 2MB aligned address. I recommend you change Xen loading address to
> 0x500200000. And the kernel loading address to 0x500400000.

I am curious to know why you recommend loading at a 2MB-aligned address.
The Image protocol doesn't require loading at a 2MB-aligned address. In
fact, we had an issue on Juno because the bootloader would load Xen at a
4KB-aligned address. UEFI will also load at a 4KB-aligned address.
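
For reference, the two alignments under discussion can be checked as follows. A small sketch: 0x500200000 and 0x500400000 are the addresses suggested above, 0x500201000 is a hypothetical address that is only 4KB aligned, and `is_aligned` is a hypothetical helper name.

```python
SZ_4K = 0x1000
SZ_2M = 0x200000

def is_aligned(addr, align):
    """True if addr is a multiple of align (align must be a power of two)."""
    return (addr & (align - 1)) == 0

addresses = [
    ("suggested Xen address", 0x500200000),
    ("suggested kernel address", 0x500400000),
    ("hypothetical 4KB-only address", 0x500201000),
]

for name, addr in addresses:
    print(f"{name}: {addr:#x} 4KB={is_aligned(addr, SZ_4K)} 2MB={is_aligned(addr, SZ_2M)}")
```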

Cheers,

--
Julien Grall

 

