
Re: [SUSPECTED SPAM]Xen-unstable :can't boot HVM guests, bisected to commit: "hvmloader: indicate ACPI tables with "ACPI data" type in e820"


  • To: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Jan Beulich" <jbeulich@xxxxxxxx>
  • From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
  • Date: Sun, 11 Oct 2020 12:20:42 +0100
  • Delivery-date: Sun, 11 Oct 2020 11:20:49 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11/10/2020 11:40, Igor Druzhinin wrote:
> On 11/10/2020 10:43, Sander Eikelenboom wrote:
>> On 11/10/2020 02:06, Igor Druzhinin wrote:
>>> On 10/10/2020 18:51, Sander Eikelenboom wrote:
>>>> Hi Igor/Jan,
>>>>
>>>> I tried to update my AMD machine to current xen-unstable, but
>>>> unfortunately the HVM guests don't boot after that. The guest keeps
>>>> using CPU-cycles but I don't get to a command prompt (or any output at
>>>> all). PVH guests run fine.
>>>>
>>>> Bisection leads to commit:
>>>>
>>>> 8efa46516c5f4cf185c8df179812c185d3c27eb6
>>>> hvmloader: indicate ACPI tables with "ACPI data" type in e820
>>>>
>>>> I tried xen-unstable with this commit reverted and with that everything
>>>> works fine.
>>>>
>>>> I attached the xl-dmesg output.
>>>
>>> What guests are you using? 
>> Not sure I understand what you ask for, but:
>> dom0 PV
>> guest HVM (qemu-xen)
>>
>>> Could you get serial output from the guest?
>> Not getting any, it seems to be stuck in very early boot.
>>
>>> Is it AMD specific?
>> Can't tell, this is the only machine I test xen-unstable on.
>> It's an AMD Phenom X6.
>> Both dom0 and guest kernel are 5.9-rc8.
>>
>> Tested with guest config:
>> kernel      = '/boot/vmlinuz-xen-guest'
>> ramdisk     = '/boot/initrd.img-xen-guest'
>>
>> cmdline     = 'root=UUID=7cc4a90d-d6b0-4958-bb7d-50497aa29f18 ro
>> nomodeset console=tty1 console=ttyS0 console=hvc0 earlyprintk=xen'
>>
>> type='hvm'
>>
>> device_model_version = 'qemu-xen'
>>
>> cpus        = "2-5"
>> vcpus = 2
>>
>> memory      = '512'
>>
>> disk        = [
>>                   'phy:/dev/xen_vms_ssd/media,xvda,w'
>>               ]
>>
>> name        = 'guest'
>>
>> vif         = [ 'bridge=xen_bridge,ip=192.168.1.10,mac=00:16:3E:DC:0A:F1' ]
>>
>> on_poweroff = 'destroy'
>> on_reboot   = 'restart'
>> on_crash    = 'preserve'
>>
>> vnc=0
>>
>>
>>> If it's a Linux guest could you get a stacktrace from
>>> the guest using xenctx?
>>
>> It is, here are few subsequent runs:
>>
>> ~# /usr/local/lib/xen/bin/xenctx -s
>> /boot/System.map-5.9.0-rc8-20201010-doflr-mac80211debug+ -f -a -C 4
>> vcpu0:
>> cs:eip: ca80:00000256
> 
> Ok, it's stuck in linuxboot.bin option ROM. That's not something we test in
> Citrix - we don't use fw_cfg. It could be something with caching (given it's
> moving but slowly) or a bug uncovered by memory map changes. I'll try to get
> a repro on Monday.

Right, I think I know what will fix your problem - could you flip the
"ACPI data" type to "ACPI NVS" in my commit?

Jan, this is the ambiguity in the ACPI spec that we discussed on the list but
couldn't reach a clean resolution on.
SeaBIOS treats the "ACPI data" type as essentially RAM that can be reported
to the guest as a RAM resource via E801:
https://wiki.osdev.org/Detecting_Memory_(x86)#BIOS_Function:_INT_0x15.2C_AX_.3D_0xE801

// Calculate the maximum ramsize (less than 4gig) from e820 map.
static void
calcRamSize(void)
{
    u32 rs = 0;
    int i;
    for (i=e820_count-1; i>=0; i--) {
        struct e820entry *en = &e820_list[i];
        u64 end = en->start + en->size;
        u32 type = en->type;
        if (end <= 0xffffffff && (type == E820_ACPI || type == E820_RAM)) {
            rs = end;
            break;
        }
    }
    LegacyRamSize = rs >= 1024*1024 ? rs : 1024*1024;
}

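What is wrong here, I think, is that it clearly doesn't handle holes and has
worked more by luck: the scan takes the end of the topmost "ACPI data" or RAM
entry below 4G as the RAM size and ignores any gap beneath that entry. So
SeaBIOS needs to be fixed, but I think that using ACPI NVS in hvmloader is
still safer.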
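
To illustrate the hole problem, here is a minimal standalone sketch. It is not
SeaBIOS or hvmloader code: the memory map layout and the helper name
top_below_4g() are made up for the example, and only the scan itself mirrors
calcRamSize() above. The map is a hypothetical 512 MiB guest with an
"ACPI data" region placed far above the end of RAM.

```c
#include <stddef.h>
#include <stdint.h>

/* E820 address range types as numbered in the ACPI specification. */
#define E820_RAM      1
#define E820_RESERVED 2
#define E820_ACPI     3   /* "ACPI data" (reclaimable) */
#define E820_NVS      4   /* "ACPI NVS" */

struct e820entry {
    uint64_t start;
    uint64_t size;
    uint32_t type;
};

/* Hypothetical layout, for illustration only (not the real hvmloader map). */
static const struct e820entry example_map[] = {
    { 0x00000000, 0x000a0000, E820_RAM },
    { 0x00100000, 0x1ff00000, E820_RAM },      /* RAM ends at 0x20000000 */
    { 0xfc000000, 0x00010000, E820_ACPI },     /* tables, far above RAM */
    { 0xfc010000, 0x03ff0000, E820_RESERVED }, /* ends exactly at 4G */
};

/*
 * Same scan as calcRamSize(): walk the map from the top and return the end
 * of the first entry below 4G whose type is accepted. When count_acpi is
 * non-zero, "ACPI data" entries are accepted alongside RAM, as SeaBIOS does.
 */
static uint32_t top_below_4g(const struct e820entry *map, size_t count,
                             int count_acpi)
{
    for (size_t i = count; i-- > 0; ) {
        uint64_t end = map[i].start + map[i].size;
        uint32_t type = map[i].type;

        if (end <= 0xffffffffULL &&
            (type == E820_RAM || (count_acpi && type == E820_ACPI)))
            return (uint32_t)end;
    }
    return 0;
}
```

With this map, top_below_4g(example_map, 4, 1) returns 0xfc010000, i.e. the
hole between 0x20000000 and 0xfc000000 is counted as RAM, while
top_below_4g(example_map, 4, 0) returns 0x20000000. Marking the tables as
"ACPI NVS" instead has the same effect as the second call, because the scan
never accepts that type.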

Igor



 

