Re: [Xen-devel] bad page flags booting 32bit dom0 on 64bit hypervisor using dom0_mem (kernel >=4.2)

On 02.05.2016 13:41, Juergen Gross wrote:
> On 02/05/16 12:47, Stefan Bader wrote:
>> I recently tried to boot 32bit dom0 on 64bit Xen host which I configured to
>> run with a limited, fix amount of memory for dom0. It seems that somewhere
>> between kernel versions 3.19 and 4.2 (sorry that is still a wide range) the
>> Linux kernel would report bad page flags for a range of pages (which seem to
>> be around the end of the guest pfn range). For a 4.2 kernel that was easily
>> missed as the boot finished ok and dom0 was accessible. However starting with
>> 4.4 (tested 4.5 and a 4.6-rc) the serial console output freezes after some of
>> those bad page flag messages and then (unfortunately without any further
>> helpful output) the host reboots (I assume there is a panic that triggers a
>> reset).
>> I suspect the problem is more a kernel side one. It is just possible to
>> influence things by variation of dom0_mem=#,max:#. 512M seems ok, 1024M,
>> 2048M, and 3072M cause bad page flags starting around kernel 4.2 and reboots
>> around 4.4. Then 4096M and not clamping dom0 memory seem to be ok again
>> (though not limiting dom0 memory seems to cause trouble on 32bit dom0 later
>> when a domU tries to balloon memory, but I think that is a different problem).
>> I have not seen this on a 64bit dom0. Below is an example of those bad page
>> errors. Somehow it looks to be a page marked as reserved. Initially I
>> wondered whether this could be a problem of not clearing page flags when
>> moving mappings to match the e820. But I never looked into i386 memory setup
>> in that detail. So I am posting this, hoping that someone may have an idea
>> from the detail about where to look next. PAE is enabled there. Usually its
>> bpf init that gets hit but that likely is just because that is doing the
>> first vmallocs.
> Could you please post the kernel config, Xen and dom0 boot parameters?
> I'm quite sure this is no common problem as there are standard tests
> running for each kernel version including 32 bit dom0 with limited
> memory size.

Hi Jürgen,

sure. Though by doing that I realized where I actually messed the whole thing
up. I got the max limit syntax completely wrong. :( Instead of the correct
"dom0_mem=1024M,max:1024M" I am using "dom0_mem=1024M:max=1024M" which I guess
is like not having max set at all. Not sure whether that is a valid use case.
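
To make the difference between the two strings concrete, here is a minimal
Python sketch of a checker for the comma-separated form quoted above. It is
not Xen's parser: the function name check_dom0_mem and the simplified size
pattern (digits plus an optional K/M/G suffix) are assumptions made for
illustration, and it says nothing about how Xen itself treats the malformed
value.

```python
import re

# Assumption: a size is plain digits with an optional K/M/G suffix.
# The real option may accept more forms than this sketch does.
_SIZE = re.compile(r"\d+[KMG]?", re.IGNORECASE)


def check_dom0_mem(value):
    """Return a list of problems found in a dom0_mem= value (empty if none)."""
    problems = []
    # Items are expected to be comma separated: <size>, min:<size> or max:<size>.
    for item in value.split(","):
        if ":" in item:
            key, size = item.split(":", 1)
            if key not in ("min", "max"):
                problems.append(f"unexpected prefix {key!r} in {item!r}")
                continue
        else:
            size = item
        if not _SIZE.fullmatch(size):
            problems.append(f"unparseable size {size!r} in {item!r}")
    return problems


if __name__ == "__main__":
    for candidate in ("1024M,max:1024M", "1024M:max=1024M"):
        print(candidate, "->", check_dom0_mem(candidate) or "looks fine")
```

Run against the two values from this mail, the sketch accepts the comma form
and flags the colon/equals form, because the latter is a single item whose
part before the colon is neither "min" nor "max".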

When I actually get the dom0_mem argument right, there are no bad page flag
errors even in 4.4 with a 1024M limit. I was at least consistent in my
mis-configuration, so the same stupid mistake seems to be handled more
gracefully on 64bit.

Likely false alarm. But at least cut&pasting the config into mail made me spot
the problem...


The Xen boot parameters:
dom0_mem=1024M:max=1024M loglvl=all guest_loglvl=all hvm_debug=0 com1=57600,8n1

The dom0 boot parameters:
nomodeset loglevel=10 earlyprintk=xenboot console=tty0 console=hvc0

The config is attached (for size reasons). I picked the 4.2 config as that
kernel only shows the errors but still lets me boot into something that appears
to be working ok.


> Juergen

