[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)
On 30/05/2015 23:07, M A Young wrote: > On Fri, 29 May 2015, Andrew Cooper wrote: > >> On 29/05/15 12:17, M A Young wrote: >>>>> I did a bit of testing - xen-4.5.1-rc1 built on Fedora 22 (gcc5) doesn't >>>>> boot for me, but if I replace xen.gz with one from the same code built on >>>>> Fedora 21 (gcc4) then it does boot. There are rpms and build logs >>>>> available via >>>>> http://copr.fedoraproject.org/coprs/myoung/xentest/build/93366/ >>>>> if anyone else wants to do some testing. >>>>> >>>>> Michael Young >>>> Do you have easy access to xen-syms from each build? >>> Yes. >>> >> Thankyou very much. >> >> GCC 5 is indeed miscompiling the code. Comparing the fc21 vs fc22 builds: >> >> The C snippet from mmio_ro_do_page_fault(): >> >> struct page_info *page = mfn_to_page(mfn); >> struct domain *owner = page_get_owner_and_reference(page); >> if ( owner ) >> put_page(page); >> >> In fc21 is: >> >> movabs $0xffff82e000000000,%rbp >> shr %cl,%rax >> or %rdx,%rax >> shl $0x5,%rax >> add %rax,%rbp >> mov %rbp,%rdi >> callq ffff82d080186900 <page_get_owner_and_reference> >> test %rax,%rax >> mov %rax,%r12 >> je ffff82d080189c4e <mmio_ro_do_page_fault+0x11e> >> mov %rbp,%rdi >> callq ffff82d080188ec0 <put_page> >> >> and in fc22 is: >> >> movabs $0xffff82e000000000,%r8 >> shr %cl,%rax >> or %rdx,%rax >> shl $0x5,%rax >> lea (%r8,%rax,1),%rdi >> callq ffff82d0801874f0 <page_get_owner_and_reference> >> test %rax,%rax >> mov %rax,%rbp >> je ffff82d08018ca14 <mmio_ro_do_page_fault+0x114> >> mov %r8,%rdi >> callq ffff82d080189a90 <put_page> >> >> "lea (%r8,%rax,1),%rdi" in FC22 is slightly shorter than "add %rax,%rbp; >> mov %rbp,%rdi" in FC21. In both cases %rdi is now 'page' from the C >> snippet. >> >> In FC21, the result is stored in %rbp, then reloaded from %rbp into %rdi >> for call to put_page(). >> >> However, in FC22, the result of the calculation is only held in %rdi, >> and clobbered by the call to page_get_owner_and_reference(). When it >> comes to call put_page(), %r8 is reloaded, which is still a pointer to >> the base of the frametable, not the page we actually took a reference on. >> >> FC22 is miscompiling the C to: >> >> struct page_info *page = mfn_to_page(mfn); >> struct domain *owner = page_get_owner_and_reference(page); >> if ( owner ) >> put_page(mfn_to_page(0)); >> >> which is wrong, and why free_domheap_pages() does legitimately complain >> about the wonky refcount. > With a bit of experimentation I have found that compiling with the > -fno-caller-saves flag gets this code segment back to the Fedora 21 > version, thus avoiding the bug. After sending this email, I wondered whether the optimiser as assuming that %rdi was preserved. Indeed, it turns out that the generated code for page_get_owner_and_reference leaves %rdi unmodified, and safe for reuse after return. If the 'mov %r8,%rdi' were simply omitted, the code would work, as %rdi still contains the correct result of the original calculation. Therefore, I suspect that the bug is in the -fcaller-saves optimisation code. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |