Re: [Xen-devel] Linux 4.1 reports wrong number of pages to toolstack
On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote:
> Hi David
>
> This issue is exposed by the introduction of migration v2. The symptom is that
> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too
> many pages.

FWIW my adhoc tests overnight gave me:

    37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1       Fail
    37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0       Fail
    37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19      Fail
    37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7  Fail
    37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6  Fail
    37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5  Fail
    37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4  Fail *
    37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3  Pass *
    37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2  Pass
    37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1  Pass
    37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18      Pass

I have set the adhoc bisector working on the ~200 commits between rc3 and
rc4. It's running in the Citrix instance (which is quieter) so the interim
results are only visible within our network at
http://osstest.xs.citrite.net/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386-xl..html.

So far it has confirmed the basis fail and it is now rechecking the basis
pass.

Slightly strange though is:

    $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/
    $

i.e. there are no relevant seeming xen commits in that range. Maybe the
last one of this is more relevant?

    $ git log --grep=[xX][eE][nN] --oneline v3.19-rc3..v3.19-rc4 --
    bdec419 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
    07ff890 xen-netback: fixing the propagation of the transmit shaper timeout
    132978b x86: Fix step size adjustment during initial memory mapping
    $

I don't think this particular issue is prone to false positives (i.e.
passing when it should fail) and the bisector has reconfirmed the fail case
already, so I think it is unlikely that the bisector is going to come back
and say it can't find a reliable basis for running.

Which might mean we have two issues, some as yet unknown issue between
v3.19-rc3 and -rc4 and the issue you have observed with the number of pages
the toolstack thinks it should be working on, which is masked by the
unknown issue (and could very well be a toolstack bug exposed by a change
in Linux, not a Linux bug at all).

I'm going to leave the bisector going, hopefully it'll tell us something
interesting in whatever it fingers...

Ian.

>
> Note that all guests have 512MB memory, which means they have 131072
> pages.
>
> Both 3.14 tests [2] [3] get the correct number of pages. Like:
>
>    xc: detail: max_pfn 0x1ffff, p2m_frames 256
>    ...
>    xc: detail: Memory: 2048/131072 1%
>    ...
>
> However in both 4.1 [0] [1] the number of pages are quite wrong.
>
> 4.1 32 bit:
>
>    xc: detail: max_pfn 0xfffff, p2m_frames 1024
>    ...
>    xc: detail: Memory: 11264/1048576 1%
>    ...
>
> It thinks it has 4096MB memory.
>
> 4.1 64 bit:
>
>    xc: detail: max_pfn 0x3ffff, p2m_frames 512
>    ...
>    xc: detail: Memory: 3072/262144 1%
>    ...
>
> It thinks it has 1024MB memory.
>
> The total number of pages is determined in libxc by calling
> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from
> hypervisor. And that value is clearly touched by Linux in some way.
>
> I now think this is a bug in Linux kernel. The biggest suspect is the
> introduction of linear P2M. If you think this is a bug in toolstack,
> please let me know.
>
> I don't know why 4.1 64 bit [0] can still be successfully restored. I
> don't have handy setup to experiment. The restore path doesn't show
> enough information to tell anything. The thing I worry about is that
> migration v2 somehow make the guest bigger than it should be. But that's
> another topic.
>
>
> Wei.
>
> [0] 4.1 kernel 64 bit save restore:
> http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-amd64-xl/16.ts-guest-saverestore.log
>
> [1] 4.1 kernel 32 bit save restore:
> http://logs.test-lab.xenproject.org/osstest/logs/60785/test-amd64-i386-xl/14.ts-guest-saverestore.log
>
> [2] 3.14 kernel 64 bit save restore:
> http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-amd64-xl/16.ts-guest-saverestore.log
>
> [3] 3.14 kernel 32 bit save restore:
> http://logs.test-lab.xenproject.org/osstest/logs/61263/test-amd64-i386-xl/16.ts-guest-saverestore.log

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
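
For anyone wanting to check the page arithmetic quoted in this thread, here is a minimal sketch. It is not libxc code: the helper names are hypothetical, and it assumes 4 KiB pages and that the page total is max_pfn + 1, which is consistent with the "Memory: x/total" figures in the quoted logs but is not confirmed by them.

```python
# Illustrative only: reproduces the arithmetic from the quoted log lines.
# Assumption: x86 guests with 4 KiB pages.
PAGE_SIZE = 4096


def pages_from_max_pfn(max_pfn: int) -> int:
    """Page total implied by a max_pfn value (assumed to be max_pfn + 1)."""
    return max_pfn + 1


def pages_to_mib(pages: int) -> int:
    """Convert a page count to MiB under the 4 KiB page assumption."""
    return pages * PAGE_SIZE // (1024 * 1024)


# max_pfn values copied from the "xc: detail" lines quoted above.
observed = {
    "3.14": 0x1FFFF,
    "4.1 32 bit": 0xFFFFF,
    "4.1 64 bit": 0x3FFFF,
}

for kernel, max_pfn in observed.items():
    pages = pages_from_max_pfn(max_pfn)
    print(f"{kernel}: max_pfn {max_pfn:#x} -> {pages} pages, {pages_to_mib(pages)} MiB")
```

Under those assumptions only the 3.14 value matches the 512MB the guests were configured with; the two 4.1 values come out at 8x and 2x that size, matching the 4096MB and 1024MB figures in the report.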