
Re: [Xen-devel] [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • From: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>
  • Date: Thu, 8 Nov 2018 14:48:40 +0000
  • Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxx, Julien Grall <julien.grall@xxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 08 Nov 2018 14:48:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

(CCing Roger)

On 08/11/2018 11:07, Andrew Cooper wrote:
> On 08/11/18 10:31, Jan Beulich wrote:
>>>>> On 07.11.18 at 19:20, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>>> Scrubbing RAM during boot may take a long time on machines with lots
>>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>>> initially so they will eventually be scrubbed in idle-loop on every
>>>> online CPU.
>>>>
>>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>>
>>>> Use the new 'idle' option as the default one.
>>>>
>>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>
>>> This patch reliably breaks boot, although it's not immediately obvious how:
>>>
>>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>>> (d9) (XEN) CPU:    0
>>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>>> [...]
>>> (d9) (XEN) Xen call trace:
>>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
>> That's apparently the 2nd cmdline_cook() invocation, when producing
>> the Dom0 command line. I would suppose what "loader" points to has
>> been scrubbed by the time we get there (with synchronous scrubbing
>> APs wouldn't be able to get going with this before reaching
>> heap_init_late()).
> 
> This is via a PVH boot (like a lot of my development work), and does
> look to be a latent use-after-free.  Dropping the VM down to a single
> vcpu causes the problem to go away.
> 
> Sergey is kindly investigating.

Yes, this seems to be a bug in the Xen PVH boot path. From the serial:

(XEN) == mbi->mods_addr 0x46dce0

which is marked as usable in e820:

(XEN) PVH-e820 RAM map:
(XEN)  0000000000000000 - 00000000000a0000 (usable)
(XEN)  0000000000100000 - 0000000040000400 (usable)
(XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
(XEN)  00000000feff8000 - 00000000feffc000 (reserved)
(XEN)  00000000feffc000 - 00000000feffd000 (usable)
(XEN)  00000000feffd000 - 00000000ff000000 (reserved)

This memory is then given to the allocator and scrubbed by secondary
CPUs, which leads to a use-after-free. Even with the cmdline issue
fixed, another FATAL PAGE FAULT occurs further down the boot path:

(d16) [183465.829440] (XEN) Xen call trace:
(d16) [183465.829467] (XEN)    [<ffff82d08023d6c5>] memcmp+0x9/0x3a
(d16) [183465.829494] (XEN)    [<ffff82d080436702>] bzimage.c#bzimage_check+0x32/0x71
(d16) [183465.829511] (XEN)    [<ffff82d080436806>] bzimage_parse+0x22/0xba
(d16) [183465.829528] (XEN)    [<ffff82d080431086>] dom0_build.c#pvh_load_kernel+0x82/0x3c0
(d16) [183465.829612] (XEN)    [<ffff82d0804316e0>] dom0_construct_pvh+0x1c9/0x11bf
(d16) [183465.829638] (XEN)    [<ffff82d0804387a6>] construct_dom0+0xd4/0xb0e
(d16) [183465.829655] (XEN)    [<ffff82d0804280cc>] __start_xen+0x2631/0x28b6
(d16) [183465.829682] (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
...
(XEN) Faulting linear address: ffff8f2c2d301202

Looking at mod[0].pa in the PVH start info, I suspect that the module
also gets overwritten:

(XEN) PVH start info: (pa 0000ffc0)
(XEN)   version:    1
(XEN)   flags:      0
(XEN)   nr_modules: 1
(XEN)   modlist_pa: 000000000000ff70
(XEN)   cmdline_pa: 000000000000ff90
(XEN)   cmdline:    'console=xen,pv dom0=pvh xsm=flask'
(XEN)   rsdp_pa:    00000000fc009000
(XEN)     mod[0].pa:         00000000005b1000
(XEN)     mod[0].size:       0000000004784128
(XEN)     mod[0].cmdline_pa: 0000000000000000
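
To make the overlap explicit, here is a minimal sketch that looks up the
two addresses from the logs above (mbi->mods_addr 0x46dce0 and mod[0].pa
0x5b1000) in the PVH-e820 map. The struct, enum and function names are
hypothetical and chosen only for illustration; this is not Xen's e820
code, just the range check done by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory-map entry; not Xen's own e820 structure. */
enum region_type { REGION_USABLE, REGION_ACPI_DATA, REGION_RESERVED };

struct region {
    uint64_t start;          /* first byte of the range (inclusive) */
    uint64_t end;            /* one past the last byte (exclusive) */
    enum region_type type;
};

/* The PVH-e820 RAM map, copied from the serial output quoted above. */
static const struct region pvh_e820[] = {
    { 0x0000000000000000ULL, 0x00000000000a0000ULL, REGION_USABLE },
    { 0x0000000000100000ULL, 0x0000000040000400ULL, REGION_USABLE },
    { 0x00000000fc000000ULL, 0x00000000fc009040ULL, REGION_ACPI_DATA },
    { 0x00000000feff8000ULL, 0x00000000feffc000ULL, REGION_RESERVED },
    { 0x00000000feffc000ULL, 0x00000000feffd000ULL, REGION_USABLE },
    { 0x00000000feffd000ULL, 0x00000000ff000000ULL, REGION_RESERVED },
};

/* Return the map entry containing addr, or NULL if addr falls in a hole. */
const struct region *find_region(uint64_t addr)
{
    size_t n = sizeof(pvh_e820) / sizeof(pvh_e820[0]);

    for (size_t i = 0; i < n; i++) {
        /* Sanity check: every entry must describe a non-empty range. */
        assert(pvh_e820[i].start < pvh_e820[i].end);
        if (addr >= pvh_e820[i].start && addr < pvh_e820[i].end)
            return &pvh_e820[i];
    }
    return NULL;
}

/* True if addr lies in a range the map reports as usable RAM. */
bool addr_is_usable(uint64_t addr)
{
    const struct region *r = find_region(addr);

    return r != NULL && r->type == REGION_USABLE;
}
```

With this map, addr_is_usable(0x46dce0) and addr_is_usable(0x5b1000)
both return true, so nothing in the map itself keeps those pages away
from the allocator.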

The issue is easily reproduced by running Xen as a PVH guest with the
following config:

type="pvh"

vcpus=2
memory=1024
nestedhvm=1

kernel="/root/xen-syms"
ramdisk="/boot/vmlinuz-4.4.0+10"
cmdline="console=xen,pv dom0=pvh xsm=flask"


 

