
Re: [Xen-devel] Fix VGA logdirty related display freezes with altp2m


  • To: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, Tamas K Lengyel <tamas.k.lengyel@xxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 22 Oct 2018 22:22:01 +0100
  • Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 22 Oct 2018 21:22:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 22/10/2018 22:17, Razvan Cojocaru wrote:
> On 10/22/18 11:48 PM, Tamas K Lengyel wrote:
>> On Thu, Oct 18, 2018 at 3:12 PM Razvan Cojocaru
>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>> On 10/18/18 11:08 PM, Tamas K Lengyel wrote:
>>>> On Thu, Oct 18, 2018 at 4:09 AM Razvan Cojocaru
>>>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>> Hello,
>>>>>
>>>>> This series aims to prevent the display from freezing when
>>>>> enabling altp2m and switching to a new view (and assorted problems
>>>>> when resizing the display).
>>>>>
>>>>> The first patch propagates ept.ad changes to all active altp2ms,
>>>>> and the second one allocates a new logdirty rangeset for each
>>>>> new altp2m, and propagates (under lock) changes to all p2ms.
>>>>>
>>>>> The first patch is the same as:
>>>>> [PATCH V4] x86/altp2m: propagate ept.ad changes to all active altp2ms
>>>>> but as it is now required for the second one to apply cleanly, it
>>>>> has been resent as part of this series.
>>>>>
>>>>> [PATCH 1/2] x86/altp2m: propagate ept.ad changes to all active altp2ms
>>>>> [PATCH 2/2] x86/altp2m: fix display frozen when switching to a new
>>>> Hi Razvan,
>>>> I would be happy to give this a spin, can you push it as a git branch 
>>>> somewhere?
>>> Sure, here you go:
>>>
>>> https://github.com/razvan-cojocaru/xen/tree/altp2m-logdirty-take1
>> I ran into this crash when my config incorrectly pointed to a
>> non-valid disk location:
>>
>> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475
>> (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>> (XEN) CPU:    4
>> (XEN) RIP:    e008:[<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b
>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
>> (XEN) rax: ffff83046d27802c   rbx: ffff8304558dd880   rcx: 0000000000000000
>> (XEN) rdx: ffff83046d277fff   rsi: 00000000004680c0   rdi: 0000000000000000
>> (XEN) rbp: ffff83046d277d60   rsp: ffff83046d277d50   r8:  ffff82d0809304a0
>> (XEN) r9:  0000000000455940   r10: ffff82e008d01000   r11: 0000000000000017
>> (XEN) r12: ffff8304558dd880   r13: ffff8304558df830   r14: ffff8304558df000
>> (XEN) r15: fffffffffffffff8   cr0: 000000008005003b   cr4: 00000000003526e0
>> (XEN) cr3: 000000005da16000   cr2: ffff880456cd6e80
>> (XEN) fsb: 0000000000000000   gsb: ffff880467f40000   gss: 0000000000000000
>> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) Xen code around <ffff82d08033f40c> (p2m_uninit_altp2m_ept+0x29/0x2b):
>> (XEN)  00 48 83 c4 08 5b 5d c3 <0f> 0b 55 48 89 e5 41 56 41 55 41 54 53 48 8d 05
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d08033f40c>] p2m_uninit_altp2m_ept+0x29/0x2b
>> (XEN)    [<ffff82d0803305ab>] p2m.c#p2m_teardown_altp2m+0x36/0x52
>> (XEN)    [<ffff82d0803331b5>] p2m_final_teardown+0x11/0x28
>> (XEN)    [<ffff82d08034509c>] paging_final_teardown+0x2e/0x3c
>> (XEN)    [<ffff82d080276439>] arch_domain_destroy+0x50/0xa1
>> (XEN)    [<ffff82d08020595c>] domain.c#complete_domain_destroy+0x86/0x159
>> (XEN)    [<ffff82d080228f4f>] rcupdate.c#rcu_process_callbacks+0xa5/0x1cf
>> (XEN)    [<ffff82d08023ae6b>] softirq.c#__do_softirq+0x71/0x9a
>> (XEN)    [<ffff82d08023aede>] do_softirq+0x13/0x15
>> (XEN)    [<ffff82d080275068>] domain.c#idle_loop+0x63/0xb9
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 4:
>> (XEN) Assertion 'p2m->sync.logdirty_ranges' failed at p2m-ept.c:1475
>> (XEN) ****************************************
> Right, I've come across that one as well now. It will be fixed in the
> next series as a result of doing what Andrew has suggested, which is to say:
>
> "Please make all destroy functions idempotent.  i.e.
>
> if ( p2m->sync.logdirty_ranges )
> {
>     rangeset_destroy(p2m->sync.logdirty_ranges);
>     p2m->sync.logdirty_ranges = NULL;
> }
>
> and use this destroy function in the cleanup path of init()."

Indeed.
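
A minimal, standalone sketch of that pattern, outside of Xen, in case it
helps to see the init and destroy halves together.  Everything with a
_sketch suffix, the second "other" resource, the fail_late switch, and the
live_rangesets counter are hypothetical stand-ins for illustration only.
They are not the real p2m or rangeset code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for Xen's rangeset type. */
struct rangeset { int unused; };

/* Hypothetical stand-in for the p2m state that owns the rangeset. */
struct p2m_sketch {
    struct rangeset *logdirty_ranges;
    void *other;              /* second resource, so init can fail half-way */
};

static int live_rangesets;    /* bookkeeping only: shows nothing is leaked */

static struct rangeset *rangeset_new_sketch(void)
{
    struct rangeset *r = calloc(1, sizeof(*r));

    if ( r )
        live_rangesets++;

    return r;
}

static void rangeset_destroy_sketch(struct rangeset *r)
{
    free(r);
    live_rangesets--;
}

/*
 * Idempotent destroy: safe on a fully initialised, partially initialised,
 * or already destroyed object, because each pointer is tested and then
 * reset to NULL once released.
 */
static void p2m_uninit_sketch(struct p2m_sketch *p2m)
{
    if ( p2m->logdirty_ranges )
    {
        rangeset_destroy_sketch(p2m->logdirty_ranges);
        p2m->logdirty_ranges = NULL;
    }

    free(p2m->other);         /* free(NULL) is a no-op */
    p2m->other = NULL;
}

/* Init reuses the destroy function as its single cleanup path. */
static int p2m_init_sketch(struct p2m_sketch *p2m, int fail_late)
{
    p2m->logdirty_ranges = NULL;
    p2m->other = NULL;

    p2m->logdirty_ranges = rangeset_new_sketch();
    if ( !p2m->logdirty_ranges )
        goto fail;

    if ( fail_late )          /* simulate a later step failing */
        goto fail;

    p2m->other = malloc(16);
    if ( !p2m->other )
        goto fail;

    return 0;

 fail:
    p2m_uninit_sketch(p2m);
    return -1;
}
```

With this shape, a teardown that runs after a failed or skipped init finds
NULL pointers and does nothing, so an unconditional assertion on the
pointer is no longer needed in the destroy path.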

>
>> With the config fixed it boots but when I run DRAKVUF on the domain I
>> get the following crash:
>>
>> (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) RIP:    e008:[<000000007bdb630c>] 000000007bdb630c
>> (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor (d0v5)
>> (XEN) rax: 00000000ee138470   rbx: 0000000000000000   rcx: 000000008000b098
>> (XEN) rdx: 0000000000000cf8   rsi: 0000000000000000   rdi: 000000046d2ef000
>> (XEN) rbp: 0000000000000000   rsp: ffff83005da27a10   r8:  0000000000000cf8
>> (XEN) r9:  0000000000000cf8   r10: ffff83005da27ab8   r11: ffff83005da27a08
>> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000065
>> (XEN) r15: 00000000000005a7   cr0: 0000000080050033   cr4: 0000000000372660
>> (XEN) cr3: 000000046d2ef000   cr2: 00000000ee138470
>> (XEN) fsb: 00007fe46d97bbc0   gsb: ffff880467f40000   gss: 0000000000000000
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) Xen code around <000000007bdb630c> (000000007bdb630c):
>> (XEN)  80 74 0b 05 70 84 00 00 <c7> 00 00 00 00 e0 80 3d 7a 34 00 00 00 75 64 48
>> (XEN) Xen stack trace from rsp=ffff83005da27a10:
>> (XEN)    0000000000000000 0000000000000065 ffff83005da27a50 ffff82d08037aafc
>> (XEN)    00000000fffffffe ffff82d08037ae14 0000000000000000 ffff83005da27a90
>> (XEN)    0000000000372660 000000046d2ef000 0000000393e91000 ffff82d0809602b0
>> (XEN)    000000fe00000000 ffff82d0802a3b98 ffffffffffffffff ffff83005da27ab8
>> (XEN)    ffff83005da27b08 ffff82d0802a3511 ffff82d08046b028 ffff83005da27b08
>> (XEN)    ffff82d0802a3511 ffff83005da27fff 0000138800000292 000082d0808176a0
>> (XEN)    0000000000000000 ffff82d08023b889 0000000000000292 ffff82d08046b028
>> (XEN)    ffff82d080451ac8 ffff82d080454af2 00000000000005a7 ffff83005da27b78
>> (XEN)    ffff82d080251d6f ffff82d080250fcd 0000000000000028 ffff83005da27b88
>> (XEN)    ffff83005da27b38 000000000000e010 ffff82d080454c73 ffff82d080451ac8
>> (XEN)    ffff82d080454af2 00000000000005a7 0000000000000030 ffff83005da27bf8
>> (XEN)    ffff82d080454c73 ffff83005da27be8 ffff82d0802aaebc ffff82d08033f3dc
>> (XEN)    ffff82d080451ac8 ffff82d08037d969 ffff82d08037d95d ffff82d08037d969
>> (XEN)    0b0f82d08037d95d ffff82d08037d969 ffff83005fe5b000 0000000000000000
>> (XEN)    0000000000000000 ffff83005da27fff 0000000000000000 00007cffa25d83e7
>> (XEN)    ffff82d08037da2d deadbeefdeadf00d ffff83018caf2530 ffff83005da27d38
>> (XEN)    ffff83040a492830 ffff83005da27cc8 ffff83040bab2880 0000000000000000
>> (XEN)    0000000000000000 deadbeefdeadf00d deadbeefdeadf00d 0000000000000000
>> (XEN)    0000000000000000 ffff830451835000 0000000000000000 ffff83040a492000
>> (XEN)    0000000600000000 ffff82d08033f3da 000000000000e008 0000000000010282
>> (XEN) Xen call trace:
>> (XEN)    [<000000007bdb630c>] 000000007bdb630c
>> (XEN)
>> (XEN) Pagetable walk from 00000000ee138470:
>> (XEN)  L4[0x000] = 000000046d2ee063 ffffffffffffffff
>> (XEN)  L3[0x003] = 000000005da11063 ffffffffffffffff
>> (XEN)  L2[0x170] = 0000000000000000 ffffffffffffffff
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) FATAL PAGE FAULT
>> (XEN) [error_code=0002]
>> (XEN) Faulting linear address: 00000000ee138470
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
> This one I'm not sure about. What does your introspection agent do at
> that point?

This crash is bizarre.  Xen has most likely followed a corrupt function
pointer, because no part of Xen's .text section lives just below the 2G
boundary.
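
For reference, the two numbers in that panic can be checked mechanically.
The sketch below decodes an x86 page fault error code using the
architectural bit layout, and tests whether an address is below 2G.  The
macro, struct and function names are made up for this example and are not
Xen's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page fault error code bits (names are illustrative). */
#define PF_PRESENT     (1u << 0)  /* 0: page not present, 1: protection */
#define PF_WRITE       (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER        (1u << 2)  /* 0: supervisor mode, 1: user mode */
#define PF_RSVD        (1u << 3)  /* reserved bit set in a paging entry */
#define PF_INSN_FETCH  (1u << 4)  /* fault was an instruction fetch */

struct pf_info {
    bool present, write, user, reserved_bit, insn_fetch;
};

/* Split an error code into its individual flags. */
static struct pf_info decode_pfec(uint32_t ec)
{
    struct pf_info pf = {
        .present      = ec & PF_PRESENT,
        .write        = ec & PF_WRITE,
        .user         = ec & PF_USER,
        .reserved_bit = ec & PF_RSVD,
        .insn_fetch   = ec & PF_INSN_FETCH,
    };

    return pf;
}

/* True if addr lies below the 2G (1 << 31) boundary. */
static bool below_2g(uint64_t addr)
{
    return addr < (UINT64_C(1) << 31);
}
```

Fed with the values from the trace, error_code=0002 decodes as a
supervisor-mode write to a not-present page, and the RIP of
000000007bdb630c is below 2G, whereas the code addresses in the earlier
assertion trace (ffff82d08033f40c and friends) are not.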

~Andrew
