
Re: [Xen-devel] Weird altp2m behaviour when switching early to a new view



On 04/17/2018 11:24 AM, Razvan Cojocaru wrote:
> On 04/16/2018 11:21 PM, George Dunlap wrote:
>> On Mon, Apr 16, 2018 at 7:46 PM, Razvan Cojocaru
>> <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>> On 04/16/2018 08:47 PM, George Dunlap wrote:
>>>> On 04/13/2018 03:44 PM, Razvan Cojocaru wrote:
>>>>> On 04/11/2018 11:04 AM, Razvan Cojocaru wrote:
>>>>>> Debugging continues.
>>>>>
>>>>> Finally, the attached patch seems to get the display unstuck in my
>>>>> scenario, although for one guest I get:
>>>>>
>>>>> (XEN) d2v0 Unexpected vmexit: reason 49
>>>>> (XEN) domain_crash called from vmx.c:4120
>>>>> (XEN) Domain 2 (vcpu#0) crashed on cpu#1:
>>>>> (XEN) ----[ Xen-4.11-unstable  x86_64  debug=y   Not tainted ]----
>>>>> (XEN) CPU:    1
>>>>> (XEN) RIP:    0010:[<fffff96000842354>]
>>>>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest (d2v0)
>>>>> (XEN) rax: fffff88003000000   rbx: fffff900c0083db0   rcx: 00000000aa55aa55
>>>>> (XEN) rdx: fffffa80041bdc41   rsi: fffff900c00c69a0   rdi: 0000000000000001
>>>>> (XEN) rbp: 0000000000000000   rsp: fffff88002ee9ef0   r8:  fffffa80041bdc40
>>>>> (XEN) r9:  fffff80001810e80   r10: fffffa800342aa70   r11: fffff88002ee9e80
>>>>> (XEN) r12: 0000000000000005   r13: 0000000000000001   r14: fffff900c00c08b0
>>>>> (XEN) r15: 0000000000000001   cr0: 0000000080050031   cr4: 00000000000406f8
>>>>> (XEN) cr3: 00000000ef771000   cr2: fffff900c00c8000
>>>>> (XEN) fsb: 00000000fffde000   gsb: fffff80001810d00   gss: 000007fffffdc000
>>>>> (XEN) ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 0018   cs: 0010
>>>>>
>>>>> i.e. EXIT_REASON_EPT_MISCONFIG - so not out of the woods yet. I am hoping
>>>>> somebody more familiar with the code can point me to a more elegant
>>>>> solution, if one exists.
>>>>
>>>> I think I have an idea what's going on, but it's complicated. :-)
>>>>
>>>> Basically, the logdirty functionality isn't simple, and needs careful
>>>> thought on how to integrate it.  I'll write some more tomorrow, and see
>>>> if I can come up with a solution.
>>>
>>> I think I know why this happens for the one guest - the other guests
>>> start at a certain display resolution and keep it until shutdown.
>>>
>>> This particular guest starts with a larger screen, then goes to roughly
>>> 2/3rds of it, then tries to go back to the initial larger one - at which
>>> point the above happens. I assume this corresponds to some pages being
>>> removed and/or added. I'll test this theory more tomorrow - if it's
>>> correct I should be able to reproduce the crash (with the patch) by
>>> simply resetting the screen resolution (increasing it).
>>
>> The trick is that p2m_change_type doesn't actually iterate over the
>> entire p2m range, individually changing entries as it goes.  Instead
>> it misconfigures the entries at the top-level, which causes the kinds
>> of faults shown above.  As it gets faults for each entry, it checks
>> the current type, the logdirty ranges, and the global logdirty bit to
>> determine what the new types should be.
>>
>> Your patch makes it so that all the altp2ms now get the
>> misconfiguration when the logdirty range is changed; but clearly
>> handling the misconfiguration isn't integrated properly with the
>> altp2m system yet.  Doing it right may take some thought.
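
To make sure I follow: a type change doesn't rewrite the p2m, it just
invalidates the affected EPT entries, and the real type is recomputed
lazily in the misconfig handler from the current type, the logdirty
ranges and the global logdirty bit. A toy model of that idea - not the
actual p2m-ept.c code, all names below are made up - would be roughly:

#include <stdbool.h>
#include <stdio.h>

typedef enum { ram_rw, ram_logdirty } ptype_t;

/* Tiny stand-in for a p2m: 16 gfns, one logdirty range, a global flag. */
struct toy_p2m {
    bool global_logdirty;            /* e.g. migration */
    unsigned long ld_start, ld_end;  /* e.g. a dirty-vram range */
    ptype_t type[16];
    bool misconfigured[16];
};

/* "Changing the type" only marks entries stale; nothing is walked. */
static void toy_change_type(struct toy_p2m *p2m)
{
    for ( unsigned long gfn = 0; gfn < 16; gfn++ )
        p2m->misconfigured[gfn] = true;
}

/* Misconfig fault handler: work out the correct type for this gfn now. */
static void toy_resolve_misconfig(struct toy_p2m *p2m, unsigned long gfn)
{
    bool in_range = gfn >= p2m->ld_start && gfn <= p2m->ld_end;

    p2m->type[gfn] = (p2m->global_logdirty || in_range) ? ram_logdirty
                                                        : ram_rw;
    p2m->misconfigured[gfn] = false;   /* entry is valid again */
}

int main(void)
{
    struct toy_p2m p2m = { .global_logdirty = false,
                           .ld_start = 4, .ld_end = 7 };

    toy_change_type(&p2m);             /* e.g. a dirty-vram range was set */
    toy_resolve_misconfig(&p2m, 5);    /* guest then touches gfn 5 */
    printf("gfn 5 is %s\n",
           p2m.type[5] == ram_logdirty ? "logdirty" : "rw");
    return 0;
}

If that's accurate, then each altp2m view's entries need that same
recalculation when they fault, which I take to be the missing
integration you're referring to.
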
> 
> FWIW, the attached patch has solved the misconfig-related domain crash
> for me (though I'm very likely missing some subtleties). It all seems to
> work as expected when enabling altp2m and switching early to a new view.
> However, now I have domUs with a frozen display when I disconnect the
> introspection application (that is, after I switch back to the default
> view and disable altp2m on the domain).

The for() loop in the previous patch is unnecessary, so here's a new
(cleaner) patch. I can no longer get the guest display to freeze when
detaching - which seems unrelated to the for() loop, so it may have been
something else in my setup - but I'll keep an eye on it over the next
few days.

Hopefully this is either a reasonable fix or a basis for one.
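
For reference, the sequence the introspection side goes through is
roughly the following (from memory, so the exact libxc signatures may be
slightly off; error handling omitted):

#include <xenctrl.h>

/* Rough outline of the enable / early-switch / switch-back / disable
 * sequence described above. */
static void exercise_altp2m(xc_interface *xch, uint32_t domid)
{
    uint16_t view_id = 0;

    xc_altp2m_set_domain_state(xch, domid, 1);       /* enable altp2m  */
    xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, &view_id);
    xc_altp2m_switch_to_view(xch, domid, view_id);   /* switch early   */

    /* ... introspection runs with the new view active ... */

    xc_altp2m_switch_to_view(xch, domid, 0);         /* back to the default
                                                        view (index 0)  */
    xc_altp2m_destroy_view(xch, domid, view_id);
    xc_altp2m_set_domain_state(xch, domid, 0);       /* disable altp2m */
}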


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

