Re: [Xen-devel] shadow2 corrupting PV guest state
Hi,

You (jeremy) said:
> I've been fighting random crashes in the paravirt tree for a while.
> After a fair amount of head-banging, it looks to me like the shadow2
> code is trashing the guest stack (and maybe register state) at random
> points.

I have a question about shadow2 from another point of view.

I've been porting the PV-on-HVM driver to the ia64 platform. During that
work, I began to suspect that shadow2 might cause a memory corruption
problem.

I first saw the problem as a hypervisor crash during destruction of an
HVM domain with an active VNIF on ia64. The crash happened because the
hypervisor detected the P2M table being used by gnttab_copy during HVM
domain destruction. So I looked through the x86 code for a way to avoid
the hypervisor crash, and found the following:

  * Before shadow2, x86 and ia64 used the same logic for domain
    destruction:
     - release gnttab references
     - destroy the page tables for each VCPU
     - destroy the P2M table for the domain
     - relinquish memory for the domain

  * With shadow2, x86 introduced delayed P2M table destruction:
     - release gnttab references
     - destroy the page tables for each VCPU
     - relinquish memory for the domain
     - destroy the P2M table for the domain, in domain_destroy()

*** I am not confident in my investigation. ***

Am I right? Here is the code I am looking at.

[common/domain.c]
    void domain_kill(struct domain *d)
    {
        domain_pause(d);

        if ( test_and_set_bit(_DOMF_dying, &d->domain_flags) )
            return;

        gnttab_release_mappings(d);
        domain_relinquish_resources(d);
        put_domain(d);

        send_guest_global_virq(dom0, VIRQ_DOM_EXC);
    }

[arch/x86/domain.c]
    void domain_relinquish_resources(struct domain *d)
    {
        struct vcpu *v;
        unsigned long pfn;
        ....
        /* Drop the in-use references to page-table bases. */
        for_each_vcpu ( d, v )
        ....
        /* Relinquish every page of memory. */
        relinquish_memory(d, &d->xenpage_list);
        relinquish_memory(d, &d->page_list);
        ....

This is the code for the domain_kill phase.
I think the hypervisor relinquishes the domain's memory in this code.

On the other hand...

[common/domain.c]
    /* Release resources belonging to task @p. */
    void domain_destroy(struct domain *d)
    {
        struct domain **pd;
        atomic_t old, new;
        ....
        arch_domain_destroy(d);

        free_domain(d);

        send_guest_global_virq(dom0, VIRQ_DOM_EXC);
    }

[arch/x86/domain.c]
    void arch_domain_destroy(struct domain *d)
    {
        shadow_final_teardown(d);
        ....

[arch/x86/mm/shadow/common.c]
    void shadow_final_teardown(struct domain *d)
    /* Called by arch_domain_destroy(), when it's safe to pull down the p2m map. */
    {
        ....
        /* It is now safe to pull down the p2m map. */
        if ( d->arch.shadow.p2m_pages != 0 )
            shadow_p2m_teardown(d);

In this code, the P2M table is released, which is after the domain's
memory has already been relinquished in the domain_kill phase.

If my speculation is correct, shadow2 may cause a memory corruption
problem.

What do you think about this point?

Thanks,
- Tsunehisa Doi
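
P.S. To make the question concrete, the two teardown orderings listed
earlier can be modelled with a small toy program. This is only a sketch
of the speculation, not Xen code: every toy_* name is made up for
illustration, and it assumes that some path (for example a grant copy
still in flight) can translate a guest frame in the middle of teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4

/* Hypothetical toy model; none of these names exist in Xen. */
enum toy_lookup {
    TOY_OK,          /* P2M present, frame still owned by the domain  */
    TOY_NO_P2M,      /* P2M already torn down when the lookup ran     */
    TOY_STALE_FRAME  /* P2M present, but the frame was already freed  */
};

struct toy_domain {
    bool p2m_present;               /* is the P2M table still allocated? */
    bool page_owned[TOY_NR_PAGES];  /* does the domain still own frame i? */
};

static void toy_init(struct toy_domain *d)
{
    d->p2m_present = true;
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        d->page_owned[i] = true;
}

static void toy_p2m_teardown(struct toy_domain *d)
{
    d->p2m_present = false;
}

static void toy_relinquish_memory(struct toy_domain *d)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        d->page_owned[i] = false;
}

/* A translation that arrives while teardown is in progress. */
static enum toy_lookup toy_translate(const struct toy_domain *d, size_t gfn)
{
    assert(d != NULL && gfn < TOY_NR_PAGES);
    if (!d->p2m_present)
        return TOY_NO_P2M;
    if (!d->page_owned[gfn])
        return TOY_STALE_FRAME;
    return TOY_OK;
}

/* First ordering: P2M torn down, then memory relinquished.
 * Returns what a lookup sees between the two steps. */
static enum toy_lookup toy_old_order(struct toy_domain *d, size_t gfn)
{
    toy_init(d);
    toy_p2m_teardown(d);
    enum toy_lookup seen = toy_translate(d, gfn);
    toy_relinquish_memory(d);
    return seen;
}

/* Second ordering: memory relinquished, P2M torn down later.
 * Returns what a lookup sees between the two steps. */
static enum toy_lookup toy_new_order(struct toy_domain *d, size_t gfn)
{
    toy_init(d);
    toy_relinquish_memory(d);
    enum toy_lookup seen = toy_translate(d, gfn);
    toy_p2m_teardown(d);
    return seen;
}
```

In this model, toy_old_order(&d, 0) reports TOY_NO_P2M (the lookup finds
no table at all), while toy_new_order(&d, 0) reports TOY_STALE_FRAME (the
lookup succeeds but names a frame the domain no longer owns). Whether the
real hypervisor can actually reach the second case is exactly what I am
asking.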