Re: [Xen-devel] [PATCH][VT] Patch to allow VMX domains to be destroyed or shut down cleanly
Keir Fraser wrote:
>I mean forcibly decrement them to zero and free them right there and
>then. Of course, as you point out, the problem is that some of the
>pages are mapped in domain0. I'm not sure how we can distinguish
>tainted refcnts from genuine external references. Perhaps there's a
>proper way we should be destructing the full shadow pagetables such
>that the refcnts end up at zero.

Thanks for your comment. I have done extensive tracing through the
domain destruction code in the hypervisor in the last few days.

The bottom line: after the domain destruction code in the hypervisor is
done, all shadow pages were indeed freed up - even though the
shadow_tainted_refcnts flag was set. I now believe the remaining pages
are genuinely externally referenced (possibly by the qemu device model
still running in domain0).

Here are more details on what I have found:

Ideally, when we destroy or shut down a VMX domain, the general page
reference counts should end up at 0 in shadow mode, so that the pages
can be released properly from the domain.

I have traced quite a bit of code for different scenarios involving
Windows XP running in a VMX domain. I only did simple operations in
Windows XP, but I tried to destroy the VMX domain at different times
(e.g. during Windows XP boot, during simple operations, after Windows XP
has been shut down, etc.)

For non-VMX (Linux) domains, after we relinquish memory in
domain_relinquish_resources(), all pages in the domain's page list
indeed had a reference count of 0 and were properly freed from the xen
heap - just as we expected.

For VMX (e.g., Windows XP) domains, after we relinquish memory in
domain_relinquish_resources(), depending on how much activity there was
in Windows XP, there were anywhere from 2 to 100 pages remaining just
before the domain's structures were freed up by the hypervisor. Most of
these pages still had page reference counts of 1, and therefore could
not be freed from the heap by the hypervisor.
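The difference between the two cases above can be illustrated with a toy
model. This is only a sketch, not Xen code: the Page class, the
relinquish() function and the reference counts used are all hypothetical.
Each page carries a general reference count, teardown drops the
references held by the domain itself, and only pages whose count reaches
0 are returned to the heap.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a domain page with a general refcount."""
    frame: int
    domain_refs: int         # references held by the domain itself
    external_refs: int = 0   # references held from outside (e.g. a dom0 mapping)

    @property
    def count(self) -> int:
        return self.domain_refs + self.external_refs


def relinquish(page_list):
    """Drop every domain-held reference, then free pages whose count is 0.

    Returns (freed, remaining); 'remaining' models the pages still on the
    domain's page list after teardown.
    """
    freed, remaining = [], []
    for page in page_list:
        page.domain_refs = 0           # teardown drops the domain's own refs
        if page.count == 0:
            freed.append(page)         # safe to return to the heap
        else:
            remaining.append(page)     # still referenced externally
    return freed, remaining


# A non-VMX-like domain: no external references, so everything is freed.
pv_pages = [Page(frame=n, domain_refs=2) for n in range(4)]
pv_freed, pv_remaining = relinquish(pv_pages)

# A VMX-like domain: two pages are also mapped from outside the domain.
vmx_pages = [Page(frame=n, domain_refs=2) for n in range(4)]
vmx_pages[1].external_refs = 1
vmx_pages[3].external_refs = 1
vmx_freed, vmx_remaining = relinquish(vmx_pages)

print(len(pv_remaining), [p.frame for p in vmx_remaining])
```

In this model the two externally mapped pages are left with a count of 1
and are never returned to the heap.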
These unfreed pages prevent the rest of the domain's resources from
being released, and therefore 'xm list' still shows the VMX domains
after they were destroyed.

In shadow mode, the following things could be reflected in the page
(general) reference counts:

(a) General stuff:
    - page is allocated (PGC_allocated)
    - page is pinned
    - page is pointed to by CR3's
(b) Shadow page tables (l1, l2, hl2, etc.)
(c) Out-of-sync entries
(d) Grant table mappings
(e) External references (not through grant table)
(f) Monitor page table references (external shadow mode)
(g) Writable PTE predictions
(h) GDTs/LDTs

So I put in a lot of instrumentation and tracing code, and made sure
that the above things were taken into account and removed from the page
reference counts during the domain destruction code sequence in the
hypervisor. During this code sequence, we disable shadow mode
(shadow_mode_disable()) and the shadow_tainted_refcnts flag was set.
However, much to my surprise, the page reference counts were properly
taken care of in shadow mode, and all shadow pages (including those in
l1, l2, hl2 tables and snapshots) were freed up.

In particular, here's where each of the things in the above list was
taken into account during the domain destruction code sequence in the
hypervisor:

(a) General stuff:
    - None of the remaining pages have the PGC_allocated flag set
    - None of the remaining pages are still pinned
    - The monitor shadow ref was 0, and all pages pointed to by CR3's
      were taken care of in free_shadow_pages()

(b) All shadow pages (including those pages in l1, l2, hl2, snapshots)
    were freed properly. I implemented counters to track all shadow
    page promotions/allocations and demotions/deallocations throughout
    the hypervisor code, and at the end, after we relinquished all
    domain memory pages, these counters did indeed return to 0 - as we
    expected.

(c) out-of-sync entries -> handled in free_out_of_sync_state(), called
    by free_shadow_pages().
(d) grant table mappings -> the count of active grant table mappings is
    0 after the domain destruction sequence in the hypervisor is
    executed.

(e) external references not mapped via grant table -> I believe that
    these include the qemu-dm pages which still remain after we
    relinquish all domain memory pages - as the qemu-dm may still be
    active after a VMX domain has been destroyed.

(f) external monitor page references -> all references from the monitor
    page table are dropped in vmx_relinquish_resources(), and the
    monitor table itself is freed in domain_destruct(). In fact, in my
    code traces, the monitor shadow reference count was 0 after the
    domain destruction code in the hypervisor.

(g) writable PTE predictions -> I didn't see any pages in this category
    in my code traces, but if there are any, they would be freed up in
    free_shadow_pages().

(h) GDTs/LDTs -> these were destroyed in destroy_gdt() and
    invalidate_shadow_ldt(), called from domain_relinquish_resources().

Based on the code instrumentation and tracing above, I am pretty
confident that the shadow page reference counts were handled properly
during the domain destruction code sequence in the hypervisor. There is
a problem in keeping track of shadow page counts
(domain->arch.shadow_page_count), and I will submit a patch to fix this
shortly. However, this does not really impact how shadow pages are
handled.

Consequently, the pages that still remain after the domain destruction
code sequence in the hypervisor are externally referenced and may belong
to the qemu device model running in domain0. The fact that qemu-dm is
still active for some time after a VMX domain has been torn down in the
hypervisor is evident from examining the tools code (python). In fact,
when I forcibly freed these remaining pages from the xen heap, the
system/dom0 crashed.

Am I missing anything? Your comments, suggestions, etc., are welcome!
Thanks for reading this rather long email :-)

Khoa H.
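
For reference, the counter-based check described in (b) could look
roughly like the sketch below. It is a toy model and not the actual
instrumentation: the ShadowCounters class and its method names are
hypothetical, and the shadow types are just labels. Every
promotion/allocation increments a per-type counter, every
demotion/deallocation decrements it, and a leak would show up as a
non-zero counter after teardown.

```python
from collections import Counter


class ShadowCounters:
    """Hypothetical per-type balance of shadow page allocations."""

    def __init__(self):
        self.live = Counter()

    def promote(self, kind):
        # Called where a shadow page of this type is allocated.
        self.live[kind] += 1

    def demote(self, kind):
        # Called where a shadow page of this type is released.
        if self.live[kind] == 0:
            raise RuntimeError(f"demote without matching promote: {kind}")
        self.live[kind] -= 1

    def leaked(self):
        # Types whose counter did not return to 0.
        return {kind: n for kind, n in self.live.items() if n != 0}


counters = ShadowCounters()
for kind in ("l1", "l1", "l2", "hl2", "snapshot"):
    counters.promote(kind)
for kind in ("snapshot", "hl2", "l2", "l1", "l1"):
    counters.demote(kind)

print(counters.leaked())
```

An empty result corresponds to the case reported above, where all
counters returned to 0 after the domain memory was relinquished.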
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel