
[Xen-devel] Re: Improving domU restore time



Hello,
> I would be grateful for the comments on possible methods to improve domain
> restore performance. Focusing on the PV case, if it matters.
Continuing the topic; thank you to everyone who has responded so far.

Focusing on the xen-3.4.3 case for now; dom0/domU are still 2.6.32.x pvops
x86_64. Let me reiterate that for our purposes the domain save time (and any
related post-processing) is not critical; only the restore time matters.
I did some experiments; they involve:
1) before saving a domain, have domU allocate all free memory in a userland
process, then fill it with some MAGIC_PATTERN. Save domU, then process the
savefile, removing all pfns (and their page content) that refer to a page
containing MAGIC_PATTERN.
This reduces the savefile size.
2) instead of executing "xm restore savefile", just send the xmlrpc request
directly to the Xend unix socket via socat
3) change /etc/xen/scripts/block so that in the "add file:" case, it calls
only 3 processes (xenstore-read, losetup, xenstore-write); assuming the
sharing check can be done elsewhere, this should provide a realistic lower
bound for the execution time
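
To make step 1) concrete, here is a minimal sketch of the savefile
post-processing idea. It does not parse the real Xen savefile format: the
input is a simplified list of (pfn, page_bytes) pairs, and the MAGIC_PATTERN
value, the 4 KiB page size, and the function names are all assumptions made
for illustration. It also assumes a page counts as "magic" only when it is
filled entirely with the pattern.

```python
PAGE_SIZE = 4096                      # assumed x86 page size
MAGIC_PATTERN = b"\xde\xad\xbe\xef"   # hypothetical pattern value


def is_magic_page(page, pattern=MAGIC_PATTERN):
    """Return True if the page is filled entirely with the pattern."""
    if len(page) != PAGE_SIZE or PAGE_SIZE % len(pattern) != 0:
        return False
    return page == pattern * (PAGE_SIZE // len(pattern))


def strip_magic_pages(pages):
    """Drop every (pfn, page) pair whose content is the magic fill.

    `pages` is an iterable of (pfn, page_bytes) pairs; the result keeps
    the original order of the surviving pairs.
    """
    return [(pfn, page) for pfn, page in pages if not is_magic_page(page)]
```

A real implementation would also have to rewrite the pfn batches in the
savefile consistently, which this sketch leaves out.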

For a domain with 400MB RAM and 4 vbds, with the savefile in the fs cache, 
this cuts down the restore real time from 2700 ms to 1153 ms. Some questions:
a) is method 1) safe? Normally, xc_domain_restore() allocates mfns via
xc_domain_memory_populate_physmap() and then calls
xc_add_mmu_update(MMU_MACHPHYS_UPDATE) on
the pfn/mfn pairs. If we remove some pfns from the savefile, this will not
happen. Instead, the mfn for the removed pfn (referring to memory whose
content we don't care about) will be allocated in uncanonicalize_pagetable(),
because there will be a pte entry for this page. But uncanonicalize_pagetable()
does not call xc_add_mmu_update(). Still, the domain seems to be restored
properly (naturally the buffer filled previously with MAGIC_PATTERN now
contains junk, but this is the whole purpose of it).
Again, is xc_add_mmu_update(MMU_MACHPHYS_UPDATE) really needed in the above
scenario? It basically does
set_gpfn_from_mfn(mfn, gpfn)
but shouldn't this already be taken care of by
xc_domain_memory_populate_physmap()?

b) There still seems to be some discrepancy between the real time (1153ms) and
the CPU time (970ms); considering this is a machine with 2 cores (and at
least the hotplug scripts execute in parallel), it is notable. What can cause
the involved processes to sleep? (We read the savefile from the fs cache, so
there should be no disk reads at all.) Is the single-threaded nature of
xenstored a possible cause of the delays?
Generally xenstored seems to be quite busy during the restore. Do you think
some of the queries (from Xend?) are redundant? Is there anything else
that can be removed from the relevant Xend code with no harm? This question
may sound too blunt; but given that "xm restore savefile" wastes 220
ms of CPU time doing apparently nothing useful, I would assume there is some
overhead in Xend too.
The systemtap trace is in the attachment; it does not contain a line about the
xenstored CPU ticks (259ms, really a lot?), as xenstored does not terminate
any thread.

c) 
>> Also, it looks really excessive that basically copying 400MB of memory takes
>> over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its
> I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of
> that loop.
Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int
mfn_count, mfn* mfn_array, char * pages_content).
Would it make xc_restore faster if, instead of using the
xc_map_foreign_batch() interface, it called the above hypercall? On x86_64
all the physical memory is already mapped in the hypervisor (is this
correct?), so this could be quicker, as no page table setup would be
necessary?
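
For reference, the arithmetic behind the figures quoted in b) and c) can be
sketched as below. The function names are made up for illustration, and the
sketch assumes 4 KiB pages and that "400MB" means 400 MiB. The "ideal
parallel" number is only a rough bound that assumes the CPU work could be
split evenly across the cores.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def restore_timing_summary(real_ms, cpu_ms, cores):
    """Compare wall-clock time with CPU time for a restore run."""
    return {
        # Wall-clock time not covered by CPU time even if nothing
        # ran in parallel.
        "unaccounted_ms": real_ms - cpu_ms,
        # Wall-clock time if the CPU work were spread evenly over
        # all cores with no waiting.
        "ideal_parallel_ms": cpu_ms / cores,
    }


def per_page_cost_us(ram_mb, cpu_s):
    """Return (page count, CPU microseconds spent per page)."""
    pages = ram_mb * 1024 * 1024 // PAGE_SIZE
    return pages, cpu_s * 1e6 / pages


summary = restore_timing_summary(real_ms=1153, cpu_ms=970, cores=2)
pages, cost_us = per_page_cost_us(ram_mb=400, cpu_s=1.3)
print(summary, pages, round(cost_us, 1))
```

With the numbers from this thread, at least 183 ms of wall-clock time is not
explained by CPU use, and the 1.3 s figure works out to roughly 12.7
microseconds of CPU time per 4 KiB page across 102400 pages.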

Regards,
Rafal Wojtczuk
Principal Researcher
Invisible Things Lab, Qubes-os project

Attachment: probe.systemtap
Description: Text document

Attachment: probeoutput.txt
Description: Text document
