
Re: [Xen-devel] Xen crash with mem-sharing and cloning





On Tue, Mar 24, 2015 at 1:48 AM, Tamas K Lengyel <tklengyel@xxxxxxxxxxxxx> wrote:


On Tue, Mar 24, 2015 at 4:54 AM, Andres Lagar Cavilla <andres@xxxxxxxxxxxxxxxx> wrote:


On Mon, Mar 23, 2015 at 11:25 AM, Tamas K Lengyel <tklengyel@xxxxxxxxxxxxx> wrote:
On Mon, Mar 23, 2015 at 6:59 PM, Andres Lagar Cavilla <andres@xxxxxxxxxxxxxxxx> wrote:
On Mon, Mar 23, 2015 at 9:10 AM, Tamas K Lengyel <tklengyel@xxxxxxxxxxxxx> wrote:
Hello everyone,
I'm trying to chase down a bug that reproducibly crashes Xen (tested with 4.4.1). The problem is somewhere within the mem-sharing subsystem and how that interacts with domains that are being actively saved. In my setup I use the xl toolstack to rapidly create clones of HVM domains by piping "xl save -c" into xl restore with a modified domain config which updates the name/disk/vif. However, during such an operation Xen crashes with the following log if there are already active clones.

IMHO there should be no conflict between saving the domain and memsharing: as long as the domain is just being checkpointed ("-c"), its memory should remain as is. This is clearly not the case, however. Any ideas?

Tamas, I'm not clear on the use of memsharing in this workflow. As described, you pipe save into restore, but the internal magic is lost on me. Are you fanning out to multiple restores? That would seem to be the case, given the need to update name/disk/vif.

Anyway, I'm inferring. Instead, could you elaborate?

Thanks
Andre

Hi Andre,
thanks for getting back on this issue. The script I'm using is at https://github.com/tklengyel/drakvuf/blob/master/tools/clone.pl. The script simply creates a FIFO pipe (mkfifo) and saves the domain into that pipe, which is immediately read by xl restore with the updated configuration file. This is mainly just to eliminate having to read the memory dump from disk. That part of the system works as expected, and multiple save/restores running at the same time don't cause any side-effects. Once the domain has thus been cloned, I run memshare on every page, which also works as expected. The problem only occurs when the cloning procedure runs while a page unshare operation kicks in on an already active clone (as you see in the log).

Sorry Tamas, I'm a bit slow here. I looked at your script -- it looks all right, but there's no mention of memsharing in there.

Re-reading ... memsharing? memshare? Is this memshrtool in tools/testing? How are you running it?


Hi Andre,
the memsharing happens here https://github.com/tklengyel/drakvuf/blob/master/src/main.c#L144 after the clone script has finished. This is effectively the same approach as in tools/testing, just automatically looping from 0 to max_gpfn. Afterwards, all unsharing happens automatically, either induced by the guest itself or when I map pages into my app with xc_map_foreign_range PROT_WRITE.
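
Roughly, that loop boils down to something like the following sketch (hedged: placeholder names, libxc calls as of Xen 4.4, error handling trimmed; the real code is in the main.c linked above):

    /* Sketch of the share-everything loop described above; not the actual
     * drakvuf code. Assumes xch is an open xc_interface and that sharing
     * has been enabled on both domains with xc_memshr_control(). Gfns that
     * fail to nominate (physmap holes, special pages) are simply skipped. */
    #include <xenctrl.h>

    void share_all(xc_interface *xch, domid_t origin, domid_t clone,
                   unsigned long max_gpfn)
    {
        for ( unsigned long gfn = 0; gfn <= max_gpfn; gfn++ )
        {
            uint64_t origin_handle, clone_handle;

            if ( xc_memshr_nominate_gfn(xch, origin, gfn, &origin_handle) )
                continue;   /* hole or otherwise unshareable page */
            if ( xc_memshr_nominate_gfn(xch, clone, gfn, &clone_handle) )
                continue;

            /* Deduplicate: the clone's gfn now backs onto the origin's page. */
            xc_memshr_share_gfns(xch, origin, gfn, origin_handle,
                                 clone, gfn, clone_handle);
        }
    }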

Thanks. A couple of observations on your script:
1. Sharing all gfns from zero to max is inefficient. There are non-trivial holes in the physmap space that you want to jump over. (Holes are not the cause of the crash.)
2. xc_memshr_add_to_physmap was created exactly for this case (see the sketch below). Rather than deduplicating two pages into one, it grafts a sharing-nominated page directly onto an otherwise empty p2m entry. Apart from the obvious overhead reduction, it does not require you to have 2x memory capacity in order to clone a VM.
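
In case it helps, a minimal sketch of that nominate-then-graft route (hedged: placeholder names, and it assumes the target gfn in the clone has not been populated yet):

    /* Graft the origin's page straight into an empty p2m entry of the clone.
     * Placeholder names; requires the clone's gfn to be unpopulated. */
    #include <xenctrl.h>

    int graft_gfn(xc_interface *xch, domid_t origin, domid_t clone,
                  unsigned long gfn)
    {
        uint64_t handle;

        if ( xc_memshr_nominate_gfn(xch, origin, gfn, &handle) )
            return -1;               /* hole or otherwise unshareable page */

        /* No nomination needed on the clone side: its p2m entry is empty. */
        return xc_memshr_add_to_physmap(xch, origin, gfn, handle, clone, gfn);
    }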

Certainly no Xen crash should happen because of user-space input. I'm just trying to understand what you're doing. The unshare code is not, uhmm, brief, so a NULL deref could happen in half a dozen places at first glance.

Well, let me know what I can do to help track it down. I don't think (potentially buggy) userspace tools should be able to crash Xen either =)

From the crash, a writable foreign map (qemu -- assuming you run your memshare tool strictly after xl restore has finished) is triggering the unshare NULL deref. My main suspicion is the rmap becoming racy. I would liberally sprinkle printks through the unshare path, retry, and see how far the printks say you got.
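
For reference, the user-space side of that trigger is just a writable foreign mapping, along these lines (hedged sketch, placeholder names; XC_PAGE_SIZE comes from xenctrl.h):

    /* Mapping a shared gfn of the clone with PROT_WRITE is what forces Xen
     * to unshare (copy) the page -- the path that is crashing here.
     * Placeholder names; error handling trimmed. */
    #include <xenctrl.h>
    #include <sys/mman.h>

    int touch_gfn(xc_interface *xch, uint32_t clone_domid, unsigned long gfn)
    {
        void *page = xc_map_foreign_range(xch, clone_domid, XC_PAGE_SIZE,
                                          PROT_READ | PROT_WRITE, gfn);
        if ( !page )
            return -1;

        munmap(page, XC_PAGE_SIZE);
        return 0;
    }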

Andres

Tamas

Thanks
Andres




 

