
Re: [Xen-devel] Live migration fails under heavy network use

On Wed, Feb 21, 2007 at 07:32:10AM +0000, Keir Fraser wrote:

> >>> Urk. Checkout line 204 of privcmd.c
> >>> That doesn't look too 64b clean to me....
> > 
> > Hmm, I never looked at the Linux privcmd driver before. It seems like
> > you're prefaulting all the pages in at ioctl() time.
> > Currently on Solaris we're demand-faulting each page in the mmap() ...
> > 
> > It looks like we're not checking that we can actually map the page at the
> > time of the ioctl(), so it doesn't end up marked with that bit.
> > 
> > Looks like our bug...
> If you check, then you may as well map at the same time. I think it would
> actually be hard to defer the mapping in a race-free manner.

I've modified the segment driver to prefault the MFNs and things seem a
lot better for both Solaris and Linux domUs:

(XEN) /export/johnlev/xen/xen-work/xen.hg/xen/include/asm/mm.h:184:d0 Error pfn 
5512: rd=ffff830000f92100, od=0000000000000000, caf=00000000, 
(XEN) mm.c:590:d0 Error getting mfn 5512 (pfn 47fa) from L1 entry 
0000000005512705 for dom52
(XEN) mm.c:566:d0 Non-privileged (53) attempt to map I/O space 00000000 done

Not quite sure why the new domain is trying to map 00000000 though.

I also see a fair amount of:

Dom48 freeing in-use page 2991 (pseudophys 100a4): count=2 type=e8000000

Ian mentioned that these could be harmless. Is this because dom0 has
mapped the page for suspending? And if so, why is the count two rather
than one?

Should xc_core.c really be using MMAPBATCH? I suppose it's convenient
but it does mean that the frame lists have to be locked in memory.
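For reference, the batch path I mean is roughly the following (a sketch
of what xc_map_foreign_batch() does over the privcmd device; the exact
field names and ioctl definition live in the privcmd headers, and the
wrapper function name here is mine). The kernel reads the caller's MFN
array and writes error bits back into it while servicing the ioctl,
which is why that frame list has to stay resident:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Shape of the request as in the Linux privcmd driver of this era;
 * see <xen/sys/privcmd.h> for the authoritative definition. */
typedef struct privcmd_mmapbatch {
    int num;            /* number of pages to populate */
    uint16_t dom;       /* target domid */
    unsigned long addr; /* virtual address of the mmap()ed region */
    unsigned long *arr; /* MFN array; error bits are written back here */
} privcmd_mmapbatch_t;

/* Hypothetical wrapper: map `num` frames of domain `dom` in one batch.
 * `xc_fd` is an open fd on the privcmd device; IOCTL_PRIVCMD_MMAPBATCH
 * comes from the privcmd headers. */
void *map_batch(int xc_fd, uint16_t dom, unsigned long *mfns, int num)
{
    privcmd_mmapbatch_t batch;
    void *addr = mmap(NULL, (size_t)num * PAGE_SIZE,
                      PROT_READ | PROT_WRITE, MAP_SHARED, xc_fd, 0);
    if (addr == MAP_FAILED)
        return NULL;

    batch.num  = num;
    batch.dom  = dom;
    batch.addr = (unsigned long)addr;
    batch.arr  = mfns;  /* frame list the kernel walks during the ioctl */

    if (ioctl(xc_fd, IOCTL_PRIVCMD_MMAPBATCH, &batch) < 0) {
        munmap(addr, (size_t)num * PAGE_SIZE);
        return NULL;
    }
    return addr;        /* caller checks per-frame error bits in mfns[] */
}
```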

