RE: [Xen-devel] Error restoring DomU when using GPLPV
The problem is that every page that is ballooned down by the balloon driver can be slurped up as a private-persistent ("preswap") page by tmem. Private-persistent pages contain indirectly-accessible domain data, are counted against the domain's tot_pages, and are migrated along with the domain-directly-accessible pages. So any temporary mapping of xenheap pages into domheap, such as occurs during restore/migration, can cause max_pages to be exceeded.

This isn't a problem for tmem today, because tmem only runs in PV domains today, but I suspect the fragility of this approach will come back and bite us. It reminds me of the classic "shell game".

Is there a per-domain counter of these special pages somewhere? If so, a MEMF flag could subtract it from max_pages in the limit check in assign_pages(), e.g.:

    max = d->max_pages;
    if ( memflags & MEMF_no_special )
        max -= d->special_pages;
    <snip>
    if ( unlikely((d->tot_pages + ... > max ) /* Over-allocation */

(Here, special_pages counts any xenheap pages that contain domain-specific data that needs to be retained across a migration.)

Dan
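For concreteness, here is a minimal standalone sketch of the proposed check. MEMF_no_special and d->special_pages are hypothetical (neither exists in the tree; they are the flag and counter suggested above), and the toy struct domain models only the fields the check touches, not Xen's real one:

    /* Sketch of the proposed limit check; NOT actual Xen code. */
    #include <stdbool.h>
    #include <stdio.h>

    #define MEMF_no_special (1u << 0)   /* hypothetical flag, not a real MEMF_* value */

    struct domain {
        unsigned long tot_pages;      /* pages currently owned by the domain */
        unsigned long max_pages;      /* allocation limit */
        unsigned long special_pages;  /* hypothetical: xenheap pages holding domain
                                         data that must survive a migration */
    };

    /* Would allocating 'nr' more pages over-allocate the domain? */
    static bool over_allocates(const struct domain *d, unsigned long nr,
                               unsigned int memflags)
    {
        unsigned long max = d->max_pages;

        if ( memflags & MEMF_no_special )
            max -= d->special_pages;      /* keep headroom for the special pages */

        return d->tot_pages + nr > max;   /* over-allocation */
    }

    int main(void)
    {
        /* Example: a domain 3 pages short of its limit, with 3 "special"
         * xenheap pages (1 shared-info frame + 2 grant-table frames). */
        struct domain d = { .tot_pages = 1021, .max_pages = 1024,
                            .special_pages = 3 };

        printf("plain check:          %s\n",
               over_allocates(&d, 3, 0) ? "over" : "ok");
        printf("with MEMF_no_special: %s\n",
               over_allocates(&d, 3, MEMF_no_special) ? "over" : "ok");
        return 0;
    }

The idea is that ordinary allocations, including tmem slurping up ballooned-out pages, would be refused before they consume the headroom that the shared-info and grant-table frames need during restore/migration.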
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, September 17, 2009 12:21 AM
> To: Mukesh Rathor; Dan Magenheimer
> Cc: Annie Li; Joshua West; James Harper; xen-devel; Wayne Gong; Kurt Hackel
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>
> Yeah, all the PV drivers have to do is balloon down one page for every
> xenheap page they map. There's no further complexity than that, so let's
> not make a mountain out of a molehill. The approach as discussed and now
> implemented should work fine with tmem, I think.
>
>  -- Keir
>
> On 16/09/2009 21:50, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
>
> > Just in case someone missed it earlier in the thread:
> >
> > 3 = 1 shinfo + 2 gnt frames (the default),
> >
> > so the check is tot_pages + shinfo + num gnt frames.
> >
> > Mukesh
> >
> > Dan Magenheimer wrote:
> >> Before we close down this thread, I have a concern:
> >>
> >> According to Mukesh, the fix to this bug depends on the PV drivers
> >> tracking tot_pages for a domain and ballooning to ensure tot_pages+3
> >> does not exceed max_pages for the domain.
> >>
> >> Well, tmem can affect tot_pages for a domain inside the hypervisor
> >> without any notification to the PV drivers or the balloon driver.
> >> And I'd imagine that PoD and future memory-optimization mechanisms
> >> such as swapping and page-sharing may do the same.
> >>
> >> So this solution seems very fragile.
> >>
> >> Dan
> >>
> >>> -----Original Message-----
> >>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >>> Sent: Wednesday, September 16, 2009 6:28 AM
> >>> To: Annie Li
> >>> Cc: Joshua West; Dan Magenheimer; xen-devel; Kurt Hackel;
> >>> James Harper; Wayne Gong
> >>> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
> >>>
> >>> On 16/09/2009 12:10, "ANNIE LI" <annie.li@xxxxxxxxxx> wrote:
> >>>
> >>>>> I will do more tests to make sure, and update here.
> >>>> I tried mapping 256 grant frames during initialization and ballooning
> >>>> down 256+1 (shinfo+gnttab) pages on first driver load. Then I did
> >>>> save/restore 50 times and live migration 10 times. No errors occurred.
> >>>
> >>> Okay, well I still can't explain why that fixes it, but clearly it
> >>> does. So that's good. :-)
> >>>
> >>>  -- Keir
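To make the arithmetic in the quoted replies concrete, here is a rough, self-contained model of the bookkeeping the PV drivers are described as doing: balloon out one guest page for each xenheap frame the driver maps (1 shared-info frame + 2 grant-table frames = 3 by default, or 256+1 in Annie's test), so that tot_pages plus those frames stays within max_pages. All variable names are illustrative; this is not GPLPV or Linux balloon-driver code:

    #include <stdio.h>

    int main(void)
    {
        unsigned long max_pages  = 262144;  /* e.g. a 1 GiB domain with 4 KiB pages */
        unsigned long tot_pages  = 262144;  /* fully populated at boot */
        unsigned long shinfo     = 1;       /* shared-info frame */
        unsigned long gnt_frames = 2;       /* grant-table frames (default; 256 in Annie's test) */

        /* Xenheap frames the driver maps; each needs a ballooned-out guest page
         * so that tot_pages + xenheap_frames stays within max_pages on restore. */
        unsigned long xenheap_frames = shinfo + gnt_frames;
        unsigned long to_balloon = 0;

        if ( tot_pages + xenheap_frames > max_pages )
            to_balloon = tot_pages + xenheap_frames - max_pages;

        printf("balloon down %lu page(s)\n", to_balloon);   /* 3 in this example */
        printf("tot_pages after ballooning: %lu (headroom %lu)\n",
               tot_pages - to_balloon, xenheap_frames);
        return 0;
    }

Dan's concern in the quoted message is precisely that this headroom is not stable: tmem (or PoD, swapping, page-sharing) can raise tot_pages again behind the driver's back, which is what motivates the hypervisor-side check sketched earlier.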