Re: [Xen-devel] [PATCH v3 COLOPre 16/26] tools/libx{l, c}: add back channel to libxc

On Wed, 2015-07-01 at 12:01 +0100, Andrew Cooper wrote:
> On 01/07/15 11:42, Ian Campbell wrote:
> > On Wed, 2015-07-01 at 10:38 +0800, Yang Hongyang wrote:
> >> On 06/30/2015 06:10 PM, Ian Campbell wrote:
> >>> On Thu, 2015-06-25 at 14:25 +0800, Yang Hongyang wrote:
> >>>> We need to send secondary's dirty page pfns back to primary.
> >>> In v2 Ian asked (<21888.2988.774072.32946@xxxxxxxxxxxxxxxxxxxxxxxx>):
> >>>
> >>>          In the pdf
> >>>             http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
> >>>          linked from the wiki page
> >>>             http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> >>>          it says that the secondary keeps a copy of the original contents
> >>>          of its dirty pages.  So I don't understand why you need to send
> >>>          the dirty bitmap to the primary.
> >>>
> >>> Which I don't see an answer for in my archive. Have I missed (or
> >>> misplaced) the answer?
> >> Sorry, seems that I misplaced the answer to:
> >> [PATCH v2 COLOPre 09/13] tools/libxl: Update libxl_save_msgs_gen.pl to
> >> support return data from xl to xc
> >>
> >>    > Thanks for this.  I would have some comments on the details, but first
> >>    > I want to properly understand your use case.  So while I'm the author
> >>    > and maintainer of this save helper, I won't review this in detail just
> >>    > yet.  I'm following the thread about what this is for...
> >>
> >>      We need to send secondary's dirty page pfn back to primary. Primary
> >>      will then send pages that are both dirtied on primary/secondary to
> >>      secondary. In this way the secondary's memory will be consistent
> >>      with primary.
> >>
> >>      As we discussed in [PATCH v2 COLOPre 04/13] tools/libxc: export
> >>      xc_bitops.h
> >>      If we move this operation to libxc layer, this patch could be dropped.
> > This doesn't seem to be a response to Ian's question which I quoted
> > above.
> >
> > The crux of the question is that the design contained in those links
> > does not appear to require a back channel, because it does not require a
> > dirty bitmap to go from secondary to primary. Asserting a need to do so
> > does not answer the question.
> It very definitely does require a dirty bitmap moving from the secondary
> to the primary.

The current implementation might work as you describe, but the
design/paper which Ian references suggests that the secondary keeps
copies of all the clean pages sent by the primary, which it can then use
on checkpoint to reestablish consistency. See the last paragraph of
section 4.2 which states:

        COLO solves the memory checkpointing issue by
        keeping a local copy of the previous checkpoint's memory
        contents, and reverting locally modified memory
        pages to the previous checkpoint before applying the
        delta memory pages from the PVM. Therefore, only
        Dp is transmitted, saving CPU and network resources.
        For device state, COLO uses the device suspend/resume
        process that was introduced by live migration [16] to
        gracefully bring both the PVM and SVM to the ini-
        tial state, and rebuilds the machine state using active-

Now perhaps Ian and I have both misinterpreted that part of the paper or
perhaps there is some reason why the current implementation deviates
from the design described there but as it stands your contention that a
back channel is a fundamental requirement is unfounded as far as I can

So to restate the question: Why does the current design deviate from the
design in the paper, or does the paper not say what we think it says?

> However, the set difference B - A (let's call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.

This is where we deviate. According to the paper there is no need to
resend because the secondary already has a non-dirty copy of any memory
which is dirty in B but not A.

