
Re: [Xen-devel] [PATCH] Provide support for multiple frame buffers in Xen.


  • To: "Tim (Xen.org)" <tim@xxxxxxx>
  • From: Robert Phillips <robert.phillips@xxxxxxxxxx>
  • Date: Mon, 12 Nov 2012 16:31:58 -0500
  • Accept-language: en-US
  • Cc: "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 12 Nov 2012 21:32:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: Ac29tISgskj15PbzRniL91JPgz0BOQDZuVjA
  • Thread-topic: [Xen-devel] [PATCH] Provide support for multiple frame buffers in Xen.

Hi,

I'm sending version 3 of the patch shortly.  I think it addresses all your 
concerns.

> -----Original Message-----
> From: Tim Deegan [mailto:tim@xxxxxxx]
> Sent: Thursday, November 08, 2012 8:25 AM
> To: Robert Phillips
> Cc: xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH] Provide support for multiple frame buffers
> in Xen.
> 
> Hi,
> 
> Thanks for the updated patch.
> 
> At 15:36 -0500 on 07 Nov (1352302577), Robert Phillips wrote:
> > > More substantive comments:
> > >  - I think the dirty_vram.c and dirty_vram.h files belong under mm/
> > >    rather than under hvm/.  The ``#include "../mm/mm-locks.h"'' is
> > >    an indicator that this is really MM code.
> >
> > I have moved dirty_vram.c under mm/
> > I have moved dirty_vram.h under include/asm-x86 .  In that location it
> > is available to modules like hvm.c
> 
> Sure, that seems good.
> 
> > > I'd be
> > > inclined to just put all paddr_links in the linked list (and have an
> > > array of pointers rather than an array of paddr_link_ts).  Is it worth
> > > having the extra complexity here, and at the callers, to avoid a single
> > > memory read?
> >
> > Almost all frame buffer pages have a single mapping so the algorithm's
> > common case is met by constructing pl_tab as an array of paddr_links.
> > That is, the table's paddr_links will rarely point to a chain of
> > extension paddr_links.  The code encapsulates the complexity in a
> > single function.  Sorry it required so much staring.  Even if all
> > paddr_links were stored in a linked list, the caller would have to
> > walk that linked list, so it would be just as complex.
> 
> OK.
> 
> > > > +        if (set)
> > > > +        {
> > > > +            /* Did we find sl1ma in either the main table or the 
> > > > linked list? */
> > > > +            if (pl == NULL) /* no, so we'll need to alloc a link */
> > > > +            {
> > > > +                ASSERT(ppl != NULL);
> > > > +                /* alloc link and append it to list */
> > > > +                (*ppl) = pl = alloc_paddr_link(d);
> > > > +                if (pl == NULL)
> > > > +                    goto out;
> > >
> > > This needs to signal some sort of error.  Otherwise, if we can't add
> > > this sl1e to the list we'll just silently fail to track it.
> >
> > I don't know what sort of error we can signal.
> >
> > The immediate symptom would be that areas of the monitor would not be
> > refreshed.  But, since we're running out of memory, that might be the
> > least of the guest's woes.  The updated patch actually makes the
> > symptom more likely (though still very unlikely) by putting an
> > arbitrary bound on the length of paddr_link chains.  It handles a
> > rogue process, one that has an arbitrarily large number of mappings
> > for frame buffer pages, by simply not recording the excessive
> > mappings.  If that results in unrefreshed blocks on the monitor, so be
> > it.
> 
> I think the correct behaviour would be to report these pages as dirty.
> That way anything that relies on seeing all changes will behave
> correctly, though less efficiently.

Ok, I've associated a boolean called "stuck_dirty" with each frame buffer page.
If for any reason we're unable to generate a complete set of mappings,
the bit gets set and that frame buffer page is considered dirty forever more,
or until the range gets torn down.

I was unable to make a failure happen (not surprisingly) so I added some
fault injection code for testing.  It's currently disabled/compiled-out.

> 
> > > > +#endif
> > > > +                free_paddr_link(d, ppl, pl);
> > > > +                if ( --range->nr_mappings == 0 )
> > > > +                {
> > > > +                    dirty_vram_range_free(d, range);
> > >
> > > What's this for?  If the guest unmaps the framebuffer and remaps it (or
> > > if the shadow PTs of the mappings are temporarily discarded) this will
> > > stop us from tracking the new mappings until the toolstack asks for the
> > > bitmap (and then it will be expensive to go and find the mappings).
> > >
> >
> > I don't see this happening.  If the guest unmaps the framebuffer, the
> > shadow code lazily recovers the shadow pages, so tracking will
> > continue until it decides a page is no longer a shadow page.  That is
> > when this code is invoked.
> 
> If the guest unmaps the buffer by clearing the PTEs then when the PTE
> changes are propagated into the shadows this will cause nr_mappings to
> fall to zero here.  If we _don't_ free the range, then when the FB is
> mapped again we'll spot the new PTEs as they're shadowed and DTRT.  If we
> _do_ free the range we end up walking all shadows looking for the
> mappings.
> 

Ok, I've removed the teardown code.  Ranges are no longer torn down
just because they have no mappings.  But they can still be torn down
if a new, overlapping range is created, or if dirty VRAM bookkeeping
is stopped altogether, which happens as a side-effect of requesting
a range with zero pages.

> > It tears down the mappings to that page and if some range ends up
> > with no mappings then the range is useless.  Vram ranges are generated
> > willy-nilly as needed.  This is the only mechanism for cleaning them up.
> 
> Shouldn't there be some way for the tools to indicate that they're done
> with a range?  I guess they can tear down the whole lot and then start
> again with whatever ranges are still in use.

Right.  Like little children, they do not clean up after themselves.

> 
> > > Oughtn't we to return all 1s in the bitmap here?  If the range isn't
> > > currently being tracked we should conservatively assume it's all dirty,
> > > right?
> >
> > The callers should ensure that the range exists.  This is just a
> > conservative test.
> 
> OK.
> 
> > Anyway, it raises the question of whether a new range should be
> > considered all dirty or all clean.  Intuitively "all dirty" seems the
> > right answer but in practice it works fine as written since the guest
> > is busy updating the whole frame buffer.
> 
> What if the guest's not writing to the framebuffer at the moment, or
> only to a part of it?  Shouldn't the tools see the existing content?

Ok, in HAP a new range is considered to be "all dirty".

> 
> > > And, since you know you're in HAP mode, not in log-dirty mode, and
> > > already have the paging lock, you can just set
> > > d->arch.paging.mode |= PG_log_dirty there rather than jumping through
> > > the paging_log_dirty_enable() path and messing with locks.
> >
> > No, paging_log_dirty_enable() goes through several layers of functions
> > and ends up calling hap_enable_vram_tracking(), which does quite a bit
> > of stuff.
> 
> OK.
> 
> > > > +/* paging_mark_dirty_hap()
> > > > + * Make a hap page writeable and mark it as dirty.
> > > > + * This is done atomically under the p2m and paging locks to avoid
> > > > + * leaving a window where the page might be modified without being
> > > > + * marked as dirty.
> > > > + */
> > >
> > > I'm perplexed by this -- AFAICT it's either not necessary (because all
> > > log-dirty read/clean ops are done with the domain paused) or not
> > > sufficient (because although the bitmap and the PTE are updated under
> > > the p2m lock, the actual dirtying of the page happens at some other
> > > time).  Can you spell out for me exactly what this is protecting
> > > against?
> >
> > The comment over-stated the problem so I've toned it down
> > from "without being marked" to "without being counted".
> >
> > paging_mark_dirty_hap() is in the page fault path and has two steps:
> >
> > (1) It calls p2m_change_type() to re-mark some pfn as writeable
> >     (i.e. p2m_ram_rw), which is a no-op if the pfn is already writeable.
> >     This must be done under the p2m_lock.
> >
> > (2) If the pfn was previously read-only (i.e. p2m_ram_logdirty) then
> >     it bumps two dirty counts.  And it marks the page as dirty.  This must
> >     be done under the paging lock.
> >
> > As an invariant, the dirty counts should be precisely the number of pages
> > made writeable.
> 
> That invariant is already doomed -- any other agent (including the guest
> itself) can change the p2m mappings for one of the pfns, or mark the pfn
> dirty, without going through the NPF fault handler.  AFAICS the counts
> could just as well be implemented as flags, to say 'something in this
> range was dirtied'; the only interesting case is when they're 0.
> 
> > It may be that this invariant is not particularly useful, that things
> > will just work out.  But I do wonder if, without it, we'll have
> > situations where the dirty counts will indicate dirty pages when there
> > are none, or no dirty pages when there are some.
> 
> The second of those cases would be the bad one; I think that can be
> avoided by just switching the order of paging_mark_dirty() and
> p2m_change_type() in hvm_hap_nested_page_fault(), and maybe putting a
> wmb() between them.

Ok, I've done as you suggested.  I don't think the wmb() is needed
but I could be convinced.

> 
> Cheers,
> 
> Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

