
RE: [Xen-devel] [PATCH] [linux-2.6.18-xen] transcendent memory ("tmem") linux-side changes



> A few general remarks:

Excellent!  Thanks very much for the feedback!

A few comments and explanations below.

Your comments and feedback will be especially useful
when proposing the patch upstream.  After reading
my replies, please let me know if there is anything
you see as a *must-fix* for the patch to be put into
the (soon-to-be-obsolescent) 2.6.18-xen tree, where
its primary goal is to allow tmem to be more easily
used by xen developers and leading-edge xen users.
 
> >+    unsigned long *preswap_map;
> >+    unsigned int preswap_pages;
> 
> Missing #ifdef CONFIG_PRESWAP?

I had one there in an earlier draft, but it causes more #ifdefs
elsewhere.  Since swap_info_struct is so rarely instantiated (one per
swap device), I thought avoiding the additional #ifdefs might be more
attractive than saving a few bytes system-wide.  Thoughts?
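For concreteness, the guarded variant would look something like this (a
sketch only; the two field names are from the patch, the rest of the
struct is abbreviated and CONFIG_PRESWAP is defined inline just for the
sketch):

```c
#include <assert.h>
#include <stddef.h>

#define CONFIG_PRESWAP 1	/* for this sketch only; normally from Kconfig */

/* Abbreviated swap_info_struct showing where the guard would sit.
 * Guarding saves sizeof(unsigned long *) + sizeof(unsigned int) per
 * swap device, at the cost of #ifdefs at every use of these fields. */
struct swap_info_struct_sketch {
	unsigned int flags;		/* stand-in for the existing fields */
#ifdef CONFIG_PRESWAP
	unsigned long *preswap_map;	/* one bit per page in this device */
	unsigned int preswap_pages;	/* pages currently held in preswap */
#endif
};
```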

> >+static inline void *preswap_malloc(unsigned long size)
> >+{
> >+    return vmalloc(size);
> 
> I would view this as problematic on 32-bit, as the vmalloc space is
> rather restricted there. Not because the allocation here may fail, but
> because it may cause failures elsewhere. Some feedback on the amount of
> vmalloc space used and/or still available is going to be needed here
> I'd think.

The swap map requires one vmalloc'ed byte for every page in every swap
device.  Preswap_map requires one additional vmalloc'ed bit for every
swap-map byte.  Is this likely to be a problem then?  By "some
feedback", do you mean I should add a comment?
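To put rough numbers on it (a standalone sketch; the function name is
mine, not from the patch): for a 4GB swap device with 4K pages, the swap
map costs 1MB of vmalloc space and the preswap map adds 128KB on top.

```c
#include <assert.h>

/* Bytes of vmalloc space for a one-bit-per-page map, rounded up to
 * whole unsigned longs, as the vmalloc'ed preswap_map would be. */
static unsigned long preswap_map_bytes(unsigned long nr_pages)
{
	unsigned long bits_per_long = 8 * sizeof(unsigned long);
	unsigned long longs = (nr_pages + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

So a 4GB device (1 << 20 pages of 4K) needs 131072 bytes (128KB) for the
preswap map, versus 1MB for the one-byte-per-page swap map.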

> >+extern struct swap_list_t swap_list;
> > extern spinlock_t swap_lock;
> 
> That's not good - swap_list being non-static in 2.6.18 is just an
> oversight afaict, which is being fixed in later kernels. I think it
> would be cleaner (also wrt. a possible pv-ops implementation) to move
> some or all of the code in mm/preswap.c into mm/swapfile.c (would
> likewise eliminate the need to remove the 'static' from try_to_unuse()).

In recent kernel versions, swap_list seems to have bounced back and
forth between static and non-static.  It isn't clear to me whether this
is for data-hiding or just because someone noticed it wasn't being used
in other source files.

I thought about adding list-walking macros, but it seemed
a bit extreme just to avoid extern-ifying a single static.

In any case, unless you object, I'd like to leave this
as is for 2.6.18-xen but I do expect some feedback and
possible changes when proposing it upstream.  (And for
upstream, I'd rather start with a cleaner diffstat, then
move chunks of code to swapfile.c in response to maintainer
feedback :-)

> >+config TMEM
> >+    bool "Transcendent memory support"
> >+    def_bool y
> >+    depends on XEN
> >+    help
> >+      In a virtualized environment, allows unused and underutilized
> >+      system physical memory to be made accessible through a narrow
> >+      well-defined page-copy-based API.  If unsure, say Y.
>
> 'bool' followed by 'def_bool'? And even more, an experimental (as I
> would view it at this stage) feature that defaults to 'yes'?

I have to admit complete bafflement at Kconfig syntax.  It
seems I can never get it right.  (Is there a good README
someplace that I can learn from?)

It would certainly never default to 'yes' upstream, but I thought it
might be OK in 2.6.18-xen, especially since it does essentially nothing
if Xen doesn't have tmem (or has it but it is not enabled).

If 'no' is best, what's the best Kconfig syntax for that?
(I tried changing "def_bool y" to "def_bool n" once and
it didn't make any difference!)
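For the record, my current best guess at the 'no' syntax (untested; my
understanding is that "def_bool y" both declares the type and sets the
default, so pairing it with a separate "bool" prompt line was the
redundancy):

```kconfig
config TMEM
	bool "Transcendent memory support"
	default n
	depends on XEN
	help
	  In a virtualized environment, allows unused and underutilized
	  system physical memory to be made accessible through a narrow
	  well-defined page-copy-based API.  If unsure, say N.
```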
 
> Also, does selecting TMEM on its own have any meaning? If not (which I
> think is the case), then 'select'ing TMEM from the other two options
> may be more appropriate (while TMEM then shouldn't have a prompt
> anymore).

I am anticipating other kernel uses for tmem, though what
you suggest makes sense (if I could figure out the Kconfig
syntax to do it! :-)
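If I do go that route, I believe the shape would be something like this
(untested sketch; PRESWAP shown here, and the other option would do the
same):

```kconfig
config TMEM
	bool
	depends on XEN

config PRESWAP
	bool "Preswap support"
	depends on XEN && SWAP
	select TMEM
```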

> >+++ b/mm/tmem.c      Tue Jun 09 15:37:03 2009 -0600
> This is Xen-only code sitting in an apparently generic source ...
> While I partly understand your intention of potentially sharing the
> implementation with other hypervisors, I don't think Xen interface
> definitions belong here - you're not using include/xen/interface/tmem.h
> at all, while that's where the definitions should come from imo. If you
> really want this to represent an abstract interface, then ensuring
> consistency with the Xen interface would be a minimal requirement (by
> e.g. including the Xen header in the #ifdef CONFIG_XEN block). Otoh I'm
> getting the impression that the patch as a whole isn't following this
> abstract interface goal, so it should probably be consistent here and
> not try to do so in just a single place.

The concept of tmem is very generic and I foresee uses for it in the
native kernel.  I agree that the patch is halfway between Xen-specific
and abstract.  I hesitate to completely Xen-ify it, but I also can't
fully abstract it unless/until there is another user for it.

> >+static inline int tmem_new_pool(u64 uuid_lo, u64 uuid_hi, u32 flags)
> >+{
> >+    flags |= (PAGE_SHIFT - 12) << TMEM_POOL_PAGESIZE_SHIFT;
> 
> This hard-coded literal 12 seems dubious: Is this part of the tmem
> hypervisor interface? If so, there should be a #define in the interface
> header, plus a (ideally build-time) check that the value really fits
> TMEM_POOL_PAGESIZE_MASK.

Good point.  To explain, the tmem API supports other
pagesizes than 4K, but only pagesizes that are a multiple
of 4K.  I agree that I should have a mask (which is 4 bits,
allowing pagesizes up to 128MB) and a define for "12"
(TMEM_POOL_MIN_PAGESIZE_SHIFT).
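Something along these lines is what I have in mind (a sketch only; the
field position and mask values are assumptions rather than the actual
interface header, and in the kernel the assert would of course be a
BUILD_BUG_ON on PAGE_SHIFT):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed interface constants: a 4-bit pagesize field in the pool
 * flags, encoding (page_shift - 12), i.e. pagesizes from 4K (encoded
 * as 0) up to 2^27 = 128MB (encoded as 15). */
#define TMEM_POOL_MIN_PAGESIZE_SHIFT	12	/* 4K minimum page size */
#define TMEM_POOL_PAGESIZE_SHIFT	4	/* assumed field position */
#define TMEM_POOL_PAGESIZE_MASK		0xf	/* 4 bits */

static uint32_t tmem_pool_pagesize_flags(unsigned int page_shift,
					 uint32_t flags)
{
	unsigned int encoded = page_shift - TMEM_POOL_MIN_PAGESIZE_SHIFT;

	/* Sanity check: the encoded value must fit in the mask. */
	assert(encoded <= TMEM_POOL_PAGESIZE_MASK);
	return flags | (encoded << TMEM_POOL_PAGESIZE_SHIFT);
}
```

With a 4K PAGE_SHIFT the encoded field is 0, so the flags pass through
unchanged, which matches the current behavior of the literal 12.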

Thanks again, Jan!

Dan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

