
[Xen-devel] Idea for future xen development: PV file system


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: George Shuklin <george.shuklin@xxxxxxxxx>
  • Date: Tue, 31 Aug 2010 15:35:43 +0400
  • Cc: xen-users@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 31 Aug 2010 04:38:52 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Good day.

A few days ago I had a discussion with colleagues about file-based space
provisioning for Xen VMs. The obvious solution is a 'pure NFS' VM (with the
root file system on an NFS server). But NFS has two serious disadvantages:
1) NFS requires the network layer. Even if the network is fast, you need to
send and receive packets for every file operation. That creates a huge
performance gap between a local FS on a network block device and NFS.
2) NFS is not very secure. Yes, there is NFSv4 with Kerberos, but it changes
the user-rights model, so the NFS server administrator must be aware of every
user of every VM. That is the opposite of the NFSv3 idea of 'here is your
directory, do with it what you want'.
(I could name a third: you need a working network inside the VM just to use
NFS.)

So, there are two main problems: network overhead and weak security. ... But
what if we could solve them with the nice and fast Xen mechanisms for
inter-domain communication (IDC)?

All we need is a daemon in dom0 that exports directories from its 'own' local
filesystem (which can reside on a local disk, on iSCSI, on GFS, or on any kind
of hyper-mega-cluster-fs) and a simple PV driver in domU that forwards requests
from the domU to the daemon in dom0. It would be like the Xen block device:
minimalistic and simple. It would offload all the real work to well-tested
production filesystems in dom0 (it does not even have to be dom0; it could be
any other stub domain).
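
To make that concrete, here is a minimal sketch of what one request/response
pair on a shared ring could look like. None of this exists in the Xen tree;
the names and fields are purely hypothetical, loosely modelled on the style of
the blkif ring protocol:

/* pvfsif.h -- hypothetical wire format for the proposed pvfs split driver.
 * Nothing here exists today; it only imitates the blkif style. */
#include <stdint.h>

#define PVFSIF_OP_OPEN     0
#define PVFSIF_OP_READ     1
#define PVFSIF_OP_WRITE    2
#define PVFSIF_OP_READDIR  3
#define PVFSIF_OP_CLOSE    4

#define PVFSIF_MAX_PATH  256

struct pvfsif_request {
    uint64_t id;          /* echoed back in the response             */
    uint8_t  operation;   /* one of PVFSIF_OP_*                      */
    uint32_t gref;        /* grant reference of the shared data page */
    uint64_t offset;      /* file offset for READ/WRITE              */
    uint32_t length;      /* bytes to transfer, at most one page     */
    char     path[PVFSIF_MAX_PATH]; /* path relative to the export   */
};

struct pvfsif_response {
    uint64_t id;          /* matches the request id                  */
    int32_t  status;      /* 0 or a negative errno from dom0's FS    */
    uint32_t length;      /* bytes actually transferred              */
};

The whole frontend would then reduce to filling such requests and sleeping on
an event channel, just like blkfront does.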

All the 'xen pvfs' driver would need to do is forward every filesystem request
to the pvfs daemon via shared memory, xenstore, and Xen event channels.
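
The handshake could follow the usual split-driver pattern over xenstore.
Again, this is only a guess at how the nodes might be laid out (no such paths
exist anywhere today; "/srv/exports/vm17" is just an example export):

  backend side (written by the toolstack / pvfs daemon):
    /local/domain/0/backend/pvfs/<domid>/<handle>/export         = "/srv/exports/vm17"
    /local/domain/0/backend/pvfs/<domid>/<handle>/state          = "<XenbusState>"
  frontend side (written by the domU driver):
    /local/domain/<domid>/device/pvfs/<handle>/ring-ref          = "<grant ref of the shared ring>"
    /local/domain/<domid>/device/pvfs/<handle>/event-channel     = "<event channel port>"
    /local/domain/<domid>/device/pvfs/<handle>/state             = "<XenbusState>"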

The job of the daemon in dom0 (or in a stub domain) would be somewhat bigger.
It would need to:
1) check access rights (perhaps NFS-style, but keyed on domID instead of IP
addresses);
2) translate the requests to the underlying filesystem;
3) translate the filesystem replies (including errors and so on).

I see a fourth job: maintaining a disk quota for every domain. Because access
to the filesystem will (probably) happen as root, and root differs between
domUs, we need some extra checking at the daemon level.
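
As a very rough sketch of that daemon-side work (using the hypothetical
pvfsif structures above, with the ring/grant plumbing and the real policy
checks left out), the handler for one request could be as small as this:

/* Hypothetical request handler inside the dom0 pvfs daemon.
 * 'root' is the directory exported to this particular domU and 'buf' is
 * the granted data page; access and quota checks are only placeholders. */
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static int32_t handle_request(const char *root,
                              const struct pvfsif_request *req,
                              struct pvfsif_response *rsp, void *buf)
{
    char path[1024];
    ssize_t n;
    int fd;

    rsp->id = req->id;
    rsp->length = 0;

    /* 1) access check: is this domID allowed to use this export?   */
    /* 2) quota check: would a write push this domU over its limit? */
    /*    (the real policy goes here; omitted in this sketch)       */

    if (snprintf(path, sizeof(path), "%s/%s", root, req->path) >= (int)sizeof(path))
        return rsp->status = -ENAMETOOLONG;

    switch (req->operation) {
    case PVFSIF_OP_READ:
        fd = open(path, O_RDONLY);
        if (fd < 0)
            return rsp->status = -errno;      /* 3) pass FS errors back */
        n = pread(fd, buf, req->length, req->offset);
        close(fd);
        if (n < 0)
            return rsp->status = -errno;
        rsp->length = (uint32_t)n;
        return rsp->status = 0;
    default:
        return rsp->status = -ENOSYS;         /* other ops look similar */
    }
}

Everything heavy (caching, journaling, locking) stays inside the dom0
filesystem, exactly as blkback leaves the block layer to do the real work.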

Now about the advantages of this system:

1) We gain file-level access. That means easy access to a domU's filesystem
without any mount/umount procedure.
2) We can use a single volume for several virtual machines, which increases
server consolidation.
3) Currently VMs have real trouble with on-the-fly block device size changes.
We can resize filesystems, but even LVM is not ready for PV size changes, and
if we are talking about shrinking, every FS does it slowly, so it cannot be a
routine process like Xen ballooning. With a single FS we can share one pool of
space between all VMs (and quotas will help keep every VM within its limits).
4) Any deduplication will work more efficiently at the file level than at the
block level.
5) ... and we can start thinking about an OpenVZ-like system with container
templates (where duplicated files on the filesystem are simply links to the
original).


Now for the sad part. I would like to say 'I will write it', but my
programming skill is far below what is required, so I am simply asking for
someone to help do it. Or, at least, to share an opinion on this feature.


---
wBR, George.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

