
Re: [Xen-users] Ceph + RBD + Xen: Complete collapse -> Network issue in domU / Bad data for OSD / OOM Kill



Hi,


> I would suggest trying kvm instead of xen...

There are a lot of different setups that might work, but I'm not
looking for a different setup; I want to fix this one.
We already have a complete running infrastructure using Xen, with an
iSCSI NAS as the backend for the VM images and Lustre as the
distributed FS, and I want to replace all of that with Ceph.


>      "net eth0: rx->offset: 0, size: 4294967295"
>
> Seems that there is something wrong with networking code in xen?

The problem doesn't appear with Ceph on its own or Xen on its own, so
it's a weird interaction between them, and it could very well be that
the network code of Ceph does something unexpected that triggers this
behavior in Xen. At this point I wouldn't exclude anything ...
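
One thing that can be said about the quoted log line: the size it
reports, 4294967295, is 2^32 - 1, which is what -1 looks like when it
is read as an unsigned 32-bit integer. That is only an observation
about the number, not a diagnosis of where it comes from. A quick
check in Python (the helper name as_signed32 is made up for this
illustration):

```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    # Pack as unsigned 32-bit, then unpack the same bytes as signed 32-bit.
    return struct.unpack("<i", struct.pack("<I", value))[0]


size = 4294967295  # value taken from the "net eth0: rx->offset" log line

print(size == 2**32 - 1)   # is it the largest unsigned 32-bit value?
print(hex(size))           # bit pattern in hexadecimal
print(as_signed32(size))   # same bits read as a signed integer
```

So the field may have started life as a negative value or an
underflowed subtraction somewhere, but the log line alone doesn't say
on which side.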

Also, even if the root fault lies with Xen, I think the runaway
memory issue that this triggers inside the OSD is worth fixing on
its own. A badly behaving client shouldn't be able to DoS the OSD so
easily.


Cheers,

    Sylvain

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

