
Re: [Xen-users] Frequent rpm db corruption / lock-up


  • To: Xen-Users <xen-users@xxxxxxxxxxxxxxxxxxx>
  • From: "Gino LV. Ledesma" <gledesma@xxxxxxxxx>
  • Date: Thu, 22 Sep 2005 15:08:05 -0700
  • Delivery-date: Thu, 22 Sep 2005 22:05:49 +0000
  • List-id: Xen user discussion <xen-users.lists.xensource.com>

Thanks for the reply.

This is exactly what I have been experiencing, which is very odd. I
remember seeing this during the 2.4 -> 2.6 migration, and I know it
has something to do with NPTL.

Only the domUs are affected (all running CentOS 4.1 with latest
updates applied).

When I run "yum search foo" 10 times in a row, the problem can show
itself in the 4th or 5th run. It is worse when two processes access
the rpm db at the same time -- e.g. someone doing rpm -qf <foo> while
another does rpm -qa.
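
A rough sketch of that repeated-run test follows. It is only an
illustration: the function name repeat_until_failure is made up, and it
assumes a `timeout` utility (GNU coreutils) is installed, which may not
be the case on every system. A run that exceeds the time limit is
killed and counted as a hang.

```shell
#!/bin/sh
# Run a command repeatedly and stop at the first run that fails or hangs.
# Assumption: a `timeout` utility (GNU coreutils) is available on PATH.
# Usage: repeat_until_failure RUNS SECONDS COMMAND [ARGS...]
repeat_until_failure() {
    runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        # A run still alive after $limit seconds is killed with SIGKILL
        # and treated the same as a run that exits non-zero.
        if ! timeout -s KILL "$limit" "$@" >/dev/null 2>&1; then
            echo "run $i failed or timed out"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs completed"
    return 0
}

# Example, as root inside a domU (the 120 second limit is arbitrary):
# repeat_until_failure 10 120 yum search foo
```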

- gino

On 9/22/05, Ted Kaczmarek <tedkaz@xxxxxxxxxxxxx> wrote:
> On Thu, 2005-09-22 at 11:26 -0700, Gino LV. Ledesma wrote:
> > Hi, all
> >
> > After deploying several xen hosts for various purposes (staging,
> > production, development, qa, etc), I've been seeing strange problems
> > involving rpm and yum. Two of the most frequently occurring problems
> > are:
> >
> > 1. Indefinite lock-up / "hang" when using yum or rpm -- an strace
> > shows the process waiting on a futex, apparently forever (I'm not
> > sure whether it ever completes).
> >
> > 2. rpmdb corruption -- rpm / yum complains about incorrect db version,
> > corrupted index, or whatnot.
> >
> > In both cases, deleting /var/lib/rpm/__db* and optionally
> > recreating the db via rpm --rebuilddb fixes it, but the problem
> > recurs later. One workaround I've found is to force it to use
> > LD_ASSUME_KERNEL=x, where x is a really old kernel (pre-2.4.6).
> >
> > Anyone else observed this problem? I'm not sure if /lib/tls is to be
> > blamed (I left it in place because I need db4 and a lot of other
> > things). Using xen-2.0.7 and noticed this with 2.0.6 as well.
> >
> > As always, thanks for the help in advance. :-)
> >
> > - gino
>
> I have a CentOS 4.1 host that was running 2.0.7 with CentOS 4.1 and
> FC4 domUs and did not see any such issues. It is running 2.0 testing
> right now. I just ran yum updates on a CentOS 4.1 and an FC4 VM with
> no problems.
>
> That rpm issue should be very sporadic with newer versions of rpm. I
> used to see it quite a bit in the early NPTL / older rpm days, but it
> is very rare these days.
>
> I rarely had to rebuild the db to fix it; generally just removing the
> rpm db files resolved it.  This was RH8 through FCX.
>
> kill -9 "pid of rpm"
> rm -rf /var/lib/rpm/__db*
>
> Regards,
> Ted
>
>
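
The cleanup discussed in this thread could be wrapped up roughly as
follows. This is a sketch, not a tested procedure: clear_rpm_db_env is
a hypothetical helper name, and it targets /var/lib/rpm/__db*, the path
given in the original report. It assumes any hung rpm or yum process
has already been killed.

```shell
#!/bin/sh
# Remove the __db* files from an rpm database directory, leaving the
# Packages file and the other database files untouched.
# Usage: clear_rpm_db_env [DIR]   (DIR defaults to /var/lib/rpm)
clear_rpm_db_env() {
    dbdir=${1:-/var/lib/rpm}
    if [ ! -d "$dbdir" ]; then
        echo "no such directory: $dbdir" >&2
        return 1
    fi
    # -f keeps rm quiet when no __db* files are present.
    rm -f "$dbdir"/__db*
}

# Typical sequence as root; the rebuild step is optional:
# clear_rpm_db_env && rpm --rebuilddb
```

Taking the directory as an argument makes it possible to try the
helper on a copy of /var/lib/rpm before touching the live database.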
