
Re: [Xen-devel] Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)



On 29/07/15 at 11:03, Ian Campbell wrote:
> On Tue, 2015-07-28 at 15:47 -0400, Konrad Rzeszutek Wilk wrote:
>> Hey,
>>
>> I launch a bunch of guests at the same time or in parallel and 
>> the scripts end up timing out with:
> 
> Are you sure you have cleaned out all the old udev .rules files? If any of
> those are still present then you will get both sets competing to drive
> things and they will conflict and cause this sort of breakage.
> 
> Perhaps we should put back the hacks which nobble the udev case for another
> release? i.e. the thing which writes the path (but unconditional in
> xencommons) and the bit in the hotplug scripts which gates on it, but still
> remove the .rules files. That's only delaying the inevitable though, since
> upgrades to 4.7 will have the same issue.
> 
> Perhaps in the scripts themselves:
> 
> if [ -n "${UDEV_CALL}" ] ; then
>       error "called through udev, please remove stale udev rules files"
> fi
> 
> relying on the (stale) 4.5 rules file having the UDEV_CALL=1 in them.

Another option would be to install an empty xen-backend.rules for the
4.6 release, and then remove it for 4.7.

I've also been able to trigger this by using a similar loop. AFAICT the
hotplug scripts are running correctly. The problem seems to be the
check_sharing function, which is executed for *every* loop device that
points to the same file and scans xenstore to find out whether the loop
device is also used by another guest. When 20 guests are launched in
parallel, CPU consumption in Dom0 is quite high because of all the Qemu
processes, and the xenstore daemon is basically starved of CPU time.

IMHO, we should remove these checks and allow users to shoot themselves
in the foot if they want to; in fact, that's what I did on FreeBSD.

What I still don't understand is why this only triggers with 2ba368
applied. My best guess is that you still have a stale xen-backend.rules
file, so you are actually calling the hotplug scripts twice and creating
twice as many loop devices for each guest, which of course slows things
down even more.
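
To check for a leftover rules file, something along these lines might
help. It is only a rough sketch: find_stale_xen_rules is a made-up
helper name, and the directories passed to it are my assumption about
where an older install could have placed the file, so adjust them for
your distro or prefix.

```shell
#!/bin/sh
# Hypothetical helper: print any leftover Xen udev rules files found in
# the directories given as arguments.
find_stale_xen_rules() {
    for dir in "$@"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        # Match xen-backend.rules, with or without a numeric prefix.
        for f in "$dir"/*xen-backend*.rules; do
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
}

# The directory list below is an assumption, not taken from the Xen tree.
find_stale_xen_rules /etc/udev/rules.d /lib/udev/rules.d /usr/lib/udev/rules.d
```

If this prints anything on a box running a build with 2ba368 applied,
udev is most likely still invoking the hotplug scripts in addition to
libxl.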

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

