Re: Mirage/kFreeBSD on GitHub



On 4 Aug 2012, at 07:02, PALI Gabor Janos wrote:

> And that is where perf/alloc and perf/gcperf grind the machine to a halt:
> after the first few size tests, the system becomes unresponsive and one can
> only switch between the virtual terminals, but nothing more.
> 
> I am also asking Robert: do you have any ideas on how to debug this?

Hi Gabor:

Sounds like great progress!

On debugging: is this a multi-CPU system or a single-CPU system? Are you able 
to break into the debugger using the break sequence?

Being able to switch VTYs means that interrupt delivery is enabled, and that 
the kernel thread scheduler is working. Two scenarios come to mind:

1. This is a single-CPU system and something in the kernel (OCaml?) is running 
in a loop, preventing userspace from running. If you use a multi-CPU system you 
might find userspace is able to run on another CPU, and a tool like top (using 
-SH or similar) spots the problem.

2. Something in the kernel is holding a lock and not releasing it, due to 
deadlock, a lock leak, etc., and other parts (such as userspace) have ended up 
stacked up behind it.

This is best debugged using the kernel debugger, especially if there's a 
locking issue. Break into the debugger, if possible (likely, as you can switch 
VTYs) and use WITNESS's "show alllocks" to see if there's an obvious candidate 
for scenario 2.

It occurs to me that there is synchronisation around the sysctl syscall in 
order to stabilise the sysctl tree while a sysctl is in use. There are a number 
of long-running sysctls (e.g., those used in bgfsck) and so I wouldn't have 
thought that there were any notable locks that could stack up behind it. 
However, it would be something to check.

Also, if you're kicking off your code with sysctl, is there some mutual 
exclusion or other mechanism to deal with the possibility that more than one 
instance of the sysctl might get started at once? (e.g., return EBUSY in thread 
two if thread one is already running the OCaml code).
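
To make the EBUSY idea concrete, here is a minimal sketch of such a guard. It is 
written as userspace C11 so it can be compiled and tried outside the kernel; in 
the kernel one would presumably use the atomic(9) primitives or a lock instead. 
The names ocaml_busy, ocaml_guard_enter and ocaml_guard_leave are made up for 
illustration and are not taken from the Mirage/kFreeBSD code.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical flag: set while one caller is running the long-lived code. */
static atomic_flag ocaml_busy = ATOMIC_FLAG_INIT;

/*
 * Try to become the single running instance.
 * Returns 0 on success, or EBUSY if another caller already holds the guard.
 */
int
ocaml_guard_enter(void)
{
	/* test_and_set returns the previous value: true means already taken. */
	if (atomic_flag_test_and_set(&ocaml_busy))
		return (EBUSY);
	return (0);
}

/* Release the guard once the long-running work has finished. */
void
ocaml_guard_leave(void)
{
	atomic_flag_clear(&ocaml_busy);
}
```

A sysctl handler would call ocaml_guard_enter() first, return its error straight 
away if it is non-zero, and call ocaml_guard_leave() on every exit path after 
the OCaml code has run. The second thread then fails fast rather than blocking 
behind the first.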

Robert

