
Re: [MirageOS-devel] Mirage tracing

On 9 October 2014 13:31, Len Maxwell <len@xxxxxxxx> wrote:
> Hello,
> This looks like a good start.  I've created similar specialized tools for
> post-processing and visualization in the past, and they are usually
> worth the investment.
> On Tue, Oct 7, 2014 at 11:09 AM, Thomas Leonard <talex5@xxxxxxxxx> wrote:
>> By default, threads just appear as numbers, but I added some code to
>> my unikernel to label some of them (I didn't annotate any of the Mirage
>> libraries, though). For example, the "B.read" thread is what the
>> unikernel gets back when it calls B.read on the block device, but you
>> can see that behind this, mirage-block-xen is creating other threads.
> Looking at your example, threads 930 and 945 are ostensibly created
> from the same line of code, during separate iterations through the
> read loop.
> Would it be practical to capture the caller's PC during thread creation,
> and dump that out with the Lwt.current_id?  With that you could label
> 930 and 945 in a way that makes their common caller more obvious,
> ex. 930-0x1234 and 945-0x1234 vs 929-0x4567.
> (Or better yet, translate PC to source/line info during post-processing?)
> In other words, some amount of automated labeling might be a good
> complement to manual labeling.

The PC will be somewhere in Lwt, and inlining will likely make walking
the stack unreliable. But something like the camlp4 annotations Anil
mentioned would probably help.
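The automated-labeling idea could be prototyped independently of Lwt's internals. Here is a small stdlib-only OCaml sketch (everything here is hypothetical: the create_thread hook and the call-site strings stand in for whatever the real tracing hook would capture, e.g. a PC or __LOC__) of mapping thread ids to creation sites so that 930 and 945 would both render as "NNN-0x1234":

```ocaml
(* thread id -> creation-site identifier *)
let sites : (int, string) Hashtbl.t = Hashtbl.create 16

let next_id = ref 0

(* Stand-in for an Lwt thread-creation hook; [site] would be
   derived from the caller's PC or a __LOC__-style annotation. *)
let create_thread ~site () =
  incr next_id;
  Hashtbl.replace sites !next_id site;
  !next_id

(* Label a thread with its id plus creation site, if known. *)
let label id =
  match Hashtbl.find_opt sites id with
  | Some site -> Printf.sprintf "%d-%s" id site
  | None -> string_of_int id

let () =
  let t1 = create_thread ~site:"0x1234" () in
  let t2 = create_thread ~site:"0x1234" () in
  let t3 = create_thread ~site:"0x4567" () in
  List.iter (fun id -> print_endline (label id)) [t1; t2; t3]
```

Post-processing could then translate the site identifiers to source/line information, as suggested.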

> Once you can recognize common patterns in thread creation, you can
> present them more clearly in the visualization.  The resulting data might
> also be amenable to flame graph visualization:

One problem with using flame graphs here is that threads run in
parallel (and may outlive their parents, though you can't see that in
this example image). For example, how would you account for a thread
that spawns 10 parallel sleep subthreads and then joins on them?
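To make the accounting problem concrete, this stdlib-only OCaml sketch (the interval values are illustrative, not from any real trace) compares flame-graph-style summed durations against wall-clock time for ten overlapping one-second sleeps:

```ocaml
(* Ten parallel "sleeps" as (start, stop) intervals, all
   overlapping the same second of wall-clock time. *)
let subthreads = List.init 10 (fun _ -> (0.0, 1.0))

(* Flame-graph-style accounting: sum each thread's duration. *)
let summed =
  List.fold_left (fun acc (a, b) -> acc +. (b -. a)) 0.0 subthreads

(* Wall-clock accounting: earliest start to latest stop. *)
let wall =
  let starts = List.map fst subthreads
  and stops = List.map snd subthreads in
  List.fold_left max neg_infinity stops
  -. List.fold_left min infinity starts

let () = Printf.printf "summed: %.1fs, wall-clock: %.1fs\n" summed wall
(* prints: summed: 10.0s, wall-clock: 1.0s *)
```

A stack-based view that sums child durations would charge the parent 10 seconds for work that actually took 1 second of elapsed time, which is why parallel, possibly parent-outliving threads fit the flame graph model poorly.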

> http://www.brendangregg.com/flamegraphs.html
> http://www.brendangregg.com/Slides/LISA13_Flame_Graphs.pdf
> Out of curiosity, what happened to thread 934?  Is that a failure, or
> an unread result that gets garbage collected?

I updated the picture with a new trace, which includes more labels and
lays out "tail-recursive" calls (where a thread merges with a thread
it previously created) horizontally:

( updated trace file: http://test.roscidus.com/static/log-x86.sexp )

Unfortunately, I overwrote the originals so I'm not sure exactly which
threads you're referring to. The new layout hopefully makes it clear
that there are two main things going on here:

- the test unikernel's loop reading from the block device ("Process data")
- the blkfront polling loop, processing responses from dom0 ("blkfront.poll")

I suspect there are some missing arrows. e.g. 425 gets notified by
blkfront and then the Process data thread suddenly resolves 426 for no
obvious reason. Probably just a missing notification somewhere.

Also, I need to add an indication of when we send a request on an
event channel. Here, we only see the replies.

Dr Thomas Leonard        http://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

MirageOS-devel mailing list


