
Re: [Xen-devel] [outreachy] progress



[11:24] <vr34> Hi!
[11:24] <vr34> This is Vaishnavi, outreachy applicant
[11:25] <vr34> i found a python implementation for the jwz threading algo
[11:25] <jgbarah> Hi, Vaishnavi
[11:25] <vr34> so what this does is allots a message id for each of the threads, right?
[11:26] <jgbarah> That's good. Use it as an inspiration if that suits you. But you need to write your code...
[11:26] <vr34> Oh okay
[11:26] <jgbarah> However, since this is a microtask, no problem if you start with a version which uses this code
[11:27] <vr34> what i have done till now is written a script that parses cmd line args and parses and uploads mbox json docs to es
[11:27] <jgbarah> the main problem for using it *as such* is that very likely it is suboptimal, since it assumes you have access to all messages
[11:28] <vr34> now what's left is - use the threading algo to get message ids and add this to the json documents and then upload again, am i right?
[11:28] <jgbarah> which in our case is not true, since you have them in the database, and the idea would be to minimize traffic with it
[11:28] <jgbarah> Yes, that is
[11:28] <jgbarah> So, if you want, try to do it in two phases:
[11:29] <jgbarah> in one, you can use the code you found. Forget about efficiency, and just make it work
[11:29] <jgbarah> In a second one, you can check if you can improve performance by using your own code.
[11:29] <jgbarah> The first one will tell about how you reuse code, which is important
[11:30] <jgbarah> The second one would tell about how you code the algorithm in a certain scenario
[11:30] <jgbarah> Both are important...
[11:30] <vr34> okay, got it!
[11:30] <jgbarah> To be transparent to others applying for this project, please send a message to the mailing list,
[11:30] <jgbarah> pointing to this implementation you found, and this conversation, please.
[11:30] <jgbarah> A log of it would be enough.
[11:30] <vr34> i had also sent you a mail with a link to my github repo
[11:31] <vr34> Yes sure, will do
[11:31] <jgbarah> Of course, the fact that you looked for, and found, that implementation, will be credited to you
[11:31] <jgbarah> I saw it (the message) but still didn't look at the code. Thanks.
[11:32] <jgbarah> Are you stumbling on any blocker?
[11:32] <vr34> sure,  thanks, i'll mail you if i have any further queries
[11:33] <vr34> i haven't yet started implementing the algo.. will definitely let you know when i have issues.. thanks a lot!
[11:33] <jgbarah> Good. Thanks! Please, keep me updated.
[11:33] <vr34> Sure.
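
For readers following the log, here is a minimal sketch of the annotation step discussed above: grouping messages into threads and tagging each one with the Message-ID of its thread root. This is a simplified parent-link walk, not the full JWZ algorithm (it does no subject-based grouping and no container pruning), and it is not the implementation mentioned in the chat. The field names message_id, in_reply_to, references and thread_id are assumptions chosen for illustration.

```python
def annotate_threads(messages):
    """Add a 'thread_id' key to each message dict.

    Simplified sketch, NOT the full JWZ algorithm. Field names are
    hypothetical; adapt them to the actual JSON documents.
    """
    parent = {}
    for m in messages:
        # Prefer the last entry of References, fall back to In-Reply-To.
        refs = m.get("references") or []
        candidate = refs[-1] if refs else m.get("in_reply_to")
        if candidate and candidate != m["message_id"]:
            parent[m["message_id"]] = candidate

    def root_of(mid):
        # Walk parent links up to the root; 'seen' makes sure the walk
        # terminates even if the headers contain a reference loop.
        seen = set()
        while mid in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    for m in messages:
        # If the root message itself is missing from the input, its
        # Message-ID is still used as the thread identifier.
        m["thread_id"] = root_of(m["message_id"])
    return messages


messages = annotate_threads([
    {"message_id": "<1@x>", "in_reply_to": None, "references": []},
    {"message_id": "<2@x>", "in_reply_to": "<1@x>", "references": ["<1@x>"]},
    {"message_id": "<3@x>", "in_reply_to": "<2@x>",
     "references": ["<1@x>", "<2@x>"]},
    {"message_id": "<4@x>", "in_reply_to": None, "references": []},
])
thread_ids = [m["thread_id"] for m in messages]
print(thread_ids)  # ['<1@x>', '<1@x>', '<1@x>', '<4@x>']
```

Note that this version needs every message in memory at once, which is the inefficiency jgbarah points out; a second phase would look up parents in the database instead.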


On Fri, Apr 14, 2017 at 11:06 AM, Vaishnavi Ramesh Jayaraman <vaishnavi.ur777@xxxxxxxxx> wrote:
Hi,

I have applied to Outreachy for the project - Xen Code Review Dashboard and based on Jesus' suggestions I have made an initial contribution. (There are more changes to be made, which I am still working on.)


I have created a script that accepts the mbox link as a command line argument, parses it and uploads the JSON documents that are obtained as an output from Perceval to ElasticSearch. The results can be queried too.
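
As an illustration of the parsing step only, a minimal sketch using Python's standard mailbox module might look like the following. It is not the script described above and does not use Perceval or ElasticSearch; the function name mbox_to_docs, the chosen fields and the sample messages are all assumptions made for the example.

```python
import json
import mailbox
import os
import tempfile


def mbox_to_docs(path):
    """Turn each message in an mbox file into a JSON-serializable dict.

    Hypothetical helper for illustration; the field selection is an
    assumption, not the schema produced by Perceval.
    """
    box = mailbox.mbox(path)
    try:
        return [
            {
                "message_id": msg.get("Message-ID"),
                "in_reply_to": msg.get("In-Reply-To"),
                # References is a whitespace-separated list of Message-IDs.
                "references": (msg.get("References") or "").split(),
                "subject": msg.get("Subject"),
                "from": msg.get("From"),
            }
            for msg in box
        ]
    finally:
        box.close()


# Made-up two-message mbox used only to exercise the function.
SAMPLE = (
    "From alice@example.org Fri Apr 14 11:06:00 2017\n"
    "From: alice@example.org\n"
    "Subject: hello\n"
    "Message-ID: <1@example.org>\n"
    "\n"
    "First message.\n"
    "\n"
    "From bob@example.org Fri Apr 14 11:30:00 2017\n"
    "From: bob@example.org\n"
    "Subject: Re: hello\n"
    "Message-ID: <2@example.org>\n"
    "In-Reply-To: <1@example.org>\n"
    "References: <1@example.org>\n"
    "\n"
    "A reply.\n"
)

tmp = tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False)
try:
    tmp.write(SAMPLE)
    tmp.close()
    docs = mbox_to_docs(tmp.name)
finally:
    os.remove(tmp.name)

print(json.dumps(docs, indent=2))
```

Each resulting dict could then be sent to ElasticSearch as one document; that upload step is left out here.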

I am currently working on annotating the threads with their message ids.

Also, below is the timeline of the work I plan to accomplish:-

Month 1 - The work during the first month would be centered on getting extensive information both from mailing lists and git repositories using Perceval, and then storing it in ElasticSearch.
Month 2 - During the second month, scripts would have to be ported to use ElasticSearch data instead of SQL.
Month 3 - The task would be to improve the dashboard in Kibana. Various visualizations like pie charts, bar charts and histograms could be added to help understand the logs better.

Thanks
Vaishnavi



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

