
Re: [Xen-devel] Outreachy project - Xen Code Review Dashboard



Hi,

I submitted an application for this code review dashboard and
would love to keep working on the microtask once I get some
more info. :)

I also came up with a general idea of how the project might be
split up - any feedback on this would be welcome! I wrote:

"As said by Jesus, the big picture of this project will be porting
everything behind the current code review dashboard to use
Grimoire Lab tools, from the current state of using
MetricsGrimoire and custom scripts. I expect this would involve
Perceval for analyzing data, and Grimoire Elk may be useful in
further stages, or may be too general - this is something I would
wish to explore.
This project will also involve a migration from SQL to Elasticsearch
- because I believe the relevant data is mostly / all available in
places online, I am unsure whether this would need to be a direct
migration. However, looking at the current SQL setup would be
beneficial to understanding the desired format of the Elasticsearch
indexes.
I would love to dive into this project and have 3 main parts - getting
data into ES, turning it into dashboard displays, and then fine tuning
and perhaps augmenting the dashboard to improve its usefulness.
Getting data into ES may seem simple but I believe that once it
needs to be used for the dashboard, many realizations will pop up
- thus I’d like to leave maybe 2-3 weeks for that first step, 6-7 weeks
for the visualizations (which will include querying the data), and the
final 3 weeks for touch ups and improvements."

Does this sound like an accurate summary and a reasonable timeline?
And I am guessing from Jesus's involvement in the threads
that Jesus would be the mentor; is that correct? :)

Thanks!

Heather


On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@xxxxxxxxx> wrote:
Hi Jesus,

While using the Elasticsearch Python library to upload
messages to an index, I would get a UnicodeEncodeError:
"'utf-8' codec can't encode character '\udca0' in position 767:
surrogates not allowed".

Investigating in Grimoire elk
https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200
seems to show that perhaps that tool instead uses Latin-1
encoding, but I found that
to then produce a serialization error (their custom error message:
"Unable to serialize %r (type: %s)"). I suppose this is because
now it's bytes; of course, converting back to string after encoding
just cycles back to the first error.

As somewhat of a Python newbie I don't really know how to tackle
this! My thought atm is to splice the offending character out
of the message. 
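
A minimal sketch of that idea in Python, offered as one possible
approach rather than what GrimoireELK does. A code point such as
'\udca0' is a lone surrogate, which Python typically produces when
undecodable bytes are read with the "surrogateescape" error handler,
so it cannot be written out as UTF-8. The helper scrub_surrogates
below is hypothetical; it assumes an item is made of nested dicts,
lists and strings (as parsed JSON would be) and replaces each
unencodable character with '?' before the document is serialized.

```python
def scrub_surrogates(value):
    """Return a copy of value in which every string is safe to encode as UTF-8.

    Lone surrogates (for example '\\udca0') are replaced with '?'.
    Dicts and lists are walked recursively; other types are returned as-is.
    """
    if isinstance(value, str):
        # errors="replace" substitutes '?' for anything UTF-8 cannot encode.
        return value.encode("utf-8", errors="replace").decode("utf-8")
    if isinstance(value, dict):
        return {scrub_surrogates(k): scrub_surrogates(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_surrogates(v) for v in value]
    return value


# Example item shaped loosely like a parsed mbox message (made-up data).
message = {"data": {"Subject": "patch\udca0v2", "body": ["ok", "bad\udca0"]}}
clean = scrub_surrogates(message)
clean["data"]["Subject"].encode("utf-8")  # no longer raises UnicodeEncodeError
```

Replacing the character instead of splicing it out keeps a visible
marker where the original bytes could not be decoded.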

And to clarify, my understanding is that the final result of this task
is an index of Xen data, with two types: commits and messages.
Each commit document should contain its original information
from git, plus the name of the branch it was developed in. And
should only the mbox messages which appear to be associated
with a specific commit exist in the final index? Is there some
key information in messages that is supposed to indicate the
association of a given commit with a git branch? I would be
grateful if you could specify the end goal a little more. :D

Thanks so much!

Heather



On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jgb@xxxxxxxxxxxx> wrote:
On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> Hi Jesus, 
>
> Thanks for your reply!
>
> So about the task, instructions say after analyzing mboxes with Perceval
> to "store the resulting raw index in ElasticSearch" - what does raw
> index mean?

In this context, I mean "storing the JSON documents produced by
Perceval in an ElasticSearch index, as such". ElasticSearch stores JSON
documents, so it is just uploading the output of Perceval to it.

> In terms of figuring out the elasticsearch structure, do I want an index
> (xen-devel mbox) with a type (message) and each object from the perceval
> output to be one document? Or should it be more fine-grained?

Exactly.
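
To illustrate the layout confirmed above (one index, one type, one
document per Perceval item), here is a small Python sketch. The index
name "xen-devel-mbox", the type name "message" and the helper
to_bulk_actions are placeholders chosen for this example, and it
assumes each Perceval item carries a "uuid" field that can serve as
the document id. The actions it yields follow the shape accepted by
the bulk helper of the Elasticsearch Python library
(elasticsearch.helpers.bulk), which is deliberately not called here
so that the sketch runs without a server.

```python
def to_bulk_actions(items, index="xen-devel-mbox", doc_type="message"):
    """Yield one bulk-index action per item, storing the item unchanged.

    index and doc_type are illustrative names, not project conventions.
    """
    for item in items:
        yield {
            "_index": index,
            # Mapping types exist in Elasticsearch releases of this era;
            # later major versions dropped them, so omit "_type" there.
            "_type": doc_type,
            # Assumption: the item has a unique "uuid" field.
            "_id": item["uuid"],
            # The raw JSON document is uploaded as such.
            "_source": item,
        }


# Made-up items standing in for Perceval output.
sample_items = [
    {"uuid": "a1", "data": {"Subject": "[Xen-devel] first message"}},
    {"uuid": "b2", "data": {"Subject": "[Xen-devel] second message"}},
]
actions = list(to_bulk_actions(sample_items))
# With a reachable server, the actions could be passed to
# elasticsearch.helpers.bulk(client, actions).
```

Using the item's own identifier as the document id means that
re-running the upload overwrites documents instead of duplicating them.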

Saludos,

        Jesus.

> Cheers,
>
> Heather
>
> On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona
> <jgb@bitergia.com> wrote:
> > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > Hi!
> > >
> > > I'd love to work on the Code Review Dashboard project for this
> > > round of Outreachy.
> >
> > Great!!
> >
> > > Are the steps outlined
> > > here http://markmail.org/message/7adkmords3imkswd still the first
> > > contribution you'd like to see?
> >
> > Yes.
> >
> > > So is this a project that has been worked on in previous rounds
> > > of GSOC/Outreachy also?
> > > If so is there a place to find links to the previous participants
> > > blogs? :)
> >
> > No. We had one participation at some point, but couldn't even start
> > for personal reasons. There are some people considering working on
> > this for this next round of Outreachy, however. You'll see their
> > messages in this mailing list.
> >
> > > Should questions about the specifications/completion of the
> > > microtask be addressed to IRC or this list? If IRC, which channel -
> > > #xen-opw or #metrics-grimoire? On that note, I'm curious why
> > > #metrics-grimoire is the listed channel on the project page - are
> > > main contributors involved in both projects? Or is it just because
> > > the Xen dashboard doesn't have a channel?
> >
> > The code review is for the Xen project, but it is done with (I mean,
> > the software used for it is) GrimoireLab, which for historical
> > reasons uses the #metrics-grimoire channel. That's why it is likely
> > that you find somebody from the project there.
> >
> > If you have questions, and find me around in IRC, please ping me.
> > If I'm not available, please send an email message.
> >
> > Saludos,
> >
> >         Jesus.
> >
> > > Thanks!
> > >
> > > Heather
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxx
> > > https://lists.xen.org/xen-devel
> > --
> > Bitergia: http://bitergia.com
> > /me at Twitter: https://twitter.com/jgbarah
> >
> >
>




 

