Re: [Xen-devel] Regarding Outreachy project on Improving CR Dashboard
On Tue, 2016-04-05 at 22:05 +0530, Priya wrote:
> Hello all,
>
> I have completed coding the initial task of grouping the email thread
> using the Zawinski algorithms and then adding property entity to the
> json for the messages that belong to the same email thread.
>
> You can see my git repo [1]. The new.json is the output of my script
> and out.json is the output of Perceval.
>
> Also, I have updated the README.md file regarding the execution
> procedures in github.
>
> Instructions
> ============
>
> git clone https://github.com/priya299/Dashboard.git
>
> cd Dashboard
>
> python createjson.py 'Perceval Ouputfile' 'mbox file' 'output_file'
>
> eg: python createjson.py out.json xen-devel-2016-03 new.json
>
> "new.json" json file will be created with each message belong to a
> single thread having an additional attribute "property". The property
> attribute will have message id of the first message in the thread.
>
> Now, I will be pushing the new.json into the elastic search db[2].
> Please give me your valuable feedback about my progress.
>
> [1]:https://github.com/priya299/Dashboard
> [2]:https://www.elastic.co/guide/en/kibana/3.0/import-some-data.html

Hi, Priya.

To begin with, could you please integrate your code with the Perceval
iterator? In other words, you can run Perceval on the mailing list
archive directly from your code, which makes "out.json" unnecessary.
That way, the invocation of the script would be more like:

python createjson.py xen-devel-2016-03 new.json

That is, createjson.py would use Perceval to parse the mailing list
archive. To this end, the Perceval mbox backend is a class which, once
instantiated, provides an iterator function, fetch(), that you can run
inside a loop. On each iteration of the loop, you get the equivalent
of a JSON element in out.json.

The code would be similar to:

-------------------------------
import perceval.backends.mbox

# mbox_url and mbox_file_name are placeholders to be set by the caller
mbox_parser = perceval.backends.mbox.MBox(
    origin=mbox_url,
    dirpath=mbox_file_name
)

# fetch() yields one parsed message per iteration
for item in mbox_parser.fetch():
    thread_id = find_thread(item)
    ...
---------------------------------

Some details about the Perceval mbox class:

http://perceval.readthedocs.org/en/master/perceval.backends.html#module-perceval.backends.mbox

If you have trouble running the Perceval backend as an iterator,
please let me know.

In addition, you can use argparse for reading the arguments in the
command line. It is easy and convenient.

Saludos,

    Jesus.

-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
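
The two suggestions in the message above (argparse for the two-argument
invocation, and a loop that tags each item with its thread) could be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code in the Dashboard repository: the names
parse_args, thread_root and add_thread_property are hypothetical, each
item is assumed to be a dict carrying "Message-ID" and "In-Reply-To"
keys, and the grouping follows In-Reply-To only, which is a much
simplified stand-in for the full Zawinski algorithm. It uses plain
dicts so that it runs without Perceval installed; in the real script
the items would come from the fetch() iterator.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments of the proposed invocation."""
    parser = argparse.ArgumentParser(
        description="Add a thread 'property' to each mailing list message.")
    parser.add_argument("mbox", help="mbox archive, e.g. xen-devel-2016-03")
    parser.add_argument("output", help="JSON file to write, e.g. new.json")
    return parser.parse_args(argv)


def thread_root(item, roots):
    """Return the Message-ID of the first message in the item's thread.

    Simplified: follows In-Reply-To only and assumes a parent is seen
    before its replies. 'roots' maps each seen Message-ID to its root.
    """
    msg_id = item.get("Message-ID")    # assumed key name
    parent = item.get("In-Reply-To")   # assumed key name
    if parent:
        # Inherit the parent's root; fall back to the parent itself
        # when the parent has not been seen.
        root = roots.get(parent, parent)
    else:
        # No parent: this message starts a thread.
        root = msg_id
    roots[msg_id] = root
    return root


def add_thread_property(items):
    """Yield each item with a 'property' key set to its thread root."""
    roots = {}
    for item in items:
        item["property"] = thread_root(item, roots)
        yield item
```

For example, parse_args(["xen-devel-2016-03", "new.json"]) gives an
object whose mbox and output attributes hold the two arguments, and
add_thread_property() can wrap any iterable of message dicts before
the result is written out as JSON.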