Xen project Mailing List

Thank you for the tip about the openSUSE VM and libreoffice-wiki-publisher, but must the .odt file be split?

On Monday, April 3, 2017 8:38 AM, Lars Kurth <lars.kurth.xen@xxxxxxxxx> wrote:

> On 29 Mar 2017, at 05:51, Juergen Gross <jgross@xxxxxxxx> wrote:
>
> On 28/03/17 21:33, Mike Wright wrote:
>> On 03/28/2017 12:11 PM, Mohsen wrote:
>>> I'm using LibreOffice 4.3.3.2 on Debian amd64 and this version does not
>>> have a MediaWiki export function!!
>>
>> There was a deb for the libreoffice extension
>> libreoffice-wiki-publisher. Give that a try.
>
> In the end you need that only once for the initial conversion. So
> instead of trying to find the correct package you could just use
> your Xen skills to create an openSUSE VM and do it there. :-)

Juergen, thanks for the tip. I installed an openSUSE VM and libreoffice-wiki-publisher was installed by default, which is good. So I ran a few tests.

First, what I couldn't get to work: I tried Send > MediaWiki in the hope that it would transfer the images as well, but could not get it to work. There seems to be an issue with authentication on the XenProject wiki side. But File > Export [MediaWiki (.txt)] works.

Also, I couldn't find any docs for the converter: the help links to pages which do not exist. But hey, we can live with that.

Here is what I learned:
=======================
* The export granularity is 1 LibreOffice document to 1 Wiki Page
* Most of the basic formatting, such as lists and tables, gets converted correctly, but the converter introduces an awful lot of <span style="...">...</span> and <div style="...">...</div> elements. Basically it does this every time something slightly out of the ordinary has been done with the text. These may have to be stripped with online tools such as http://www.unit-conversion.info/texttools/strip-tags/ or similar, otherwise the wiki pages become a nightmare to edit in future.
* URLs are correctly converted
* Headings (i.e. text marked as "Heading 1", "Heading 2", etc.) are not converted to = ... =, == ... ==, etc.
* Images are not converted: when an image is found, "[[image:|top]]" is inserted
* I don't know how code snippets will come across in terms of formatting, as I don't have the ODT source of the book
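
As an alternative to pasting each page into an online tool, the tag stripping could be done locally. The following is only a minimal sketch, assuming Python 3 and that the export has already been saved as MediaWiki (.txt). The function names are made up for illustration and are not part of the converter. It removes only <span> and <div> tags (keeping the text inside them) and counts the empty image placeholders, so you know how many images still need attention:

```python
import re

# Matches opening and closing <span ...> and <div ...> tags only;
# all other markup is left alone.
_SPAN_DIV_TAG = re.compile(r"</?(?:span|div)\b[^>]*>", re.IGNORECASE)

# Placeholder the export leaves where an image used to be (as observed above).
EMPTY_IMAGE = "[[image:|top]]"


def strip_span_div(wiki_text: str) -> str:
    """Remove <span> and <div> tags but keep the text they wrap."""
    return _SPAN_DIV_TAG.sub("", wiki_text)


def count_empty_images(wiki_text: str) -> int:
    """Count image placeholders that still need a manually uploaded file."""
    return wiki_text.count(EMPTY_IMAGE)


if __name__ == "__main__":
    sample = (
        '<div style="margin:0">See <span style="color:#000">Figure 1</span>: '
        "[[image:|top]]</div>"
    )
    print(strip_span_div(sample))
    print(count_empty_images(sample))
```

A regular expression is enough here because the tags are being thrown away rather than interpreted. It would not cope with a ">" inside an attribute value, which I have not seen in the exported text but cannot rule out.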

What does this mean:
====================
In principle, this means that the conversion should be doable with 1-2 days' worth of work. However, it's not going to be entirely trivial. What we would need to do is:
* Break the original book ODT file into smaller sections (probably along the chapter structure as exposed in the Contents)
* Then take each of the ODT files and do the following
** Save as MediaWiki (.txt) [1]. If appropriate remove tags using http://www.unit-conversion.info/texttools/strip-tags/
** Save as HTML [2] - to get the images. The bad news is that the images are saved under hash-like file names and not in the order in which they appear in the document
** Create the wiki page from [1]
** Fix up bad formatting (such as missing = ... =, == ... ==, etc.)
** Manually upload the images from [2]
** Add appropriate [[Category:...]] tags at the bottom of each page. At least one for the wiki-book, e.g. [[Category:HelloXenProjectBook]] or something similar. But of course further categories per topic can be added as needed.
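
Two of the steps above (re-creating heading markup and adding the category tags) are mechanical enough to script. This is a rough sketch only, assuming Python 3. The helper names are hypothetical, and deciding which lines are actually headings still has to be done by hand, since the export does not mark them:

```python
from typing import Iterable


def wiki_heading(title: str, level: int) -> str:
    """Return MediaWiki heading markup, e.g. level 2 gives '== title =='."""
    if not 1 <= level <= 6:
        raise ValueError("MediaWiki heading levels run from 1 to 6")
    marks = "=" * level
    return f"{marks} {title.strip()} {marks}"


def add_categories(page_text: str, categories: Iterable[str]) -> str:
    """Append [[Category:...]] tags at the bottom of a page.

    Categories already present in the text are skipped, so running the
    function twice on the same page does not duplicate tags.
    """
    tags = [f"[[Category:{name}]]" for name in categories]
    missing = [tag for tag in tags if tag not in page_text]
    if not missing:
        return page_text
    return page_text.rstrip("\n") + "\n\n" + "\n".join(missing) + "\n"


if __name__ == "__main__":
    page = wiki_heading("Introduction", 2) + "\nSome chapter text.\n"
    # HelloXenProjectBook is the example category name suggested above.
    print(add_categories(page, ["HelloXenProjectBook"]))
```

Creating the wiki pages and uploading the images are deliberately left out, as those would still be done manually through the wiki.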

Once we have all the content, create the common pages such as contents, credits, etc. - and we should have a good starting point.

Best Regards
Lars

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

Re: [Xen-users] [Xen-devel] "Hello Xen Project" Book.