Convert docx to mediawiki and preserve [[Image:]] - converters

I'm trying to convert a .docx file to MediaWiki markup and preserve the proper filenames in the [[Image:]] tags. For some reason, the image filename gets swallowed (i.e., where I'd expect media/image4.jpg, the tag is just empty).
I've tried extracting the docx and looking at docx/word/_rels/document.xml.rels, but I have no idea how to figure out which images are duplicated. I wrote a simple script to do some find/replace, but one file has 130 [[Image:]] tags and only 105 images.
As such, I would like to have the MediaWiki filter output the proper image name when doing this:
soffice --headless --convert-to txt:MediaWiki myfile.docx
I'm on Ubuntu 14.10.
Is this possible?

This doesn't appear to be possible, but I have written a workaround that solves it. In short, I convert the file and handle the uploading and linking of images manually.
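As a rough illustration of how the image names could be recovered, here is a minimal Python sketch, not the workaround mentioned above. It assumes the images are DrawingML pictures (an `a:blip` element with an `r:embed` attribute), that the converter emits its [[Image:]] tags in document order, and that an empty tag looks like `[[Image:` followed directly by `]]` or `|`. The function names are made up for this example. Because the same relationship ID can be referenced several times, a document can have more image tags than image files.

```python
import posixpath
import re
import zipfile
import xml.etree.ElementTree as ET

REL = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"
EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
BLIP = "{http://schemas.openxmlformats.org/drawingml/2006/main}blip"


def image_targets_in_order(docx_path):
    """Return one target (e.g. 'media/image4.jpg') per image reference, in document order."""
    with zipfile.ZipFile(docx_path) as z:
        rels = ET.fromstring(z.read("word/_rels/document.xml.rels"))
        body = ET.fromstring(z.read("word/document.xml"))
    # Map relationship IDs (rId1, rId2, ...) to their targets.
    targets = {rel.get("Id"): rel.get("Target") for rel in rels.iter(REL)}
    # A repeated image shows up as several blips sharing one relationship ID.
    return [targets[b.get(EMBED)] for b in body.iter(BLIP) if b.get(EMBED) in targets]


def fill_empty_image_tags(wiki_text, targets):
    """Fill empty [[Image:]] tags, in order, with the base names of the targets."""
    names = iter(posixpath.basename(t) for t in targets)
    # Assumption: an empty tag is '[[Image:' directly followed by ']' or '|'.
    return re.sub(r"\[\[Image:(?=[|\]])", lambda m: "[[Image:" + next(names, ""), wiki_text)
```

If the number of tags and the length of the returned list differ, the ordering assumption probably does not hold for that file and the result should be checked by hand.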

Related

convert .nl image file

I have a list of hundreds of hyperlinks that point to image files from my supplier. The problem is that they have a .nl file extension. Here's an example:
http://www.netsuite.com/core/media/media.nl?id=66821&c=ACCT120207&h=bad4512e36320e5b2239
I need some sort of batch process to find all those image files and convert them to .png or .jpg links (or batch download all the images and then rename them).
Do you have any suggestions?
As you don't show an excerpt from your list of URLs, nor state your Operating System, it is rather hard to help you process the entire list.
However, for the one URL you show, you can retrieve the image and store it locally as "image.jpg" like this:
curl -L "https://system.netsuite.com/core/media/media.nl?id=101065&c=ACCT120207&h=ff667401c82a7dc4c2e1" > image.jpg
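For the whole list, something along these lines might work. This is only a sketch in Python under several assumptions: the URLs sit one per line in a text file whose path you pass in, the server reports a correct Content-Type header, and the `id` query parameter is unique enough to serve as a file name. The names `local_name`, `download_all` and `EXTENSIONS` are invented for this example.

```python
import urllib.request
from urllib.parse import parse_qs, urlparse

# Assumption: the supplier only serves these image types.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png", "image/gif": ".gif"}


def local_name(url, content_type):
    """Build a file name from the URL's id parameter and the Content-Type header."""
    ids = parse_qs(urlparse(url).query).get("id", ["image"])
    mime = content_type.split(";")[0].strip().lower()
    return ids[0] + EXTENSIONS.get(mime, ".bin")


def download_all(list_path):
    """Download every URL listed in list_path (one per line) into the current directory."""
    with open(list_path) as f:
        urls = [line.strip() for line in f if line.strip()]
    for url in urls:
        # urlopen follows redirects, much like curl -L.
        with urllib.request.urlopen(url) as resp:
            name = local_name(url, resp.headers.get("Content-Type", ""))
            with open(name, "wb") as out:
                out.write(resp.read())
```

Calling `download_all("urls.txt")` would then save each image under its id, with the extension chosen from what the server says the file is rather than from the .nl suffix.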

generated docx with opentbs converted by unoconv and libreoffice

For some reason I am experiencing strange behaviour.
When I merge my docx template with opentbs, it all works fine and the generated docx looks correct.
But now I need to convert the docx into a PDF, for which I am using unoconv and LibreOffice on Mac OS X 10.11.
When I do this, all strings with multiple lines (which are displayed correctly in the docx) are displayed as a single line in the PDF.
Also, if I open the generated docx with LibreOffice, all multi-line strings are displayed as a single line.
I figured out that I can use ;strconv=no.
This does exactly the opposite: all multi-line strings in the docx are displayed as a single line, but in LibreOffice, or when converting to PDF with unoconv, they are displayed correctly on multiple lines.
Does anyone have a solution for this problem?

Bulk Download Background Images

Is there any way to bulk download background images from an image sequence?
Specifically, I'm looking to download the different image sequences from this website: lookbook.reebok.com
I found a solution. In a terminal, run:
curl "http://asdf.com/what/ever/image/img[00-99].gif" -o "img#1.gif"
Change the URL, the number range and the format to suit. If the range goes higher than the images that actually exist, the terminal will create empty files for the missing numbers in the format you requested.
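The same numbered-range idea can be sketched in Python, which makes it easy to skip numbers that do not exist instead of saving empty files. This is an illustration only: the `{}` placeholder in the template, the function names and the `img` output prefix are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def numbered(template, first, last, width):
    """Yield (number, url) pairs, zero-padded like curl's [00-99] range."""
    for i in range(first, last + 1):
        num = str(i).zfill(width)
        yield num, template.format(num)


def download_sequence(template, first, last, width, ext):
    """Fetch each numbered URL and save it as img<number><ext>, skipping failures."""
    for num, url in numbered(template, first, last, width):
        try:
            with urllib.request.urlopen(url) as resp:
                data = resp.read()
        except urllib.error.URLError:
            continue  # this number does not exist on the server
        with open("img" + num + ext, "wb") as out:
            out.write(data)
```

For example, `download_sequence("http://asdf.com/what/ever/image/img{}.gif", 0, 99, 2, ".gif")` would cover the same range as the curl command.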

Pictures and picture names to doc?

I have a Windows folder full of pictures. I want to copy them all into a MS Office document, but with the picture filename written above each picture. Is there an easyish way to do this?
Thanks!
The .docx format is simply a main XML file wrapped in a ZIP archive, together with any ancillary files it needs, such as images. It would be pretty simple to do what you need.
I would start by producing an example document, renaming it to .zip, and examining the files within.
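If you would rather not rename the file, a few lines of Python can list what is inside an example document. This is just a sketch for inspecting the archive, not a tool that builds the final document. The function names are made up, and the `word/media/` prefix is where Word normally stores embedded pictures.

```python
import zipfile


def list_docx_parts(docx_path):
    """Return the names of all files stored inside the .docx archive."""
    with zipfile.ZipFile(docx_path) as z:
        return sorted(z.namelist())


def media_parts(docx_path):
    """Return only the embedded media files (pictures and the like)."""
    return [n for n in list_docx_parts(docx_path) if n.startswith("word/media/")]
```

Comparing the output for a document with one captioned picture against a document with two shows which parts change when a picture is added.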

How to convert a source code text file (e.g. asp php js) to jpg with syntax highlight using command line / bash routine?

I need to create images of the first page of some source code text files, like asp or php or js files for example.
I usually accomplish this by typing a command like
enscript --no-header --pages=1 "${input_file}" -o - | ps2pdf - "${temp_pdf_file}"
convert -quality 100 -density 150x150 -append "${temp_pdf_file}"[0] "${output_file}"
trash "${temp_pdf_file}"
This works nicely for my needs, but it obviously outputs an image "as is", with no "eye-candy" features.
I was wondering if there's a way to add syntax highlighting too.
This might come in handy to speed up the creation of presentations of developed work, for example.
Pygments is a source highlighting library which has PNG, JPEG, GIF and BMP formatters. No intermediate steps:
pygmentize -o jquery.png jquery-1.7.1.js
Edit: adding a source code image to the document means you are doing it wrong to begin with. I would suggest writing the whole document in LaTeX, Markdown or similar, so that the document including the source code can be generated.
Another easy/lazy way would be to create an HTML document using pygmentize and copy-paste it into the document. Not professional, but better than a raster image.
Here's how I do it on my Mac:
I open the file with MacVim, which supports syntax highlighting.
I print the file to a PDF. This gives me a paged document with highlighted syntax.
When I print, Preview opens to display the file. From there I can export it to a JPG, or whatever my heart desires.
I don't have a Mac
This works on Windows too.
You have to get Vim, although Notepad++ may also work. Any programmer's editor will support syntax highlighting and let you print with the highlighting intact, so pick what you like.
You need some sort of PDF-producing print driver, such as CutePDF.
Then convert the PDF to a JPG. I think Adobe Acrobat may be able to export a PDF as a JPG, or maybe the print driver can print to a JPG instead of a PDF. Or you can send it to a friend who has a Mac.
