Thursday, May 26, 2011

Document automata: how to export my *.doc, *.odp, *.odt (and so on) files to other formats (*.pdf included) directly from the Terminal.

Hello there! In this little tutorial I want to introduce a UNIX program called unoconv.

Here is the official description of the program:


Unoconv converts between any document format that OpenOffice understands. It uses OpenOffice's UNO bindings for non-interactive conversion of documents.

Supported document formats include Open Document Format (.odt), MS Word (.doc), MS Office Open/MS OOXML (.xml), Portable Document Format (.pdf), HTML, XHTML, RTF, Docbook (.xml), and more.

And here is the reference of the program from the Linux Man Page: http://linux.die.net/man/1/unoconv

In this example, we are going to convert a bunch of *.ppt, *.doc and *.odt files that sit in the same folder to *.pdf. Imagine the time it would take to open every file and export it to PDF by hand; with unoconv it takes one line of code.

Example files inside the same folder


1. Install the program:
In Debian-based distros (like Ubuntu), type this in the terminal:


$ sudo apt-get install unoconv
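
Before moving on, it can help to confirm that the install worked. The lines below are a minimal sketch, assuming a POSIX shell; `have_command` is a hypothetical helper name used only for this illustration and is not part of unoconv.

```shell
# have_command: hypothetical helper, succeeds if the named command is on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether unoconv can be found after the install step.
if have_command unoconv; then
    echo "unoconv is installed"
else
    echo "unoconv was not found; check the output of the install step"
fi
```

If the second message appears, the package probably did not install correctly and the apt-get output is the first place to look.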


2. Open a Terminal (the one we used to install the program works too) and go to the directory that holds the files we want to convert. (Remember that you can see what is inside a folder from the Terminal by typing the UNIX command "ls".)

3. Once in the directory, we can combine parameters to pick the options that best fit our problem.


In this particular example, I want to export every file inside the folder to *.pdf, using this command:


$ unoconv -f pdf *.*


Example


Explanation of the parameters:

  1. unoconv = name of the program
  2. -f = output format. I chose "pdf" because I want all the files inside the folder (*.doc, *.ppt, etc.) to be exported to PDF.
  3. *.* = every file with any name (the asterisk before the dot) and any extension (the asterisk after the dot). The shell expands this pattern, so every file in the folder whose name contains a dot is passed to unoconv as input.
  4. Finally, we run the command by pressing Enter and see the results inside the folder.
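
Because *.* picks up every file whose name contains a dot, including files that are already PDFs or that unoconv may not be able to read, it can be safer to name the extensions explicitly. The following is a minimal sketch, assuming a POSIX shell; the helper name `list_documents` and the list of extensions are illustrative choices of mine, not part of unoconv.

```shell
# list_documents: hypothetical helper that prints the documents in a folder
# (default: the current folder) matching a fixed set of extensions, one per line.
list_documents() {
    dir="${1:-.}"
    for f in "$dir"/*.doc "$dir"/*.ppt "$dir"/*.odt "$dir"/*.odp; do
        # A pattern that matches nothing stays as literal text, so keep real files only.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Convert each listed document to PDF, but only if unoconv is installed.
if command -v unoconv >/dev/null 2>&1; then
    list_documents . | while IFS= read -r doc; do
        unoconv -f pdf "$doc"
    done
fi
```

Running `list_documents` on its own first shows exactly which files would be converted, which is a cheap check before touching a folder with thousands of documents.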



And the output from the Terminal with the UNIX command "ls":


As you can see, it is very easy to use and very helpful when we need to export thousands of files from an editable format to PDF and, obviously, do not want to do it manually.


This program also helps if you use OpenOffice and want to convert every MS Office file to a format that better suits us, dear Linux users.
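
As a rough sketch of that idea, the lines below pick an OpenDocument output format for each MS Office file and pass it to the same -f parameter used earlier. It assumes a POSIX shell; `target_format` is a hypothetical helper written for this example, and the mapping (doc to odt, ppt to odp, xls to ods) is my own choice of sensible targets.

```shell
# target_format: hypothetical helper that maps an MS Office file name to the
# OpenDocument format name to pass to "unoconv -f".
target_format() {
    case "$1" in
        *.doc) echo odt ;;
        *.ppt) echo odp ;;
        *.xls) echo ods ;;
        *)     return 1 ;;   # not a file type this sketch knows about
    esac
}

# Convert every MS Office file in the current folder, if unoconv is installed.
if command -v unoconv >/dev/null 2>&1; then
    for f in *.doc *.ppt *.xls; do
        # Skip patterns that matched nothing.
        [ -f "$f" ] || continue
        unoconv -f "$(target_format "$f")" "$f"
    done
fi
```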

Hope this short tutorial helps someone!

Benjamin Tovar