Documentation
Docvert
Table of Contents
=================
* Requirements
* Install + Security
Requirements
=============================================================
* PHP 5 or later, with the "ZLIB" and "XSL" extensions.
Optional Libraries
-------------------------------------------------------------
* If you want to convert Microsoft Word documents then
you'll need to choose a Microsoft Word to OpenDocument
converter.
Your options are as follows (in order of preference)
Option 1) "PyODConverter via OOo Server" depends on
* OpenOffice.org 2.3 or later (NOTE version. 2.3)
* python-uno (Python UNO libraries)
* "openoffice.org-headless"
Option 2) "OpenOffice.org Stand-alone" depends on,
* OpenOffice.org 1.9.122 or later (NOTE version. 1.9+)
* Xvfb (on Linux/BSD/OSX)
Option 3)
* Abiword 2.4.2 or later.
* If you want to convert bitmap images like BMP, JPG,
PNG, GIF you'll need PHP GD ("php5-gd").
* If your documents files have diagrams or graphs in WMF
or EMF (windows metafile format) and you want to convert
them to PNG/JPEG/SVG then you'll need to install
"libwmf".
Windows:
http://gnuwin32.sourceforge.net/packages/libwmf.htm
and grab "Complete package, except sources"
Linux:
your distribution's packaging, and find
libwmf0.2-7 or later,
libwmf-bin
http://wvware.sourceforge.net/libwmf.html
* If you want to validate documents against a schema
you'll need
Linux:
Trang from your distro's package manager or
<http://thaiopensource.com/relaxng/trang.html>
Windows:
...I don't know. If you could suggest a good
schema validator with command line support
I'll support it.
* If your documents have SVG images and you want to
convert them to PNG/JPEG/GIF, then you'll need to
install "svgrlib".
Windows:
http://www.gimp.org/~tml/gimp/win32/downloads.html
Linux:
your distribution's packaging, and find
librsvg2-2
librsvg-bin
http://librsvg.sourceforge.net/download/
* If you're using "Document Generation" you'll need PHP-Tidy
("php5-tidy").
Install + Security
=============================================================
Step 1. Host the Docvert software on a web server.
Step 2. Ensure that /writable is writable.
And,
Linux/OSX Users:
Make /etc/docvert and ensure that
/etc/docvert is writable to the web
server user (typically "www-data").
Windows Users:
Ensure that
/core/config/windows-specific/config/
is writable.
Ensure that this directory is not
browsable from the web. Verify that
it's not by browsing to this
directory in your web browser
Eg. ensure you can't browse to (for example)
http://localhost/docvert/core/config/windows-specific/config/
Now secure Docvert's admin page by opening up
Docvert in your web browser and browse to
the admin page (the tab on the top-right).
On the admin page, set the password.
Step 3. Do you want to convert Word Documents or just
OpenDocument files?
(If you just want to convert OpenDocument files then
skip this stage ... advance to Step 4)
Still reading?
Well you'll need to choose a 'Microsoft Word to OpenDocument'
converter. You have the choice of,
1) PyODConverter via OOo Server
2) OpenOffice.org Stand-alone
3) Abiword
Choose now! (I strongly recommend the first one)
1) PyODConverter via OOo Server
-------------------------------
This requires Python, python-uno, and
OpenOffice.org 2.3+ running in server mode
(which means it's listening on a port).
Docvert includes its own highly customised version of
PyODConverter so there's no need to download a copy.
For Linux/OSX users you'll need to allow
your web-server user to start/stop
OOo Server so go to the admin page and under
the "Run as User" heading enter the username
to run OpenOffice.org as.
Then ensure that the admin page's
"Super User Method" is "sudo",
and then you'll need to configure permissions
with the "visudo" command (from the command line)
to add some lines to sudoers,
Add this line,
www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/openoffice.org-server-init.sh
Replace NAMEOFUSER with... the name of the user but
leave the brackets () because sudoers wants them.
And replace www-data with the web server user if it's
something else.
For example, it might look like this,
www-data ALL=(docvert) NOPASSWD: /var/www/docvert/core/config/unix-specific/openoffice.org-server-init.sh
if you wanted to run OOo Server under
the system user "docvert".
2) Standalone OpenOffice.org (DEPRECATED)
-----------------------------------------
Stage i)
Windows:
In "core/config/windows-specific" edit
these files,
"convert-using-openoffice.org.bat"
In "core/config/unix-specific" edit
Linux/OSX:
these files,
"convert-using-openoffice.org.sh"
...to point at the OpenOffice.org executable
(typically "oowriter" or "oowriter2").
For Linux/OSX users you'll need to allow
your web-server user to run this script
so go to the admin page and under the
"Run as User" heading enter the username
to run OpenOffice.org as (by default it
will run as root), then you'll need to
configure permissions with the "visudo"
command to add some lines to sudoers,
Add this line (read below for which
bits to edit)
www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/convert-using-openoffice.org.sh
/var/www/vhosts[...].sh are the paths to your
scripts. In the case of Abiword, point to Abiword
instead.
Replace NAMEOFUSER with.. the name of the user but
leave the brackets () because sudoers wants them.
And replace www-data with the web server user if it's
something else.
Note: As a security measure it's a good idea to
ensure that the webserver user cannot edit this file
and change it to run any command with root privledges.
Do this by removing write permissions for all files
under the "core/config/unix-specific" directory.
Stage ii)
In OpenOffice.org we'll need to tell it to trust Docvert's
macros. We do this by adding "core/config/trusted-macros"
to the list of trusted macros within OpenOffice.org.
Open up Docvert in your web browser and browse to the Admin
page (there's a tab on the top right). If it's complaining
about the "/writable" directory not being writable then
make that directory writable, create an admin password.
Now get on the same computer as the web server.
If you've got your Web Server as a Windows Service
it may be supress OOo from appearing so go to
Windows Services and the properties window for your
web server program. Go to the "Log On" tab, ensuring
that "Allow service to interact with desktop" is
ticked and then restart the service.
(Later, if you want to supress OOo from appearing
during conversions, as you probably do, then untick
that checkbox and then restart the service.)
Now click the "Setup OpenOffice.org" button on the admin page
to start OpenOffice.org.
If there are problems here check Step 2 again, and
follow the instructions on screen which will try
and detect common configuration mistakes. Also try
the troubleshooting.txt guide.
You'll need to be able to start OpenOffice.org before
continuing.
Once you've got OpenOffice.org started go to TOOLS | OPTIONS
| SECURITY and "macro security". Add the directory
"core/config/trusted-macros" to the trusted list.
Exit OpenOffice.org and test that it's worked by clicking
on "Setup OpenOffice.org" again. Ensure there are no dialog
windows that appear when doing this.
If there are still dialog windows and you think you've followed
Stage ii correctly then read the troubleshooting.txt file, or
post on the Docvert mailing list.
3) Abiword
----------
Edit the core/config Abiword file for...
Linux/OSX:
In "core/config/unix-specific" edit
this file,
"convert-using-abiword.sh"
Windows:
In "core/config/windows-specific" edit
this file,
"convert-using-abiword.bat"
For Linux/OSX users you'll need to
configure permissions with the "visudo"
command to add some lines to sudoers,
Add this line
www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/convert-using-abiword.py
Replace NAMEOFUSER with.. the name of the user but
leave the brackets () because sudoers wants them.
And replace www-data with the web server user if it's
something else.
Step 4. That's it.
...well that's enough to convert, but there are additional
features you can set up too such as,
All of these involve editing the appropriate files under
"/core/config" to point at conversion libraries.
Schema Validation
-----------------
And if you want Schema Validation then edit,
Linux/OSX,
validate-against-schema.sh
Windows,
validate-against-schema.bat
...to point to Trang.
For Linux/OSX users you'll need to
configure permissions with the "visudo"
command to add some lines to sudoers,
Add this line
www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/validate-against-schema.sh
Image Conversion
----------------
Edit these...
Linux/OSX,
convert-using-wmf2gd.sh
convert-using-wmf2svg.sh
Windows,
convert-using-wmf2gd.bat
convert-using-wmf2svg.bat