version 6   2 April 2014

Docvert takes word processor files (typically .doc) and converts them to OpenDocument and clean HTML.

The Web Service receives a .doc file and converts it to an OpenDocument (ODF) file, which can then be converted to HTML, DocBook, RSS, or any other XML format.

The resulting OpenDocument file can then optionally be converted to HTML or any XML format. This is done with XML Pipelines, an approach that supports XSLT, splitting content at headings or sections, and saving the results to multiple files (e.g., chapter1.html, chapter2.html…).
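
To illustrate the idea of splitting content at headings and saving each part under its own file name, here is a minimal Python sketch. It is not Docvert's pipeline code: the function name split_by_heading, the choice of h1 as the split point, and the sample markup are all assumptions made for the example, and it uses only the standard library rather than XSLT.

```python
import xml.etree.ElementTree as ET


def split_by_heading(xhtml, heading_tag="h1"):
    """Split the children of <body> into chunks, one per heading.

    Hypothetical helper for illustration only; Docvert's own
    pipelines are configured differently.
    Returns a dict mapping file names to serialized markup.
    """
    root = ET.fromstring(xhtml)
    body = root.find("body")
    if body is None:
        raise ValueError("no <body> element found")

    chunks = []
    for child in body:
        # Start a new chunk at every heading (and for any leading content).
        if child.tag == heading_tag or not chunks:
            chunks.append([])
        chunks[-1].append(child)

    files = {}
    for number, elements in enumerate(chunks, start=1):
        section = ET.Element("body")
        section.extend(elements)
        files["chapter%d.html" % number] = ET.tostring(section, encoding="unicode")
    return files


# Assumed sample input: well-formed markup without namespaces.
SAMPLE = "<html><body><h1>One</h1><p>a</p><h1>Two</h1><p>b</p></body></html>"

if __name__ == "__main__":
    for name, markup in split_by_heading(SAMPLE).items():
        print(name, markup)
```

Each key in the returned dict could be written to disk as a separate file, which mirrors the chapter1.html, chapter2.html naming mentioned above.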

The result is returned in a .zip file.
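
Because the result arrives as a .zip file, client code usually needs to unpack it. The following is a small sketch using Python's standard zipfile module. The helper names read_result_zip and make_demo_zip are hypothetical, and the demo archive is built in memory as a stand-in for a real Docvert response, whose exact file layout is not described here.

```python
import io
import zipfile


def read_result_zip(zip_bytes):
    """Return a dict of {file name: file contents} from zip data in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


def make_demo_zip(files):
    """Build a small zip in memory (stand-in for a real conversion result)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    demo = make_demo_zip({"chapter1.html": b"<p>a</p>", "chapter2.html": b"<p>b</p>"})
    for name, content in read_result_zip(demo).items():
        print(name, len(content), "bytes")
```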

Docvert has a user-friendly interface, and it is easy to integrate with other software because it uses a simple REST-style interface. It is released under the GPL v3, so although it is Free Software, there are no legal problems developing proprietary software on top of the Web Service interface. The XML it produces is easier to understand and more structured than the OOXML or .doc formats.