docsplit

Break Apart Documents into Images, Text, Pages and PDFs


==
         __                      ___ __
    ____/ /___  ______________  / (_) /_
   / __  / __ \/ ___/ ___/ __ \/ / / __/
  / /_/ / /_/ / /__(__  ) /_/ / / / /_
  \____/\____/\___/____/ .___/_/_/\__/
                      /_/

Docsplit is a command-line utility and Ruby library for splitting apart
documents into their component parts: searchable UTF-8 plain text, page
images or thumbnails in any format, PDFs, single pages, and document
metadata (title, author, number of pages…).

Installation:
  gem install docsplit
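
Once the gem is installed, the command-line utility can be driven from a Ruby script. The following is only a minimal sketch: `docsplit_command` is a hypothetical helper, not part of Docsplit. The subcommand names and the `--output` flag are assumptions based on the project documentation linked below, so check them against the version you have installed.

```ruby
# Hypothetical helper (NOT part of Docsplit): builds an argv array for the
# docsplit command-line utility without running it.
# Assumption: these subcommand names match the installed docsplit version.
DOCSPLIT_SUBCOMMANDS = %w[images text pages pdf title author length].freeze

def docsplit_command(subcommand, *files, output: nil)
  name = subcommand.to_s
  unless DOCSPLIT_SUBCOMMANDS.include?(name)
    raise ArgumentError, "unknown docsplit subcommand: #{name}"
  end

  argv = ['docsplit', name]
  # Assumption: --output selects the directory that results are written to.
  argv += ['--output', output] if output
  argv + files
end

# Example (left commented out because it needs docsplit and a real PDF):
#   system(*docsplit_command(:text, 'example.pdf', output: 'extracted'))
```

Passing the command to `system` as an array, rather than as a single string, avoids shell quoting problems with file names that contain spaces.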

For documentation, usage, and examples, see:
https://documentcloud.github.io/docsplit/

To suggest a feature or report a bug:
http://github.com/documentcloud/docsplit/issues/