Top Java Frameworks & Libraries for Document Processing

📸 A well documented, high-level Android interface that makes capturing pictures and videos easy, addressing all of the common issues and needs. Real-time filters,...

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other...

Natural language processing pipeline for book-length documents

CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide complete information about the rendered page suitable for furth...

Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree may then be serialized to an HTML file or further processe...

Analysis plugin for Elasticsearch that processes inline annotations in documents....

Multi-functional, cross-platform, well documented framework cutting boilerplate and speeding up your software development process...

Power File Explorer for Android, built-in images/document preview, media player, PDF+image viewer, text editor, apps, processes, traffic manager, compress/descompr...

Apache NiFi processor that converts EDI ASC X12 and EDIFACT documents into XML

TopicModel4J: A Java Package for Topic Models (contains LDA, collapsed variational Bayesian inference for LDA, author-topic model, BTM, Dirichlet multinomial mixtur...

The Spin-Suite project is an Android library based on the ADempiere business model. It is responsible for: synchronizing, role access, menu display, document actions...

Natural language processing pipeline for finding vital signs in documents.

Real-time processing engine that extracts domain concepts from documents and inserts those concepts into a knowledge graph...

The Java source code for CosmosDB Core Change Feed Processor

Intelligent document capture platform that automatically captures content from documents and makes it available for use in your business processes....

A Java-based library for in-memory processing of financial data stored in native XBRL documents. We have tested this tool on U.S. SEC files....

A lazy-loading DOM implementation for processing huge XML documents

This platform processes XML documents into fully indexed and preserialized Solr documents, which are then made available through a configurable RESTful API...

The bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR). In this model, a text (such as a sentenc...

Document Processing Services

A machine learning approach for processing mathematical language in scientific documents

Support for processing Chinese documents

Apache NiFi processor to convert between XML, JSON, CSV and YAML documents

Content processing framework / document processing pipeline for ETL to a search engine.

Research on Bangla natural language processing. Contains a Bangla stemmer.

Cordova plugin for providing document scanning and processing capability to Android and iOS platforms...

A GATE plugin to simplify batch processing and using various document sources / sinks

Tools for processing documents written in the Inuktut language

Valiant is a Java-based transformation engine that extracts RDF data from XML documents using XSLT processing....

Webservice to process and deliver IRIX Documents. IRIX is the International Radiological Information eXchange format standard developed by the IAEA....

A service and web management interface that handles tagging and organization, long-term storage (e.g. Google Drive), and general management of scanned...

Using the full gamut of the Android framework, trending libraries, Material Design and a variety of Google Play services and Firebase services. Demonstrate whole de...

Computes the co-occurrence rate between a target word and each phrase in a collection of documents. Each text is processed on separate Amazon virtual machines via...

XPath processor for Avro Documents

Documentum is an enterprise content management platform. Documentum provides management capabilities for all types of content. The core of Documentum is a reposito...

NOTICE This repository contains the public FTC SDK for the SKYSTONE (2019-2020) competition season. If you are looking for the current season's FTC SDK software, p...

This repository contains implementation to process private data shares collected according to the Exposure Notification Private Analytics protocol. It assumes priv...

Data Prepper is a data ingestion component that pre-processes documents before indexing them in OpenSearch....

The goal of the current application is to extract Concept-Value pairs for metrics measured during an echocardiogram study. The input is a text document to be proce...

The European Single Procurement Document enables accelerated processing of preliminary evidence in EU public procurement. The ESPD EDM enables applications to inte...