We are excited to announce that GroupDocs.Parser is coming soon to the Java platform as GroupDocs.Parser for Java. It will be an easy-to-use back-end API that lets users extract raw and formatted text from the supported document formats. It will also let users extract metadata from popular document formats. GroupDocs.Parser for Java will soon be available for download.
Salient Features of GroupDocs.Parser for Java
GroupDocs.Parser for Java will come with all the features supported by the GroupDocs.Parser product family. The most notable features of the API include:
- Extracting Text from Documents
- Extracting Formatted Text from Documents
- Extracting Highlights
- Extracting Structured Text from Documents
- Searching Text
- Searching for Whole Words
- Searching Text with a Regular Expression
- Extracting Metadata from Documents
- Working with Containers such as ZIP, OST and Email Containers
- Encoding Detector
- Loggers
- Media Type Detectors
The API will initially support the following document types for text extraction:
- Text Documents
- Spreadsheet Documents
- Presentation Documents
- PDF Documents
- Email Messages
- Markdown Documents
- Electronic Publication Documents
- FictionBook Documents
- Microsoft Compiled HTML Help
- OneNote Documents
First Version Availability
We are finalizing the first release of GroupDocs.Parser for Java and hope you will be able to download it very soon. Please stay tuned for further updates. We would be happy to hear your questions or suggestions at the GroupDocs.Parser forum.