We are pleased to announce the release of the first version of GroupDocs.Parser for Java. GroupDocs.Parser for Java allows Java developers to extract raw and formatted text from popular document formats. The API also supports working with containers such as ZIP archives and email containers. You can also access the metadata attached to documents with a few lines of code. Read on to learn more about the features and file formats supported by the API.
Supported Features
The following are the main features of GroupDocs.Parser for Java.
Extract text from various document formats
Extract main document properties
Extract text and metadata from containers (PST, OST, ZIP containers are currently supported)
Extract text and metadata from mail servers (POP, IMAP and Microsoft Exchange Server are supported)
Extract formatted text (plain text, Markdown, and HTML formatters are available)
Extract structured text
Support for password-protected documents (a password can be supplied when required)
Utility functions such as encoding detection, media type detection, and the ability to attach a logger
Search text in documents
Text analysis API (PDF format is currently supported)
For more details on supported features, please visit the article: Features Overview.
Supported File Formats
The following is the list of file formats supported by GroupDocs.Parser.
Text Document Formats (.doc/.docx/.dot/.rtf/.docm/.odt/.xml/.txt/.md)
Repeated content can diminish the value of an article, so a writer should follow the DRY (don't repeat yourself) principle. Rereading an article again and again to spot repetition costs a lot of time. Counting how often each word occurs helps, but doing so manually is hard: you end up reading the whole article and keeping track of every word. GroupDocs.Parser may help in this case. To illustrate practical needs like this one, we have put together some real-life cases. Please feel free to visit the article: Working with Business Cases.
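As a rough illustration of the word-occurrence idea above, the following sketch counts word frequencies in text that has already been extracted from a document. It uses only the Java standard library and does not call GroupDocs.Parser itself; the class name WordFrequency, the methods countWords and repeatedWords, and the sample sentence are hypothetical names chosen for this example, not part of the API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of GroupDocs.Parser: it works on plain text
// that is assumed to have been extracted from a document beforehand.
public class WordFrequency {

    // Counts how often each word occurs, ignoring case.
    // The result is sorted alphabetically by word.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // Split on every run of characters that is not a letter, digit, or apostrophe.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns the words that occur at least minCount times, most frequent first.
    // Words with equal counts stay in alphabetical order because List.sort is stable.
    public static Map<String, Integer> repeatedWords(String text, int minCount) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(countWords(text).entrySet());
        entries.removeIf(e -> e.getValue() < minCount);
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample sentence standing in for extracted document text.
        String text = "Repetition hurts. Repetition of words, and of ideas, hurts an article.";
        repeatedWords(text, 2).forEach((word, count) ->
                System.out.println(word + ": " + count));
    }
}
```

In a real workflow, the string passed to repeatedWords would be the text extracted from the article, and the threshold would be tuned to the length of the document. Common words such as "of" will dominate the counts, so a stop-word filter is a likely next step.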
Available Channels and Resources
Here are a few channels and resources where you can download, learn about, try, and get technical support for GroupDocs.Parser:
Installation – Install GroupDocs.Parser from Maven