Tag Archives: Text Extractor

Extract Data Fields from the Documents using GroupDocs.Parser Product Family

Hello everyone! I am back with something new and exciting for developers who deal with automated data extraction from documents. A few years back, we released the GroupDocs.Parser API, which aimed to extract text from various document formats. We kept adding features to it, and today it has become a large API that provides a wide range of features including formatted text extraction, highlighted and structured text extraction, metadata extraction, extraction of images … Continue Reading

Posted in GroupDocs.Parser Product Family

Upcoming Release of GroupDocs.Parser for Java

We are excited to announce that GroupDocs.Parser is coming soon to the Java platform as GroupDocs.Parser for Java. It will be an easy-to-use back-end API that lets users extract raw and formatted text from the supported document formats. It will also allow users to extract metadata from popular document formats. GroupDocs.Parser for Java will soon be available for download.

Salient Features of GroupDocs.Parser for Java

GroupDocs.Parser for Java will come with all… Continue Reading
Posted in GroupDocs.Parser Product Family

GroupDocs.Text for .NET has been Renamed to GroupDocs.Parser for .NET

We are pleased to announce that GroupDocs.Text for .NET has been renamed to GroupDocs.Parser for .NET. We have published the first monthly release under the new name as GroupDocs.Parser for .NET 18.5. The latest release comes with a few changes and a couple of enhancements. Please read on to learn about the changes and enhancements made in version 18.5.

Important to Know

It is important to inform you that the renaming of the API… Continue Reading
Posted in GroupDocs.Parser Product Family

Extract TOC from EPUB Documents using GroupDocs.Text for .NET 18.4

It gives us immense pleasure to announce the release of version 18.4 of GroupDocs.Text for .NET. The latest version allows extracting the table of contents from EPUB documents. Furthermore, we have added the ability to detect the media type of .one files. The following sections provide details about the newly added features.

Extracting TOC from EPUB Documents

Using version 18.4, you can now extract the TOC from EPUB documents. To access the TOC, the TableOfContents property of the EpubPackage class is used. Once… Continue Reading
Posted in GroupDocs.Parser Product Family

Extract Formatted Text from CHM Documents using GroupDocs.Text for .NET 18.3

We keep looking for ways to bring you more features, and therefore we have released version 18.3 of GroupDocs.Text for .NET, which adds support for extracting formatted text from CHM documents. The latest version also allows you to extract text by pages and extract the table of contents from CHM documents. The following sections provide details about the new features of the API.

Extracting Formatted Text from CHM Documents

GroupDocs.Text provides a couple of ways to extract formatted text… Continue Reading
Posted in GroupDocs.Parser Product Family

Extract Emails using POP3 and IMAP Protocols – GroupDocs.Text for .NET 17.10

We are pleased to announce the release of version 17.10 of the GroupDocs.Text for .NET API. The latest release removes some obsolete methods from the API while providing some additional properties, updated parameters, and improved performance. The new feature added in this release is the ability to extract text from an email server using the POP3 and IMAP protocols. Just download the latest release or update your existing application to this release and enjoy… Continue Reading
Posted in GroupDocs.Parser Product Family