Category Archive: GroupDocs.Parser Product Family

Official blog with announcements of latest supported features, hot fixes, technical articles, tips and videos of GroupDocs.Text – A text extraction API for .NET.

Extract Data from Database Files using C#

Posted on November 8, 2019 by Usman Aziz

Databases are an integral part of most applications. Whether it is a desktop, web, or mobile application, the database plays a vital role in storing, accessing, and manipulating data. Many database management systems are available for creating and managing databases.

However, there may be scenarios where you need to extract data from database files, such as a .db file, without installing a database management system or writing SQL queries. How … Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged Extract data from database, GroupDocs.Parser for .NET, SQLite DB

Extract Data from Invoices or Receipts in C#

Posted on October 24, 2019 by Usman Aziz

Invoices and receipts are documents that record transactions in a particular format when goods or services are bought or sold. With the popularity of online shopping, digital invoices are now widely used. Processing a large number of digital invoices and extracting the information manually is complex and time-consuming, so you need a faster and more efficient approach. So in this article, … Continue Reading

Posted in GroupDocs.Parser Product Family

Count Words and Occurrences of Each Word in a Document using C#

Posted on October 16, 2019 by Usman Aziz

Repetition can diminish the worth of content. As a writer, you should follow the DRY (don’t repeat yourself) principle. Statistics such as the word count or the number of occurrences of each word let you analyze the content, but they are hard to gather manually for multiple documents. So in this article, I’ll demonstrate how to programmatically count words and the number of occurrences of each word in PDF, Word, Excel, PowerPoint, … Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged GroupDocs.Parser for .NET

Extract Images from PDF Documents using C#

Posted on October 4, 2019 by Usman Aziz

Portable Document Format (PDF) is a popular and widely used document format developed by Adobe. PDF documents can contain a variety of content, including formatted text, images, annotations, and form fields. Parsing PDF documents programmatically is a popular use case, and there are multiple ways of extracting the text. However, extracting images from a PDF document is a complex task. This article demonstrates how easily you can extract images from PDF documents programmatically in C# using GroupDocs.Parser for … Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged extract images, extract images from PDF, extract images from PDF in csharp, extract images in csharp, parse PDF and extract images

Introducing API v2 of GroupDocs.Parser for .NET

Posted on October 1, 2019 by Usman Aziz

The all-new API v2 of GroupDocs.Parser for .NET has been released! This is big news both for those who already use our document parsing API and for those who are looking for an easy-to-use solution for extracting text, images, and metadata from PDF, word processing documents, spreadsheets, presentations, emails, EPUB, and ZIP file formats.

What’s new in the API v2?

We have made some major updates at … Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged GroupDocs.Parser for .NET Releases

Extract Data Fields from the Documents using GroupDocs.Parser Product Family

Posted on June 27, 2019 by Usman Aziz

Hello everyone! I am back with something new and exciting for developers who deal with automated data extraction from documents. A few years back, we released the GroupDocs.Parser API, which aimed to extract text from various document formats. We kept adding features to it, and today it has become a comprehensive API that provides a wide range of features, including formatted text extraction, highlighted and structured text extraction, metadata extraction, extraction of images … Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged data extraction, Document Parser, document parsing API, extract data fields, extract data tables, GroupDocs.Parser for .NET Releases, GroupDocs.Parser for Java Releases, text extraction API, Text Extractor

Extract Tables from PDF Documents using GroupDocs.Parser for .NET 18.12

Posted on December 20, 2018 by Usman Aziz

It is our pleasure to announce the release of version 18.12 of GroupDocs.Parser for .NET. The latest version allows you to extract tables from PDF documents. Furthermore, we have added support for extracting text and metadata from text and presentation templates. For more details, please have a look at the release notes of version 18.12.

Features Introduced

Extracting Tables from PDF Documents

This feature is very useful when you want to extract only the tables from a… Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged .NET Text Extractor, Document Parser, GroupDocs.Parser for .NET, GroupDocs.Parser for .NET Releases, metadata extractor, text analysis API

Support for Text and Presentation Templates in GroupDocs.Parser for Java 18.12

Posted on December 20, 2018 by Usman Aziz

We are delighted to announce the release of GroupDocs.Parser for Java 18.12. The latest version allows you to extract tables from PDF documents. Furthermore, we have added support for extracting text and metadata from text and presentation templates. For more details, please have a look at the release notes of version 18.12.

Features Introduced

Extracting Tables from PDF Documents

This feature is very useful when you want to extract only the tables from a PDF document. For extracting… Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged Document Parser, document parsing API, document text extraction, GroupDocs.Parser for Java, GroupDocs.Parser for Java Releases, Java Text Extractor, text analysis API

Improved Text Area Extraction for PDF Documents in GroupDocs.Parser for Java 18.11

Posted on November 19, 2018 by Usman Aziz

We are delighted to announce the release of GroupDocs.Parser for Java 18.11. The latest version comes with one new feature and three enhancements. It allows you to get information about the supported extractors for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11.

Features Introduced

Getting Information of Supported Extractors for a Document

This feature helps to get the information… Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged Document Parser, document text extraction, document text parser, GroupDocs.Parser for Java Releases, Java API, Java Text Extractor, text extraction API, text extractor API for Java

Get Information of Supported Extractors for a Document using GroupDocs.Parser for .NET 18.11

Posted on November 19, 2018 by Usman Aziz

We are pleased to announce the release of version 18.11 of GroupDocs.Parser for .NET. The latest version comes with one new feature and three enhancements. It allows you to get information about the supported extractors for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11.

Features Introduced

Getting Information of Supported Extractors for a Document

This feature helps to… Continue Reading

Posted in GroupDocs.Parser Product Family | Tagged .NET Text Extractor, Document Parser, document parsing API, document text extraction, GroupDocs.Parser for .NET, GroupDocs.Parser for .NET Releases, text parser
