Search Synonyms in Multiple Files using C#

In another article, we saw what synonyms are and how to get all the synonyms of any word. What about finding these synonyms within different documents? This article shows how to search for the synonyms of a specific query (word) in multiple files using C#.

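This article covers the following topics:

  • .NET API for Searching Synonyms in Multiple Files
  • Find Synonyms in Multiple Files using C#
  • Search Synonyms and Printing Results using C# – Complete Code
  • Conclusion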

.NET API for Searching Synonyms in Multiple Files

GroupDocs.Search provides a .NET API that lets you search for a word and its synonyms across multiple files in a specified folder. I will use this API in the examples in this article. It supports searching across a wide range of document formats. Besides synonym search, GroupDocs.Search for .NET also supports other search techniques, including:

  • Fuzzy Search
  • Case-Sensitive Search
  • Homophone Search
  • Regular Expressions Search
  • Wildcard Search

You can download the DLLs or MSI installer from the downloads section or install the API in your .NET application via NuGet.

PM> Install-Package GroupDocs.Search

Find Synonyms in Multiple Files using C#

The following steps show how to search for synonyms (words with similar meanings) in the files within a folder using C#.

  • Define the search query, the index folder, and the documents folder.
  • Create an index in the defined index folder using the Index class.
  • Add the documents folder to the index.
  • Create SearchOptions and set UseSynonymSearch to true.
  • Call the Search method of the Index class, passing the query and the search options.
  • To print the summary, use the properties of the retrieved SearchResult.

The following source code shows how to find all the synonyms within all the files of a folder using C#.
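A minimal sketch of the steps above might look like this. It assumes the GroupDocs.Search NuGet package is installed and uses C# top-level statements. The folder paths are placeholders, and the `DocumentCount` and `OccurrenceCount` properties are not named in this article; they reflect my understanding of the API, so verify them against the API reference for your version.

```csharp
using System;
using GroupDocs.Search.Options;
using GroupDocs.Search.Results;

// Placeholder values: adjust the query and both paths for your environment.
string query = "make";
string indexFolder = @"path/indexFolder";
string documentsFolder = @"path/documentsFolder";

// Create the index in the index folder. The type name is fully qualified
// to avoid a clash with System.Index on newer .NET versions.
var index = new GroupDocs.Search.Index(indexFolder);

// Add every supported document in the documents folder to the index.
index.Add(documentsFolder);

// Enable synonym search.
var options = new SearchOptions();
options.UseSynonymSearch = true;

// Search for the query together with its synonyms.
SearchResult result = index.Search(query, options);

// Print the summary (property names assumed, see the note above).
Console.WriteLine("Query: " + query);
Console.WriteLine("Documents: " + result.DocumentCount);
Console.WriteLine("Occurrences: " + result.OccurrenceCount);
```

For the query "make" and two sample documents, the summary looks like the following.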

Query: make
Documents: 2
Occurrences: 22

The following steps print the results in detail: every synonym that was found and its number of occurrences in each document.

  • Traverse the search results retrieved by the code above.
  • Get each FoundDocument using the GetFoundDocument method.
  • Print the respective properties of each FoundDocument.
  • Traverse the FoundFields within each FoundDocument to get each FoundDocumentField.
  • From each FoundDocumentField, get its terms and their occurrence counts within the document.

The following source code prints the synonym search results along with the number of occurrences of each searched term using C#.
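The traversal described in the steps above can be sketched as a small helper. The class name `SynonymResultPrinter` and the method name `PrintDetails` are hypothetical. The members `DocumentInfo.FilePath`, `FieldName`, `OccurrenceCount`, `Terms`, and `TermsOccurrences` are not named in this article and are assumptions based on my understanding of the GroupDocs.Search API, so check them against the API reference.

```csharp
using System;
using GroupDocs.Search.Results;

// Hypothetical helper that prints a synonym search result in detail.
public static class SynonymResultPrinter
{
    public static void PrintDetails(SearchResult result)
    {
        // Traverse the documents that contain the query or one of its synonyms.
        for (int i = 0; i < result.DocumentCount; i++)
        {
            FoundDocument document = result.GetFoundDocument(i);
            Console.WriteLine("Document: " + document.DocumentInfo.FilePath);
            Console.WriteLine("Occurrences: " + document.OccurrenceCount);

            // Traverse the fields of the document in which terms were found.
            foreach (FoundDocumentField field in document.FoundFields)
            {
                Console.WriteLine("    Field: " + field.FieldName);
                Console.WriteLine("    Occurrences: " + field.OccurrenceCount);

                // A field may carry no single-term matches; skip it in that case.
                if (field.Terms == null) continue;

                // Terms and TermsOccurrences are assumed to be parallel arrays.
                for (int k = 0; k < field.Terms.Length; k++)
                {
                    Console.WriteLine("        " + field.Terms[k].PadRight(20) + field.TermsOccurrences[k]);
                }
            }
        }
    }
}
```

Calling `SynonymResultPrinter.PrintDetails(result)` after printing the summary gives output in the shape shown below.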

Query: make
Documents: 2
Total occurrences: 22

Document: C:/documents/sample.docx
Occurrences: 6
    Field: content
    Occurrences: 6
        make                1
        get                 2
        cause               1
        do                  2
Document: C:/documents/sample.txt
Occurrences: 16
    Field: content
    Occurrences: 16
        get                 4
        cause               1
        do                  11

Search Synonyms and Printing Results using C# – Complete Code

Here is the complete source code that first finds all the synonyms of the provided query, and then prints the occurrences of each synonym in every document of the indexed folder using C#.
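One way to put both parts together is the console program sketched below. It is not an official listing. The class and method names (`Program`, `Main`) and the folder paths are placeholders, and the result properties used (`DocumentCount`, `OccurrenceCount`, `DocumentInfo.FilePath`, `FieldName`, `Terms`, `TermsOccurrences`) are assumptions to verify against the GroupDocs.Search API reference.

```csharp
using System;
using GroupDocs.Search.Options;
using GroupDocs.Search.Results;

public static class Program
{
    public static void Main()
    {
        // Placeholder values: adjust for your environment.
        string query = "make";
        string indexFolder = @"path/indexFolder";
        string documentsFolder = @"path/documentsFolder";

        // Build the index (fully qualified to avoid a clash with System.Index).
        var index = new GroupDocs.Search.Index(indexFolder);
        index.Add(documentsFolder);

        // Search with synonym search enabled.
        var options = new SearchOptions { UseSynonymSearch = true };
        SearchResult result = index.Search(query, options);

        // Summary.
        Console.WriteLine("Query: " + query);
        Console.WriteLine("Documents: " + result.DocumentCount);
        Console.WriteLine("Total occurrences: " + result.OccurrenceCount);
        Console.WriteLine();

        // Details per document, per field, and per found term.
        for (int i = 0; i < result.DocumentCount; i++)
        {
            FoundDocument document = result.GetFoundDocument(i);
            Console.WriteLine("Document: " + document.DocumentInfo.FilePath);
            Console.WriteLine("Occurrences: " + document.OccurrenceCount);

            foreach (FoundDocumentField field in document.FoundFields)
            {
                Console.WriteLine("    Field: " + field.FieldName);
                Console.WriteLine("    Occurrences: " + field.OccurrenceCount);

                if (field.Terms == null) continue;
                for (int k = 0; k < field.Terms.Length; k++)
                {
                    Console.WriteLine("        " + field.Terms[k].PadRight(20) + field.TermsOccurrences[k]);
                }
            }
        }
    }
}
```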

Conclusion

To conclude, you have learned how to find specific words and their synonyms in multiple documents within a specified folder using C#. You can now try developing your own .NET application for searching any word and its synonyms within multiple files.

Learn more about the .NET Search Automation API from the documentation. To experience the features, have a look at the examples in the GitHub repository. For any queries, reach us via the forum.
