Search Synonyms in Multiple Files using Java

We recently discussed how to get all the synonyms of any word. It would be even more useful to locate those synonyms across many different documents. In this article, we will see how to search for a word and its synonyms in multiple files using Java.

Java API – Search Synonyms in Multiple Files

GroupDocs.Search provides a Java API (GroupDocs.Search for Java) that searches for words and their synonyms across the files of a specified folder. It supports a long list of file formats and several search techniques. Some of these features are listed below, and you can combine them to suit your needs:

  • Boolean Search
  • Case-Sensitive Search
  • Highlight Search Results
  • Homophone Search
  • Phrase Search
  • Regular Expressions Search
  • Search by Chunks
  • Synonym Search

Download or Configure

You can download the JAR file from the downloads section, or add the following repository and dependency configuration to the pom.xml of your Maven-based Java application.

<repository>
    <id>GroupDocsJavaAPI</id>
    <name>GroupDocs Java API</name>
    <url>http://repository.groupdocs.com/repo/</url>
</repository>
<dependency>
    <groupId>com.groupdocs</groupId>
    <artifactId>groupdocs-search</artifactId>
    <version>21.8</version>
</dependency>

Find Synonyms in Multiple Files using Java

Let’s move on to searching for synonyms within files. The following steps show how to search for a word and its synonyms (words with similar meanings) in the files of a folder using Java:

  • Define the index folder, the documents folder, and the query (the word to search for).
  • Create an index in the defined index folder using the Index class.
  • Add the documents folder to the index.
  • Enable synonym search using SearchOptions.
  • Call the search method of the Index class, passing the query and the search options.
  • Print the summary using the properties of the retrieved SearchResult.
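
To make the flow above concrete without relying on the library, here is a minimal plain-Java sketch of the same idea. It does not use GroupDocs.Search: there is no index, it reads only the .txt files directly inside a folder, and it matches whole words only. The class name SynonymSummarySketch, its methods, and the one-entry synonym table are all hypothetical and exist purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; none of this is the GroupDocs.Search API.
final class SynonymSummarySketch {

    // Hypothetical synonym table; a real search engine ships its own dictionary.
    static final Map<String, List<String>> SYNONYMS =
            Map.of("make", List.of("do", "get", "have"));

    // Expands the query into the terms to match: the word itself plus its synonyms.
    static Set<String> expand(String query) {
        String word = query.toLowerCase(Locale.ROOT);
        Set<String> terms = new LinkedHashSet<>();
        terms.add(word);
        terms.addAll(SYNONYMS.getOrDefault(word, List.of()));
        return terms;
    }

    // Counts the whole-word tokens of the text that appear in the term set (case-insensitive).
    static int countOccurrences(String text, Set<String> terms) {
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (terms.contains(token)) {
                count++;
            }
        }
        return count;
    }

    // Scans the *.txt files directly inside the folder.
    // Returns {number of documents with at least one hit, total occurrences}.
    static int[] summarize(Path folder, String query) {
        Set<String> terms = expand(query);
        int documents = 0;
        int occurrences = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder, "*.txt")) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                int hits = countOccurrences(text, terms);
                if (hits > 0) {
                    documents++;
                    occurrences += hits;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {documents, occurrences};
    }

    public static void main(String[] args) {
        // Folder and query come from the command line; the defaults are placeholders.
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        String query = args.length > 1 ? args[1] : "make";
        int[] summary = summarize(folder, query);
        System.out.println("Query: " + query);
        System.out.println("Documents: " + summary[0]);
        System.out.println("Word & Synonym Occurrences: " + summary[1]);
    }
}
```

The sketch scans every file on each call, whereas the steps above build an index once and query it afterwards. The outputs shown in this article come from the GroupDocs.Search API, not from this sketch.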

The following source code shows how to find all the synonyms within files using Java:

The following is the output of the above code:

Query: make
Documents: 3
Word & Synonym Occurrences: 44 

From the search results obtained in the above step, you can get detailed information about each word and synonym that was found. The following steps print, for each document, every matched term and its number of occurrences:

  • First, perform the search to get the SearchResult.
  • Traverse the search result to work with each FoundDocument.
  • Print the respective properties of each FoundDocument.
  • Then, extract and traverse the FoundDocumentField objects within each FoundDocument.
  • Each FoundDocumentField holds its terms, their occurrence counts, and other properties. Use the respective getters.
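
The nested traversal above (result, then document, then term) can be illustrated with a small plain-Java sketch. It is not the GroupDocs.Search API: the class SynonymBreakdownSketch, its methods, the hard-coded term list, and the in-memory sample texts are hypothetical stand-ins. A nested map plays the role that FoundDocument and FoundDocumentField play in the steps, and there is only one implicit field, the document text.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java illustration only; none of this is the GroupDocs.Search API.
final class SynonymBreakdownSketch {

    // Hypothetical expansion of the query "make": the word followed by its synonyms (lowercase).
    static final List<String> TERMS = List.of("make", "have", "get", "do");

    // Counts whole-word occurrences of each term in one text; terms without hits are omitted.
    static Map<String, Integer> countPerTerm(String text, List<String> terms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : terms) {
            counts.put(term, 0);
        }
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            counts.computeIfPresent(token, (term, n) -> n + 1);
        }
        counts.values().removeIf(n -> n == 0);
        return counts;
    }

    // Builds document name -> (term -> occurrences), skipping documents without any hit.
    static Map<String, Map<String, Integer>> search(Map<String, String> documents, List<String> terms) {
        Map<String, Map<String, Integer>> found = new LinkedHashMap<>();
        for (Map.Entry<String, String> document : documents.entrySet()) {
            Map<String, Integer> counts = countPerTerm(document.getValue(), terms);
            if (!counts.isEmpty()) {
                found.put(document.getKey(), counts);
            }
        }
        return found;
    }

    // Sums the occurrences of all terms found in one document.
    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int n : counts.values()) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        // In-memory sample texts stand in for text extracted from real files.
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("sample.txt", "Make it, do it, then do it again.");
        documents.put("notes.txt", "We get what we have.");
        documents.put("other.txt", "No match in this one.");

        Map<String, Map<String, Integer>> found = search(documents, TERMS);
        System.out.println("Documents: " + found.size());
        for (Map.Entry<String, Map<String, Integer>> document : found.entrySet()) {
            System.out.println("Document: " + document.getKey());
            System.out.println("Occurrences: " + total(document.getValue()));
            document.getValue().forEach((term, n) ->
                    System.out.println("        " + term + "  -  " + n));
        }
    }
}
```

Keeping the per-term counts in insertion order means each document lists its terms in the same order as the expanded query, which makes results easy to compare across documents.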

The following source code displays the result of the synonym search along with the number of occurrences of each searched term in Java.

The following is the output of the above code:

Query: make
Documents: 3
Total occurrences: 44

Document: C:/documents/sample.docx
Occurrences: 13
    Field: content
    Occurrences: 13
        make  -  2
        have  -  1
        get  -  2
        do  -  8
- - - - - - - - - - - - - - - - 
Document: C:/documents/sample.txt
Occurrences: 11
    Field: content
    Occurrences: 11
        make  -  1
        have  -  2
        get  -  1
        do  -  7
- - - - - - - - - - - - - - - - 
Document: C:/documents/sample.pdf
Occurrences: 20
    Field: content
    Occurrences: 20
        make  -  2
        have  -  2
        get  -  2
        do  -  14 

Search Synonyms and Print Results in Java – Complete Code

Let’s combine the above two steps. The complete source code first finds all the synonyms for the provided query, and then prints the occurrences of every synonym in each document.

Get a Free API License

You can get a free temporary license to use the API without the evaluation limitations.

Conclusion

To summarize, we discussed how to search for a word along with its synonyms in multiple documents using Java. You can now try developing your own Java search application, similar to the GroupDocs.Search App.

Learn more about the Java Search Automation API from the documentation. To experience the features, try the examples from the GitHub repository. Feel free to reach out to us with any questions via the forum.

See Also