Compare Text, Word, and PDF Files with Java Difference Library

After going through this article, we will be able to compare text files, Word files, PDF files, and other documents in Java-based applications. Using this feature, we can compare invoices, contracts, presentations, AutoCAD designs, price lists, or programming files. We will also be able to highlight the identified changes and either accept or reject each one. We can even build our own document comparison tool, similar to the one launched by GroupDocs, using the document comparison API for Java.

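Below, you will go through the following topics:

  • Java Document Comparison API
  • Compare Word Files and Show Differences using Java
  • Compare Word Files for Text using Stream
  • Accept or Reject the Compared Changes in Word File using Java
  • Compare Text Files and Show Differences using Java
  • Compare PDF Files for Text Difference using Java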

Java Document Comparison API

As a prerequisite, you can get GroupDocs.Comparison for Java from the downloads section. Alternatively, for Maven-based applications, just add the following to your pom.xml:

Repository

<repository>
	<id>GroupDocsJavaAPI</id>
	<name>GroupDocs Java API</name>
	<url>http://repository.groupdocs.com/repo/</url>
</repository>

Dependency

<dependency>
	<groupId>com.groupdocs</groupId>
	<artifactId>groupdocs-comparison</artifactId>
	<version>20.4</version>
</dependency>

Compare Word Files and Show Differences using Java

The steps below show how to compare any two Word documents in just a few lines of Java code. The result is a document that highlights the identified changes.

  • Initialize the Comparer object with the source document path.
  • Add the second document to compare using the add method.
  • Call the compare method to get the result of the comparison. The compare method takes the name of the output document as a parameter.

// Compare two Word files from the provided location on disk
Comparer comparer = new Comparer("source.docx");
try {
    comparer.add("target.docx");
    comparer.compare("comparison.docx");
}
finally {
    comparer.dispose();
}

The Word document generated by the above code is shown below, with the differences between the two compared documents highlighted. Deleted content is marked in red, added content in blue, and modified content in green.

Word file text comparison showing differences

Compare Word Files for Text using Stream

You can similarly pass the document as a stream to the Comparer class to get it compared with the second document. Here is the Java code to give you a clear idea:

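// Compare two Word files using streams (classes from java.io).
// try-with-resources closes all three streams once the comparison is done.
try (InputStream source = new FileInputStream("source.docx");
     InputStream target = new FileInputStream("target.docx");
     OutputStream result = new FileOutputStream("result.docx")) {
    Comparer comparer = new Comparer(source);
    try {
        comparer.add(target);
        comparer.compare(result);
    }
    finally {
        comparer.dispose();
    }
}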

Accept or Reject the Compared Changes in Word File using Java

After the differences have been identified, you have the option to either accept or reject each change. As an example, I am accepting and rejecting the changes alternately. With similar code, you may display the changes one by one and decide whether to accept or reject each according to your requirements.

// Accept or reject the identified changes of a Word document in Java.
// source, target, and outputFileName are defined elsewhere (for example, as file paths).
Comparer comparer = new Comparer(source);
try {
    comparer.add(target);
    comparer.compare();
    ChangeInfo[] changes = comparer.getChanges();
    System.out.println("changes.length: " + changes.length + ".");
    // Accept or Reject the changes
    for (int n = 0; n < changes.length; n++) {
        if (n % 2 == 0) {
            changes[n].setComparisonAction(ComparisonAction.ACCEPT);
        }
        else {
            changes[n].setComparisonAction(ComparisonAction.REJECT);
        }
    }
    // Apply your decisions to get the resultant document.
    comparer.applyChanges(outputFileName, new SaveOptions(), new ApplyChangeOptions(changes));
}
finally {
    comparer.dispose();
}
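
To make the effect of these decisions concrete, here is a minimal standard-library sketch of what accepting or rejecting a change typically means for the final text. It assumes the usual track-changes semantics: an accepted insertion stays, a rejected insertion is dropped, an accepted deletion is dropped, and a rejected deletion is restored. The Segment, Kind, and Decision names are hypothetical and are not part of the GroupDocs API; this is an illustration of the idea, not how the library is implemented.

```java
import java.util.List;

class AcceptRejectSketch {

    enum Kind { UNCHANGED, INSERTED, DELETED }
    enum Decision { ACCEPT, REJECT }

    /** One piece of compared text. Hypothetical model, not a GroupDocs type. */
    static final class Segment {
        final Kind kind;
        final String text;
        Decision decision = Decision.ACCEPT; // default until a reviewer decides

        Segment(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Builds the final text by applying each segment's decision. */
    static String apply(List<Segment> segments) {
        StringBuilder result = new StringBuilder();
        for (Segment s : segments) {
            boolean keep;
            switch (s.kind) {
                case INSERTED:
                    keep = s.decision == Decision.ACCEPT; // accepted insertion stays
                    break;
                case DELETED:
                    keep = s.decision == Decision.REJECT; // rejected deletion is restored
                    break;
                default:
                    keep = true; // unchanged text always stays
            }
            if (keep) {
                result.append(s.text);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Segment deleted = new Segment(Kind.DELETED, "old ");
        Segment inserted = new Segment(Kind.INSERTED, "new ");
        List<Segment> segments = List.of(
                new Segment(Kind.UNCHANGED, "The "), deleted, inserted,
                new Segment(Kind.UNCHANGED, "price"));

        System.out.println(apply(segments)); // prints "The new price"

        deleted.decision = Decision.REJECT;
        inserted.decision = Decision.REJECT;
        System.out.println(apply(segments)); // prints "The old price"
    }
}
```

Accepting every change yields the target wording and rejecting every change yields the source wording; mixed decisions produce something in between, which is why reviewing each change individually can be worthwhile.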

Compare Text Files and Show Differences using Java

Using the Comparer class, we can also compare text files. Below is similar code for comparing two text files in Java. The steps are exactly the same as for comparing any other two documents:

  • Start with passing the text file to the Comparer class.
  • Add the second file using the add method.
  • Call the compare method.

// Compare two text files to identify and highlight changes.
Comparer comparer = new Comparer("source.txt");
try {
    comparer.add("target.txt");
    comparer.compare("comparison.txt");
}
finally {
    comparer.dispose();
}

Here is the output document that shows the result of comparing the two text files using the above code.

Compare Text Files using Java
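
For intuition about what a text comparison reports, the following is a minimal line-by-line diff written with the Java standard library only. It uses the classic longest-common-subsequence technique, which is an assumption chosen for illustration; the article does not say how GroupDocs.Comparison detects changes internally, and the LineDiffSketch class and its diff method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class LineDiffSketch {

    /** Returns one entry per line: "  " unchanged, "- " deleted from source, "+ " added in target. */
    static List<String> diff(List<String> source, List<String> target) {
        int n = source.size();
        int m = target.size();

        // lcs[i][j] = length of the longest common subsequence of source[i..] and target[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = source.get(i).equals(target.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both lists, following the table to decide which side to consume.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (source.get(i).equals(target.get(j))) {
                out.add("  " + source.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + source.get(i));
                i++;
            } else {
                out.add("+ " + target.get(j));
                j++;
            }
        }
        while (i < n) {
            out.add("- " + source.get(i++)); // remaining source lines were deleted
        }
        while (j < m) {
            out.add("+ " + target.get(j++)); // remaining target lines were added
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> source = List.of("alpha", "beta", "gamma");
        List<String> target = List.of("alpha", "gamma", "delta");
        diff(source, target).forEach(System.out::println);
    }
}
```

For the sample input, the sketch reports "beta" as deleted and "delta" as added, with "alpha" and "gamma" unchanged. A real document comparison also has to deal with formatting, styles, and structure, which is where a dedicated library earns its keep.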

Compare PDF Files for Text Difference using Java

We can compare PDF files using the same code as above, just by changing the file extensions to “.pdf”. The code below compares two PDF files and shows the differences in Java.

// Compare two PDF files using streams
Comparer comparer = new Comparer(new FileInputStream("source.pdf"));
try {
    comparer.add(new FileInputStream("target.pdf"));
    comparer.compare(new FileOutputStream("result.pdf"));
} finally {
    comparer.dispose();
}

Below is the outcome after comparing the PDF files.

PDF File Text Comparison

See Also

Many other open-source examples are publicly available in the GitHub repository. You may download and quickly run the examples using the getting started guide. If you have any questions, consult the documentation or reach us at any time on the forum.