Today we will look at ways to programmatically remove, or entirely clean, metadata from documents and images using C#. In an earlier post, we discussed removing selective as well as all available metadata properties from documents and images using Java. It is sometimes important to hide personal information attached to a document from its recipient. The following topics will help you clean metadata from your files using C#.
- .NET Metadata Cleaner API
- Remove Metadata from Documents using C#
- Clean Metadata from Images using C#
- Remove Selective Metadata from Documents and Images using C#
.NET Metadata Cleaner API
To achieve this, I will use the GroupDocs.Metadata for .NET API, which allows .NET developers to add, modify, extract, remove, or completely clean metadata from many supported formats of documents, images, and other files. The API supports metadata standards like EXIF, XMP, IPTC, ID3 tags, etc. You may download the DLLs or MSI installer, or install it via NuGet.
Install-Package GroupDocs.Metadata
Remove Metadata from Documents using C#
To remove all metadata properties without applying any specific filter, use the Sanitize method. The following steps clean metadata from documents like DOCX, PDF, XLSX, etc. using GroupDocs.Metadata for .NET.
- Create a Metadata class object, passing the path of the target document as the parameter.
- Call the Sanitize method to clear all available metadata. It returns the number of metadata properties removed.
- Call the Save method to save the output file with the metadata removed.
The following C# code sample shows how to remove and clear metadata from a PDF document.
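A minimal sketch of these steps might look like the following, assuming the GroupDocs.Metadata NuGet package is installed. The file paths are placeholders chosen for illustration, not names from the API.

```csharp
using System;
using GroupDocs.Metadata;

class RemovePdfMetadata
{
    static void Main()
    {
        // Placeholder paths (assumptions); replace with real file locations.
        const string inputPath = "input.pdf";
        const string outputPath = "output-clean.pdf";

        // Load the document; the using block disposes the Metadata object when done.
        using (Metadata metadata = new Metadata(inputPath))
        {
            // Sanitize removes the metadata properties the API detects
            // and returns how many were removed.
            int removed = metadata.Sanitize();
            Console.WriteLine("Properties removed: {0}", removed);

            // Write the cleaned document to a new file, leaving the source untouched.
            metadata.Save(outputPath);
        }
    }
}
```

Saving to a separate output path keeps the original file intact, which is handy when you want to compare the two afterwards.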
Clean Metadata from Images using C#
Whether you want to remove metadata from documents or from image files, the process is the same. Only the source file changes.
- Create an object of the Metadata class, passing the image path as the parameter.
- Call the Sanitize method to remove any available metadata properties.
- Save the output file using the Save method.
The following C# code sample shows how to remove metadata from a JPG image.
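As a rough sketch under the same assumptions (GroupDocs.Metadata installed, placeholder file names), the steps can be wrapped in a small helper. The `CleanFile` method name is hypothetical and introduced only for this example.

```csharp
using System;
using GroupDocs.Metadata;

class RemoveImageMetadata
{
    // Hypothetical helper: cleans all detected metadata from one file
    // and returns the number of properties removed.
    static int CleanFile(string inputPath, string outputPath)
    {
        using (Metadata metadata = new Metadata(inputPath))
        {
            int removed = metadata.Sanitize();
            metadata.Save(outputPath);
            return removed;
        }
    }

    static void Main()
    {
        // Placeholder paths (assumptions); replace with real file locations.
        int removed = CleanFile("photo.jpg", "photo-clean.jpg");
        Console.WriteLine("Properties removed: {0}", removed);
    }
}
```

Because the calls do not depend on the file type, the same helper should work for documents as well as images.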
Remove Selective Metadata from Documents and Images using C#
Sometimes you do not need to remove all available metadata from a file and only want to remove selected metadata properties. The following steps allow you to locate and remove the targeted metadata properties using the specific name of the property.
- Create an object of the Metadata class to load the source document or image file.
- Create personalized specifications to find the metadata properties.
- Call the RemoveProperties method with the created personalized specifications.
- Save the output file using the Save method.
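The steps above can be sketched as follows. This assumes the predicate-based form of RemoveProperties, where the search criteria are passed as a lambda, and the `Tags.Person.Creator` and `Tags.Person.Editor` tags from the GroupDocs.Metadata.Tagging namespace. Depending on the API version, the criteria may instead need to be built from specification classes, so treat this as an illustration rather than the definitive call. The file paths are placeholders.

```csharp
using System;
using GroupDocs.Metadata;
using GroupDocs.Metadata.Tagging;

class RemoveSelectedMetadata
{
    static void Main()
    {
        // Placeholder paths (assumptions); replace with real file locations.
        const string inputPath = "input.docx";
        const string outputPath = "output-selective.docx";

        using (Metadata metadata = new Metadata(inputPath))
        {
            // Assumption: RemoveProperties accepts a predicate over metadata properties.
            // Here it targets properties tagged as the document creator
            // or as the last editor, and leaves everything else in place.
            int removed = metadata.RemoveProperties(
                p => p.Tags.Contains(Tags.Person.Creator) ||
                     p.Tags.Contains(Tags.Person.Editor));
            Console.WriteLine("Properties removed: {0}", removed);

            // Save the result to a new file.
            metadata.Save(outputPath);
        }
    }
}
```

Matching on tags rather than on format-specific property names lets one predicate cover several file formats, since each format may store the author under a different name.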
Conclusion
We learned how to remove metadata from documents and images using C#. After going through this article, you should feel comfortable building your own metadata cleaner application using .NET. Such an application can support metadata removal from MS Word documents, spreadsheets, presentations, PDF files, images, emails, eBooks, drawings, zip files, and many more file formats supported by the API.
You may further explore the .NET Metadata Manipulation API from the documentation.