Render Word documents as Clean HTML using C#

Cleaning and minifying HTML reduces page load time and bandwidth usage. Some conversion tools inject unnecessary markup when they convert a document to HTML. You can get rid of this unwanted code within your .NET applications. This article discusses how to render Word documents to minified HTML using C#.


.NET API to Render as Minified HTML

GroupDocs.Viewer provides a document viewing API that renders various document formats to HTML, PDF, and image formats within .NET applications. I will use this API in the examples to convert a DOCX file into a clean HTML file.

You can download the DLLs or MSI installer from the downloads section or install the API in your .NET application via NuGet.

PM> Install-Package GroupDocs.Viewer

Render Word DOC/DOCX to Minified HTML using C#

HTML output can be produced with either embedded or external resources, using the corresponding methods of the HtmlViewOptions class. The following steps show how to convert a Word document (DOC/DOCX) into minified HTML using C#.

  • Load the DOCX file using the Viewer class.
  • Prepare the HTML rendering options using the HtmlViewOptions class.
  • Enable the Minify option by setting it to true.
  • Call the View() method with the created options to render the DOCX file as minified HTML.

The following C# code example renders the Word DOCX file into minified HTML.
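
A minimal sketch of these steps, assuming the GroupDocs.Viewer NuGet package is installed. The file names document.docx and page_{0}.html are placeholders chosen for illustration, not values from this article.

```csharp
using GroupDocs.Viewer;
using GroupDocs.Viewer.Options;

class Program
{
    static void Main()
    {
        // Placeholder paths (assumptions): adjust to your own input and output locations.
        const string inputPath = "document.docx";
        const string outputPathFormat = "page_{0}.html"; // {0} is replaced with the page number

        // Load the Word document using the Viewer class.
        using (Viewer viewer = new Viewer(inputPath))
        {
            // Prepare HTML rendering options; resources are embedded into each page.
            HtmlViewOptions viewOptions = HtmlViewOptions.ForEmbeddedResources(outputPathFormat);

            // Enable minification of the generated HTML.
            viewOptions.Minify = true;

            // Render the document pages to HTML files.
            viewer.View(viewOptions);
        }
    }
}
```

This sketch writes one HTML file per page. If you prefer images, fonts, and stylesheets saved as separate files, HtmlViewOptions.ForExternalResources() can be used in place of ForEmbeddedResources(); the Minify setting applies in the same way.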

Get a Free API License

You can use the API without evaluation limitations by getting a free temporary license.
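
Once you have a license file, it is typically applied once at application startup, before any document is rendered. The sketch below assumes the license is stored on disk; the file name GroupDocs.Viewer.lic is a hypothetical placeholder.

```csharp
using GroupDocs.Viewer;

class LicenseSetup
{
    static void Main()
    {
        // Hypothetical path to the temporary license file (assumption).
        const string licensePath = "GroupDocs.Viewer.lic";

        // Apply the license before creating any Viewer instances.
        License license = new License();
        license.SetLicense(licensePath);
    }
}
```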

Conclusion

To sum up, we discussed how to render DOC/DOCX files as minified HTML using C#. You can build your own online converter and cleaner that lets users convert documents to minified HTML. You can also learn more about GroupDocs.Viewer for .NET from its documentation. For queries, contact us via the forum.
