How to Convert MS Word Document to HTML using C#

By  RRaveen   On  24 Dec 2010 07:12:34
Tags: Office Development, Miscellaneous

Introduction

In this article I will show how to convert a Microsoft Word document to HTML. The conversion uses the Microsoft Word Object Library, a COM library that is installed with Microsoft Office and can be used through COM interop from version 2.0 or later of the .NET Framework.

Implementation

As usual, create a Windows Forms application and add two buttons (Browse and Convert), a TextBox, and an OpenFileDialog. The OpenFileDialog lets the user select the source file from a directory. The finished form looks like this:

[Image: the finished form]

The next step is essential: to convert a Word document to HTML with the Microsoft Word Object Library in C#, you must add a reference to the library in your project.

1.    Right-click your project and select Add Reference.
2.    Navigate to the “COM” tab and select Microsoft Word 12.0 Object Library, as shown below.

[Image: the Add Reference dialog with Microsoft Word 12.0 Object Library selected on the COM tab]

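Note: The version number you see depends on the installed version of Microsoft Office, not on the version of Windows. Version 12.0 corresponds to Office 2007 and version 14.0 to Office 2010 (there is no version 13).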
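
Because the installed library version differs between machines, one option is to bind to Word at run time instead of compiling against a specific version. The following is only a minimal sketch of that idea and is not part of the sample project: it assumes C# 4.0 or later (for the dynamic keyword) and an installed copy of Word, and the names LateBoundWordConverter and ConvertToHtml are made up for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper (not from the sample project): drives Word through
// late binding, so no reference to a specific Object Library version is needed.
public static class LateBoundWordConverter
{
    private const int WdFormatHtml = 8;        // WdSaveFormat.wdFormatHTML
    private const int WdDoNotSaveChanges = 0;  // WdSaveOptions.wdDoNotSaveChanges

    public static string ConvertToHtml(string sourcePath)
    {
        // "Word.Application" is the ProgID that Word registers when it is installed.
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
            throw new InvalidOperationException("Microsoft Word is not installed.");

        // Word resolves relative paths against its own working directory,
        // so pass absolute paths only.
        string fullPath = Path.GetFullPath(sourcePath);
        string htmlPath = Path.ChangeExtension(fullPath, ".htm");

        dynamic word = Activator.CreateInstance(wordType);
        dynamic document = null;
        try
        {
            word.Visible = false;
            document = word.Documents.Open(fullPath, ReadOnly: true, Visible: false);
            document.SaveAs(htmlPath, WdFormatHtml);
            return htmlPath;
        }
        finally
        {
            if (document != null)
            {
                document.Close(WdDoNotSaveChanges);
                Marshal.ReleaseComObject((object)document);
            }
            // Quit Word so that no hidden WINWORD.EXE process stays behind.
            word.Quit(WdDoNotSaveChanges);
            Marshal.ReleaseComObject((object)word);
        }
    }
}
```

The trade-off is that the compiler can no longer check member names or argument types, so mistakes only show up at run time. The rest of this article uses the early-bound reference.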

Once you have added the reference, it appears in the project's References list (as Microsoft.Office.Interop.Word), as shown below.

[Image: the project's References list]

Next, write the code that selects the source file. Double-click the Browse button on the form and write the following code:

C# Code

private void btnOpen_Click(object sender, EventArgs e)
{
    // Show the file picker and copy the chosen path into the text box
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        txtPath.Text = openFileDialog1.FileName;
    }
}
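
As written, the dialog lists every file type. If you want it to offer only Word documents, you can set a filter before showing it. This is a small sketch under my own assumptions: the helper name WordFileDialogSketch and the filter text are illustrative and are not part of the sample project.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper (not from the sample project).
public static class WordFileDialogSketch
{
    // Restricts an OpenFileDialog to Word documents.
    public static void ConfigureForWord(OpenFileDialog dialog)
    {
        dialog.Title = "Select a Word document";
        // Format: description|pattern list, repeated for each filter entry
        dialog.Filter = "Word documents (*.doc;*.docx)|*.doc;*.docx|All files (*.*)|*.*";
        dialog.CheckFileExists = true;  // reject typed paths that do not exist
        dialog.Multiselect = false;     // the converter handles one file at a time
    }
}
```

You could call WordFileDialogSketch.ConfigureForWord(openFileDialog1) once in the form's constructor, after InitializeComponent().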

Now write the code that converts the selected Word document to HTML. Double-click the Convert button and write the following code:

C# Code

// Requires these using directives at the top of the file:
// using System.IO;
// using System.Runtime.InteropServices;
// using Microsoft.Office.Interop.Word;
private void btnConvert_Click(object sender, EventArgs e)
{
    object missingType = Type.Missing;
    object readOnly = true;
    object isVisible = false;
    object destinationDocFormat = WdSaveFormat.wdFormatHTML; // 8 is the HTML format
    object sourceFile = txtPath.Text;
    object htmlFilePath = Path.Combine(Path.GetDirectoryName(txtPath.Text),
                                       Path.GetFileNameWithoutExtension(txtPath.Text) + ".htm");

    ApplicationClass appclass = new ApplicationClass();
    Document document = null;
    try
    {
        // Open the Word document in the background
        appclass.Visible = false;
        document = appclass.Documents.Open(ref sourceFile,   // FileName
                                           ref missingType,  // ConfirmConversions
                                           ref readOnly,     // ReadOnly
                                           ref missingType, ref missingType, ref missingType,
                                           ref missingType, ref missingType, ref missingType,
                                           ref missingType,  // Format
                                           ref missingType,  // Encoding
                                           ref isVisible,    // Visible
                                           ref missingType, ref missingType, ref missingType,
                                           ref missingType);

        // Save a copy in HTML format next to the source document
        document.SaveAs(ref htmlFilePath, ref destinationDocFormat, ref missingType,
                        ref missingType, ref missingType, ref missingType,
                        ref missingType, ref missingType, ref missingType,
                        ref missingType, ref missingType, ref missingType,
                        ref missingType, ref missingType, ref missingType,
                        ref missingType);
    }
    finally
    {
        if (document != null)
        {
            // Cast to _Document because Close is also the name of a document event
            ((_Document)document).Close(ref missingType, ref missingType, ref missingType);
            Marshal.ReleaseComObject(document);
        }
        // Quit Word so that no hidden WINWORD.EXE process is left running
        ((_Application)appclass).Quit(ref missingType, ref missingType, ref missingType);
        Marshal.ReleaseComObject(appclass);
    }
}
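
The handler starts Word even when the text box is empty or the path is wrong, so it can be worth checking the input first. Here is a minimal sketch of such a check. The class SourceFileCheck, its messages, and the decision to accept only .doc and .docx are my assumptions and are not part of the sample project.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the sample project).
public static class SourceFileCheck
{
    // Returns null when the path looks convertible, otherwise a message for the user.
    public static string Validate(string path)
    {
        if (string.IsNullOrEmpty(path))
            return "Please select a Word document first.";
        if (!File.Exists(path))
            return "The selected file does not exist.";

        string extension = Path.GetExtension(path);
        bool isWordFile = extension.Equals(".doc", StringComparison.OrdinalIgnoreCase)
                       || extension.Equals(".docx", StringComparison.OrdinalIgnoreCase);
        return isWordFile ? null : "Only .doc and .docx files are supported by this sketch.";
    }
}
```

In the Convert button's click handler you could call SourceFileCheck.Validate(txtPath.Text) first and, when it returns a message, show it with MessageBox.Show and return before Word is started.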

That's all. Now run the application, select a Word file, and click the Convert button; the converted HTML file is written to the same folder as the source document. One more tip: Word can save a document in many other formats as well. The WdSaveFormat enumeration lists all of them:

public enum WdSaveFormat
{
    wdFormatDocument97 = 0,
    wdFormatDocument = 0,
    wdFormatTemplate97 = 1,
    wdFormatTemplate = 1,
    wdFormatText = 2,
    wdFormatTextLineBreaks = 3,
    wdFormatDOSText = 4,
    wdFormatDOSTextLineBreaks = 5,
    wdFormatRTF = 6,
    wdFormatEncodedText = 7,
    wdFormatUnicodeText = 7,
    wdFormatHTML = 8,
    wdFormatWebArchive = 9,
    wdFormatFilteredHTML = 10,
    wdFormatXML = 11,
    wdFormatXMLDocument = 12,
    wdFormatXMLDocumentMacroEnabled = 13,
    wdFormatXMLTemplate = 14,
    wdFormatXMLTemplateMacroEnabled = 15,
    wdFormatDocumentDefault = 16,
    wdFormatPDF = 17,
    wdFormatXPS = 18,
    wdFormatFlatXML = 19,
    wdFormatFlatXMLMacroEnabled = 20,
    wdFormatFlatXMLTemplate = 21,
    wdFormatFlatXMLTemplateMacroEnabled = 22,
}
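
To save in one of these other formats, the output file also needs a matching extension. The sketch below shows one way to derive it. The numeric values come from the listing above, but the class SaveFormatSketch, its method names, and the selection of formats are only illustrative.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the sample project).
public static class SaveFormatSketch
{
    // Numeric values copied from the WdSaveFormat listing.
    public const int Text = 2;
    public const int Rtf = 6;
    public const int Html = 8;
    public const int WebArchive = 9;
    public const int FilteredHtml = 10;
    public const int Pdf = 17;
    public const int Xps = 18;

    // Maps a WdSaveFormat value to the usual file extension for that format.
    public static string ExtensionFor(int wdSaveFormat)
    {
        switch (wdSaveFormat)
        {
            case Text: return ".txt";
            case Rtf: return ".rtf";
            case Html:
            case FilteredHtml: return ".htm";
            case WebArchive: return ".mht";
            case Pdf: return ".pdf";
            case Xps: return ".xps";
            default:
                throw new ArgumentOutOfRangeException("wdSaveFormat", "Format not covered by this sketch.");
        }
    }

    // Builds the output path next to the source document.
    public static string TargetPath(string sourcePath, int wdSaveFormat)
    {
        return Path.ChangeExtension(sourcePath, ExtensionFor(wdSaveFormat));
    }
}
```

In the Convert handler, the chosen format value would go into the second argument of SaveAs (the file format) and the result of TargetPath into the first (the file name).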


Conclusion

In this article you have learned how to convert a Microsoft Word document to an HTML file with the Microsoft Word Object Library using C#. I hope it helps, and thank you for reading.

 
 
About Author
 
RRaveen
Occupation: Software Engineer
Company: TGS
Member Type: Gold
Location: Singapore
Joined date: 03 Jun 2009
Home Page: codegain.com
Blog Page: www.codegain.com
- B.Sc. degree in Computer Science.
- 4+ years of experience in Visual C#.NET and VB.NET.
- Obsessed with OOP-style design and programming.
- Designing and developing network security tools.
- Designing and developing a client/server application for sharing files among users in a way other than the FTP protocol.
- Designing and implementing GSM gateway applications and bulk messaging.
- Windows Mobile and Symbian programming.
- Knowledge of ERP solutions.
 
 
Comments
By: RRaveen, posted 1/13/2011 8:02:33 AM
Solution
Hi, just drag the OpenFileDialog from the Toolbox and drop it onto the form, then make sure the Browse button's click handler contains if (openFileDialog1.ShowDialog() == DialogResult.OK) { /* your code here */ }. If that does not work, please post your question on the message board and our experts will suggest a solution. Thank you.
By: chenna, posted 1/12/2011 10:49:33 PM
I am unable to use the OpenFileDialog in a Windows form
I cannot place the OpenFileDialog control on the Windows form. Please tell me how to add the control used in your example, "How to convert a Microsoft Word document to HTML in C#". Please reply as early as possible.