
I am working on an application that needs to convert .docx and .pdf files to .txt files while keeping basic formatting. I searched the internet but couldn't find any free third-party DLLs. Can anyone suggest the best way to do this, and some DLLs I could reference?

Thanks in advance.

Navin 431

2 Answers


http://support.microsoft.com/kb/316383 describes what you want to do with .docx files very well. http://visualbasic.about.com/od/quicktips/qt/disppdf.htm describes the same, but with .pdf files.

Once you have read the files into your code, write the text out to a .txt file using VB.NET's built-in file-writing functions.
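As a minimal sketch of that last step, assuming the text has already been extracted into a string: the module name, the sample text, and the output file name below are all made up for illustration, and `File.WriteAllText` from `System.IO` is only one of several ways to write a file in VB.NET.

```vbnet
Imports System.IO
Imports System.Text

Module SaveTextExample
    Sub Main()
        ' Placeholder for whatever text was read from the .docx or .pdf file.
        Dim extractedText As String = "First paragraph." & Environment.NewLine & "Second paragraph."

        ' Hypothetical output location: a file in the user's temp folder.
        Dim outputPath As String = Path.Combine(Path.GetTempPath(), "converted.txt")

        ' Creates the file, or overwrites it if it already exists.
        File.WriteAllText(outputPath, extractedText, Encoding.UTF8)

        Console.WriteLine("Wrote " & outputPath)
    End Sub
End Module
```

Plain text cannot carry fonts or styles, so any "basic formatting" would have to be expressed through line breaks and spacing in the string before it is written.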

Shivam Sarodia

The code below will do the job for you. It is something I wrote for the big boss, haha. I hope it helps. It is a VBA macro meant to be run from Excel: it reads the first cell of the worksheet (A1) as the folder containing the .docx files, then converts them to .txt files one by one, saving each in the same folder.

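Sub ConvertDocxToTxt()
    ' Run from Excel: cell A1 of the active sheet holds the folder with the .docx files.
    Const wdFormatText = 2

    Dim folderPath As String
    Dim StrFile As String
    Dim objWord As Object
    Dim objDoc As Object

    folderPath = Cells(1, "A").Value
    If Len(folderPath) = 0 Or Dir(folderPath, vbDirectory) = "" Then
        MsgBox "Invalid Folder"
        Exit Sub
    End If

    ' Start Word once and reuse it for every file.
    Set objWord = CreateObject("Word.Application")

    StrFile = Dir(folderPath & "\*.docx")
    Do While Len(StrFile) > 0
        ' Open read-only (third argument) without the conversion prompt (second argument).
        Set objDoc = objWord.Documents.Open(folderPath & "\" & StrFile, False, True)
        ' The output keeps the original name, e.g. report.docx becomes report.docx.txt.
        objDoc.SaveAs folderPath & "\" & StrFile & ".txt", wdFormatText
        ' Close the document without saving changes before moving to the next file.
        objDoc.Close False
        StrFile = Dir
    Loop

    objWord.Quit
End Sub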
fatih