Is there a way to upgrade Word documents to 2010?
Scenario: I have about 14,000 Word documents that need to be converted from "Microsoft Word 97 - 2003 Document" to "Microsoft Word Document". In other words, upgraded to the 2010 format (.docx).
Question: Is there an easy way to do this using APIs or something similar?
Note: I've only been able to find a Microsoft program that converts the documents to .docx, but they still open in compatibility mode. It would be nice if they could just be converted to the new format, the same functionality you get when you open an old document and Word gives you the option to convert it.
Edit: Just found http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word._document.convert.aspx and I'm looking into how to use it.
EDIT2: This is my current function for converting the documents:
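Private Sub btnConvert_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnConvert.Click
    FolderBrowserDialog1.ShowDialog()
    If Not String.IsNullOrEmpty(FolderBrowserDialog1.SelectedPath) Then
        lstFiles.Clear()
        ' DirSearch fills lstFiles with every document found under the selected folder
        DirSearch(FolderBrowserDialog1.SelectedPath)
        ' Intended to run one conversion at a time. Note: SetMaxThreads returns False
        ' and changes nothing when the value is below the number of processors.
        ThreadPool.SetMaxThreads(1, 1)
        ' Skip files that are already in .docx format
        lstFiles.RemoveAll(Function(y) y.EndsWith(".docx", StringComparison.OrdinalIgnoreCase))
        TextBox1.Text += "Conversion started at " & DateTime.Now().ToString & Environment.NewLine
        For Each x In lstFiles
            ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ConvertDoc), x)
        Next
    End If
End Sub

Private Sub ConvertDoc(ByVal path As String)
    ' A new Word instance is started for every document
    Dim word As New Microsoft.Office.Interop.Word.Application
    Dim doc As Microsoft.Office.Interop.Word.Document = Nothing
    word.Visible = False
    Try
        Debug.Print(path)
        doc = word.Documents.Open(path, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing)
        ' Upgrade the open document to the newest file format
        doc.Convert()
    Catch ex As Exception
        ' Log the failure instead of silently ignoring it
        Debug.Print("Failed to convert " & path & ": " & ex.Message)
    Finally
        ' doc is still Nothing if Open failed, so guard the call
        If doc IsNot Nothing Then doc.Close()
        word.Quit()
    End Try
End Sub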
It lets me select a path and then finds all .doc files within the subfolders. That code isn't important; all the files for conversion are in lstFiles. The only problem at the moment is that it takes a really long time to convert even just 10 documents. Should I be using one Word application per document instead of reusing it? Any suggestions?
Also, Word opens after about 2 or 3 conversions and starts flashing, but it keeps converting.
EDIT3: Tweaked the code above a little and it runs cleaner. It still takes 1 min 10 sec to convert 8 files, though. Considering I have 14,000 to convert, this method will take a very long time.
EDIT4: Changed the code again. It uses a thread pool now and seems to run a bit faster. I still need to run it on a better computer to convert all the documents, or do them slowly, folder by folder. Can anyone think of any other way to optimize this?
Comments (4)
I ran your code locally with only minor edits for better tracing and timing, and it "only" took 13.73 seconds to process 12 files. At that rate (about 1.14 seconds per file), your 14,000 files would take roughly 4.5 hours. I'm running Visual Studio 2010 on Windows 7 x64 with a dual-core processor. Perhaps you can just use a faster computer?
Here's my full code; it's just a form with a single button, Button1, and a FolderBrowserDialog, FolderBrowserDialog1:
Use Word automation: open each document and save it with the WdSaveFormat enumeration value wdFormatDocumentDefault, which should produce a .docx file.
http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdsaveformat%28v=office.14%29.aspx
Or try your hand at the Convert method you mentioned. Either way, it's 100% possible and should be fairly easy.
Edit: if the converter Daniel posted works, that's far easier and he deserves all the credit : )
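A minimal sketch of the Word automation approach, not a tested implementation: it assumes the Office 2010 interop assembly (Microsoft.Office.Interop.Word, version 14) is referenced, reuses one Word instance for every file instead of starting one per document, calls Convert, and then saves with wdFormatDocumentDefault. The module name DocxUpgradeSketch and the helper UpgradeAll are hypothetical names introduced for illustration, and writing the .docx next to the source file is an assumption.

```vb
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.IO
Imports Word = Microsoft.Office.Interop.Word

Module DocxUpgradeSketch
    ' Hypothetical helper: upgrades each file in paths with ONE shared Word instance.
    Sub UpgradeAll(ByVal paths As IEnumerable(Of String))
        Dim app As New Word.Application()
        app.Visible = False
        ' Suppress dialogs that would otherwise block an unattended run
        app.DisplayAlerts = Word.WdAlertLevel.wdAlertsNone
        Try
            For Each sourcePath As String In paths
                Dim doc As Word.Document = Nothing
                Try
                    doc = app.Documents.Open(CObj(sourcePath))
                    ' Upgrade the in-memory document to the newest file format
                    doc.Convert()
                    ' Assumption: save next to the original, with a .docx extension
                    Dim targetPath As String = Path.ChangeExtension(sourcePath, ".docx")
                    doc.SaveAs2(FileName:=CObj(targetPath),
                                FileFormat:=CObj(Word.WdSaveFormat.wdFormatDocumentDefault))
                Catch ex As Exception
                    ' Keep going: one bad file should not stop the whole batch
                    Debug.Print("Failed: " & sourcePath & " (" & ex.Message & ")")
                Finally
                    ' The document was already saved above, so discard pending changes
                    If doc IsNot Nothing Then
                        doc.Close(SaveChanges:=CObj(Word.WdSaveOptions.wdDoNotSaveChanges))
                    End If
                End Try
            Next
        Finally
            ' Quit once, after all documents are processed
            app.Quit()
        End Try
    End Sub
End Module
```

Calling UpgradeAll(lstFiles) from a single background thread would keep all COM calls on one Word instance; whether that is actually faster than one instance per document would need to be measured on a small folder first.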
You can use the free Office File Converter.
The settings are explained here:
http://technet.microsoft.com/en-us/library/cc179019.aspx
There is a file list setting.
Try this: