Automating the .doc to .htm process in Word
Question
We inherited an older project from another company, and this project has a "help" index made up of .htm files that were converted from .doc files. The issue is that their team exported all of these files in a very outdated, unsupported encoding, so they are packed with random substitute characters where special characters should be.
Eventually we will replace this system with one that is MUCH easier to use and develop, but given that the product came with a large userbase, we need to fix this in the meantime. Is there some automation tool for this (one that still works today; I've tried a couple of older VB scripts), or am I going to need to manually re-export a few hundred docs today? (It's not necessarily a huge issue, but there are other things I think my time would be better spent on today.)
To be very clear: I have a folder full of .doc files that need to be re-saved as .htm files with UTF-8 encoding.
What I've tried
I've been digging through several SO posts trying various solutions. My current code is as follows:
Sub ChangeDocsToTxtOrRTFOrHTML()
    Dim locFolder As String
    Dim fileType As String
    Dim oFolder As Object
    Dim tFolder As Object
    Dim fs As Object
    locFolder = "C:\Users\ColeD\Desktop\Help Files Angular"
    fileType = ".htm"
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set oFolder = fs.GetFolder(locFolder)
    Set tFolder = fs.GetFolder(locFolder & "Converted")
    For Each oFile In oFolder.Files
        MsgBox ("hrtr!")
        Dim d As Document
        Set d = Application.Documents.Open(oFile.Path)
        strDocName = ActiveDocument.Name
        intPos = InStrRev(strDocName, ".")
        strDocName = Left(strDocName, intPos - 1)
        strDocName = strDocName & fileType
        ChangeFileOpenDirectory tFolder
        ActiveDocument.SaveAs2 FileName:=strDocName & fileType, _
            FileFormat:=wdFormatHTML, _
            Encoding:=msoEncodingUTF8
        d.Close
        ChangeFileOpenDirectory oFolder
    Next oFile
    MsgBox ("Done!")
End Sub
The issue is that it opens only one file and then stops.
Comments (1)
It looks like you are using code copied from "Convert multiple Word documents to HTML files using VBA".
However, you need to adapt that code to your scenario, which involves only HTML and not the other file types. See the example below, which focuses on .docx to HTML.
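As a rough illustration of what an HTML-only adaptation might look like, here is a minimal sketch. It is not the code from the linked post. The macro name `ConvertDocsToHtml`, the `Converted` subfolder, and the extension filter are assumptions made for illustration. It builds the full output path itself instead of relying on `ChangeFileOpenDirectory`, appends the `.htm` extension only once, and works on the document object returned by `Documents.Open` rather than on `ActiveDocument`.

```vba
Option Explicit

' Sketch: save every .doc/.docx in a folder as UTF-8 encoded HTML.
' Intended to be run from the Word VBA editor.
' The folder layout below is an assumption; adjust the paths as needed.
Sub ConvertDocsToHtml()
    Dim srcFolder As String
    Dim dstFolder As String
    Dim fs As Object
    Dim oFile As Object
    Dim ext As String
    Dim d As Document

    srcFolder = "C:\Users\ColeD\Desktop\Help Files Angular"
    dstFolder = srcFolder & "\Converted"   ' assumed output subfolder

    Set fs = CreateObject("Scripting.FileSystemObject")
    ' Create the output folder if it does not exist yet
    If Not fs.FolderExists(dstFolder) Then fs.CreateFolder dstFolder

    For Each oFile In fs.GetFolder(srcFolder).Files
        ext = LCase(fs.GetExtensionName(oFile.Name))
        ' Skip non-Word files and Word's "~$" owner/lock files
        If (ext = "doc" Or ext = "docx") And Left(oFile.Name, 2) <> "~$" Then
            Set d = Application.Documents.Open(FileName:=oFile.Path, _
                AddToRecentFiles:=False)
            ' Full target path: same base name, .htm extension added once
            d.SaveAs2 FileName:=dstFolder & "\" & fs.GetBaseName(oFile.Name) & ".htm", _
                FileFormat:=wdFormatHTML, _
                Encoding:=msoEncodingUTF8
            d.Close SaveChanges:=wdDoNotSaveChanges
        End If
    Next oFile

    MsgBox "Done!"
End Sub
```

If the generated markup turns out to be too heavy, `wdFormatFilteredHTML` could be tried in place of `wdFormatHTML`; whether the filtered output suits the existing help index would need to be checked against a few sample files.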