How to get text from a line number in MS Word
Is it possible to get text (a line or a sentence) from a given line number in MS Word using Office automation? It is fine if I can get either the text on the given line or the sentence(s) that the line is part of.
I am not providing any code because I have no idea how an MS Word document is read using Office automation. I can open the file like this:
var wordApp = new ApplicationClass();
wordApp.Visible = false;
object file = path;
object misValue = Type.Missing;
Word.Document doc = wordApp.Documents.Open(ref file, ref misValue, ref misValue,
    ref misValue, ref misValue, ref misValue,
    ref misValue, ref misValue, ref misValue,
    ref misValue, ref misValue, ref misValue);
// ...and the rest of the code, given that I have a line number = 3?
Edit: To address @Richard Marskell - Drackir's doubt: although the text in MS Word is one long string, Office automation can still report line numbers. In fact, I get the line number itself from another piece of code, like this:
Word.Revision rev = //SomeRevision
object lineNo = rev.Range.get_Information(Word.WdInformation.wdFirstCharacterLineNumber);
For instance, say the Word file looks like this:
fix grammatical or spelling errors
clarify meaning without changing it correct minor mistakes add related resources or links
always respect the original author
Here there are 4 lines.
Comments (3)
Fortunately, after some epic searching, I found a solution.
Here's the genius behind the code. Follow the link for more explanation of how it works.
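The code from this answer did not survive on this page, so what follows is not the linked author's code. It is only a minimal sketch of one common way to do this with Word interop: jump to the line with Selection.GoTo, then extend the selection to the end of that line with Selection.EndKey. The class and method names (WordLineReader, GetLineText) are made up for illustration. It assumes the document you opened is the active one, and that you want line numbers as Word currently lays the document out (they can change with page size, fonts, or printer settings).

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the Word object model.
static class WordLineReader
{
    // Returns the text of the given 1-based line of the active document.
    public static string GetLineText(Word.Application wordApp, int lineNumber)
    {
        object what = Word.WdGoToItem.wdGoToLine;
        object which = Word.WdGoToDirection.wdGoToAbsolute;
        object count = lineNumber;
        object missing = Type.Missing;

        // Move the insertion point to the start of the requested line.
        wordApp.Selection.GoTo(ref what, ref which, ref count, ref missing);

        // Extend the selection from there to the end of the same line.
        object unit = Word.WdUnits.wdLine;
        object extend = Word.WdMovementType.wdExtend;
        wordApp.Selection.EndKey(ref unit, ref extend);

        return wordApp.Selection.Text;
    }
}
```

With the wordApp and doc from the question, this would be called as string text = WordLineReader.GetLineText(wordApp, 3); after the document is opened. If you want the surrounding sentence(s) rather than the raw line, wordApp.Selection.Sentences on the same selection may be a starting point, though I have not checked how it behaves when a sentence spans several lines.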
Use this if you want to read standard .txt text files.
Here is something you can use to read the file with one call:
If you want to loop through and inspect the items that were returned, use something like this:
or
I totally forgot that Word docs are probably in binary format, so look at this: read the contents into a RichTextBox, and from there you can either get at the line number you want or load the contents into a list afterwards. This link will show you how:
Reading from a Word Doc
If you want to read the XML format of the Word document:
Here is another good link to check out:
ReadXML Format of a Word Document
This one is an even easier example that reads the contents into the clipboard:
Load Word into ClipBoard