Replace text in a Word file using Open XML in MVC and return a memory stream
I have a Word file that contains text matching a pattern I specify, {pattern}, and I want to replace each occurrence with a new string read from a database. I use Open XML to read the stream from my docx template file, replace the pattern string, and then return the result as a stream so the file can be downloaded without creating a temporary file. But when I open the downloaded docx file, it reports an error. Below is my example code:
public ActionResult SearchAndReplace(string FilePath)
{
    MemoryStream mem = new MemoryStream(System.IO.File.ReadAllBytes(FilePath));
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(mem, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }
        Regex regexText = new Regex("Hello world!");
        docText = regexText.Replace(docText, "Hi Everyone!");
        // Instead of using the code below to write the text back to the original file,
        // I write the new string to the memory stream and return it as a file download.
        //using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        //{
        //    sw.Write(docText);
        //}
        using (StreamWriter sw = new StreamWriter(mem))
        {
            sw.Write(docText);
        }
    }
    mem.Seek(0, SeekOrigin.Begin);
    return File(mem, "application/octet-stream", "download.docx"); // Return as a file download
}
Please suggest a solution other than reading the text from the Word file, replacing the pattern text, and writing the data back to the original file. Is there a way to replace text with the WordprocessingDocument library? How can I return a memory stream that contains a valid docx file?
Comments (3)
The approach you are taking is not correct. If, by chance, the pattern you are searching for matches some Open XML markup, you will corrupt the document. If the text you are searching for is split over multiple runs, your search/replace code will not find the text and will not operate correctly. If you want to search and replace text in a WordprocessingML document, there is a fairly easy algorithm that you can use:
- Break all runs into runs of a single character. This includes runs that have special characters such as a line break, carriage return, or hard tab.
- It is then pretty easy to find a set of runs that match the characters in your search string.
- Once you have identified a set of runs that match, you can replace that set of runs with a newly created run (which has the run properties of the run containing the first character that matched the search string).
- After replacing the single-character runs with a newly created run, you can then consolidate adjacent runs with identical formatting.
I've written a blog post and recorded a screen-cast that walks through this algorithm.
Blog post: http://openxmldeveloper.org/archive/2011/05/12/148357.aspx
Screen cast: http://www.youtube.com/watch?v=w128hJUu3GM
-Eric
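The steps above can be sketched roughly as follows. This is a minimal illustration written with LINQ to XML against a single `w:p` element, not the code from the blog post; the names `RunReplaceSketch`, `ReplaceInParagraph`, `TextOf`, and `MakeRun` are hypothetical. It assumes the runs are direct children of the paragraph and contain only `w:rPr` and `w:t`; runs holding breaks, tabs, or other content are left untouched and simply interrupt a match, which is a simplification of the first step.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RunReplaceSketch
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Text of a run that holds only w:rPr and w:t children; null for any other run.
    static string TextOf(XElement run)
    {
        if (!run.Elements(W + "t").Any() ||
            run.Elements().Any(e => e.Name != W + "rPr" && e.Name != W + "t"))
            return null;
        return string.Concat(run.Elements(W + "t").Select(t => t.Value));
    }

    // New run with a copy of the given run properties (may be null) and the given text.
    static XElement MakeRun(XElement props, string text) =>
        new XElement(W + "r",
            props == null ? null : new XElement(props),
            new XElement(W + "t",
                new XAttribute(XNamespace.Xml + "space", "preserve"), text));

    public static void ReplaceInParagraph(XElement paragraph, string search, string replacement)
    {
        if (string.IsNullOrEmpty(search)) return;

        // Step 1: break each plain text run into single-character runs.
        foreach (XElement run in paragraph.Elements(W + "r").ToList())
        {
            string text = TextOf(run);
            if (string.IsNullOrEmpty(text)) continue;
            XElement props = run.Element(W + "rPr");
            run.ReplaceWith(text.Select(c => MakeRun(props, c.ToString())).ToList());
        }

        // Step 2: look for consecutive runs that spell out the search string.
        List<XElement> runs = paragraph.Elements(W + "r").ToList();
        int i = 0;
        while (i + search.Length <= runs.Count)
        {
            bool match = true;
            for (int j = 0; j < search.Length && match; j++)
                match = TextOf(runs[i + j]) == search[j].ToString();
            if (!match) { i++; continue; }

            // Step 3: replace the matched runs with one new run that takes the
            // run properties of the first matched character.
            runs[i].AddBeforeSelf(MakeRun(runs[i].Element(W + "rPr"), replacement));
            for (int j = 0; j < search.Length; j++) runs[i + j].Remove();
            i += search.Length;
        }

        // Step 4: consolidate adjacent text runs with identical run properties.
        XElement prev = null;
        foreach (XElement run in paragraph.Elements(W + "r").ToList())
        {
            bool adjacent = prev != null && run.ElementsBeforeSelf().LastOrDefault() == prev;
            if (adjacent && TextOf(prev) != null && TextOf(run) != null &&
                XNode.DeepEquals(prev.Element(W + "rPr"), run.Element(W + "rPr")))
            {
                prev.Element(W + "t").Value += TextOf(run);
                run.Remove();
            }
            else
            {
                prev = run;
            }
        }
    }
}
```

To use it, one would load the main document part's XML into an `XDocument`, call `ReplaceInParagraph` for each `w:p` descendant, and write the XML back to the part. Because the replacement is set as element text rather than spliced into markup, characters such as `<` and `&` are escaped by the XML writer.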
Writing directly to the Word document stream will indeed corrupt it.
You should instead write to the MainDocumentPart stream, but you should first truncate it. It looks like the MainDocumentPart.FeedData(Stream sourceStream) method will do just that. I haven't tested it, but this should work.
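A minimal sketch of that suggestion, assuming the Open XML SDK (`DocumentFormat.OpenXml`) is referenced; the class and method names `DocxTemplate` and `ReplaceInMainPart` are hypothetical. It only addresses the stream handling: the modified XML is fed back into the main document part instead of being written over the package stream. It still replaces text at the markup level, so the caveats from the other answer about split runs and accidental markup matches apply. It also copies the template into an expandable `MemoryStream`, since a `MemoryStream` constructed directly over a byte array cannot grow when the package is saved.

```csharp
using System.IO;
using System.Security;
using System.Text;
using DocumentFormat.OpenXml.Packaging;

public static class DocxTemplate
{
    // Returns the bytes of a new .docx in which `search` has been replaced by
    // `replacement` inside the main document part's XML.
    public static byte[] ReplaceInMainPart(byte[] template, string search, string replacement)
    {
        using (var mem = new MemoryStream())
        {
            // Copy into an expandable stream so the package can grow on save.
            mem.Write(template, 0, template.Length);

            using (WordprocessingDocument doc = WordprocessingDocument.Open(mem, true))
            {
                MainDocumentPart main = doc.MainDocumentPart;

                string xml;
                using (var reader = new StreamReader(main.GetStream()))
                {
                    xml = reader.ReadToEnd();
                }

                // Escape the replacement so characters like & and < stay valid XML.
                xml = xml.Replace(search, SecurityElement.Escape(replacement));

                // FeedData replaces the part's content with the supplied stream.
                using (var partData = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
                {
                    main.FeedData(partData);
                }
            } // Disposing the document writes the updated package back into mem.

            return mem.ToArray();
        }
    }
}
```

In the MVC action, the result could then be returned with `return File(bytes, "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "download.docx");`, which avoids both a temporary file and a stream that has already been disposed.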