Generating PDF from OpenXml
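I am trying to find an SDK that can generate PDF from OpenXml. I have used the Open XML Power Tools to convert the Open XML to HTML, and iTextSharp to parse the HTML into a PDF. But the result is a terrible-looking PDF.
I have not yet tried iText's RTF parser. If I go in that direction, I will end up needing an RTF converter, turning a simple conversion into a two-step nightmare.
It looks like I might end up writing a custom converter based on the Power Tools OpenXml-to-HTML converter. Any advice is appreciated. I can't go for a professional converter at this time, as the licenses are too expensive (Aspose Word/TxText).
I thought I would put some more effort into my investigation. I went back to the conversion utility "http://msdn.microsoft.com/en-us/library/ff628051.aspx" and looked through its code. The biggest thing it missed was reading the underlying styles and generating a style attribute. Once I added that, the PDF looked much better, with the limitation that custom TrueType fonts are not handled. More investigation tomorrow. I am hoping someone has done something like this or faced similar weird issues and can shed some light.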
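// Needs: System.Collections.Generic, System.Collections.Specialized,
// System.Globalization, System.Linq, System.Xml.Linq, and the W name class
// from the Open XML Power Tools. HasAttribute and GetAttribute are small
// helper methods that are not shown here.
private static StringDictionary GetStyle(XElement el)
{
    IEnumerable<XElement> jcL = el.Elements(W.jc);
    IEnumerable<XElement> rPL = el.Elements(W.rPr);
    StringDictionary sd = new StringDictionary();
    if (HasAttribute(jcL, W.val))
    {
        // w:jc uses "both" for justified text; left/center/right match CSS
        string jc = GetAttribute(jcL, W.val);
        sd.Add("text-align", jc == "both" ? "justify" : jc);
    }
    // run properties exist
    if (rPL.Any())
    {
        XElement r = rPL.First();
        XElement color = r.Element(W.color);
        XElement fonts = r.Element(W.rFonts);
        XElement sz = r.Element(W.sz);
        if (r.Element(W.b) != null) sd.Add("font-weight", "bold");
        if (r.Element(W.i) != null) sd.Add("font-style", "italic");
        if (r.Element(W.u) != null) sd.Add("text-decoration", "underline");
        // w:color may be "auto", which is not a valid CSS hex color
        if (color != null && HasAttribute(color, W.val) && GetAttribute(color, W.val) != "auto")
            sd.Add("color", "#" + GetAttribute(color, W.val));
        if (fonts != null)
        {
            // prefer the complex-script font, fall back to the high-ANSI font
            if (HasAttribute(fonts, W.cs)) sd.Add("font-family", GetAttribute(fonts, W.cs));
            else if (HasAttribute(fonts, W.hAnsi)) sd.Add("font-family", GetAttribute(fonts, W.hAnsi));
        }
        if (sz != null && HasAttribute(sz, W.val))
        {
            // w:sz is expressed in half-points, so halve it to get CSS points
            double pt = double.Parse(GetAttribute(sz, W.val), CultureInfo.InvariantCulture) / 2;
            sd.Add("font-size", pt.ToString(CultureInfo.InvariantCulture) + "pt");
        }
    }
    return sd.Keys.Count > 0 ? sd : null;
}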
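The post does not show how the dictionary returned by GetStyle ends up on the generated HTML. A minimal sketch of one way to do it, assuming the result is written as a single inline `style` attribute: the class name `StyleAttributeSketch` and the helpers `ToCss` and `WithStyle` are hypothetical names introduced here for illustration, not part of the Power Tools or of the original code.

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;
using System.Xml.Linq;

static class StyleAttributeSketch
{
    // Hypothetical helper: serializes the CSS declarations into one style value.
    // Returns null when there is nothing to emit (GetStyle may return null).
    public static string ToCss(StringDictionary sd)
    {
        if (sd == null || sd.Count == 0) return null;
        // StringDictionary lower-cases its keys and does not guarantee order,
        // so sort the keys to get stable output.
        var declarations = sd.Keys.Cast<string>()
                             .OrderBy(k => k, StringComparer.Ordinal)
                             .Select(k => k + ": " + sd[k]);
        return string.Join("; ", declarations);
    }

    // Hypothetical helper: adds the style attribute only when there are declarations.
    public static XElement WithStyle(XElement html, StringDictionary sd)
    {
        string css = ToCss(sd);
        if (css != null) html.SetAttributeValue("style", css);
        return html;
    }

    static void Main()
    {
        var sd = new StringDictionary();
        sd.Add("text-align", "center");
        sd.Add("font-size", "12pt");
        XElement p = WithStyle(new XElement("p", "Hello"), sd);
        Console.WriteLine(p);
        // prints: <p style="font-size: 12pt; text-align: center">Hello</p>
    }
}
```

Inline styles are chosen here only because they keep each element self-describing; whether the HTML-to-PDF step honours every property is something to verify against the parser in use.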
Comments (1)
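I don't know of any direct converter with source code available, but yes, my thought is that you may need to build a converter from scratch. Luckily (I guess), Word's WordprocessingML is the simplest of the Open XML formats, and you can look to other projects for inspiration, such as TextGlow and projects similar to it.
For commercial and server-side solutions, you can use either Word Automation Services (requires SharePoint) or Aspose.Words for .NET.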
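To give a feel for what building a converter from scratch involves, here is a minimal sketch that walks a WordprocessingML body with LINQ to XML and emits HTML. It is an illustration under stated assumptions, not a real converter: the class `WordMlToHtmlSketch` and its methods are hypothetical names, the input is a hand-written fragment rather than a part read from a .docx package, and only paragraphs, runs, and bold are handled.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class WordMlToHtmlSketch
{
    // The main WordprocessingML namespace.
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Maps each w:p to an HTML <p> and each w:r inside it to text or <b>text</b>.
    public static XElement ToHtml(XElement body)
    {
        return new XElement("div",
            body.Elements(w + "p").Select(p =>
                new XElement("p",
                    p.Elements(w + "r").Select(r => ConvertRun(r)))));
    }

    static object ConvertRun(XElement r)
    {
        string text = string.Concat(r.Elements(w + "t").Select(t => t.Value));
        // Simplification: any w:b counts as bold; a w:val of "false" or "0" is ignored.
        bool bold = r.Elements(w + "rPr").Elements(w + "b").Any();
        return bold ? (object)new XElement("b", text) : new XText(text);
    }

    static void Main()
    {
        // Hand-written fragment for illustration only.
        XElement body = XElement.Parse(
            "<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
            "<w:p><w:r><w:t>Hello, </w:t></w:r>" +
            "<w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>" +
            "</w:body>");
        Console.WriteLine(ToHtml(body).ToString(SaveOptions.DisableFormatting));
        // prints: <div><p>Hello, <b>world</b></p><p>Second paragraph</p></div>
    }
}
```

A real converter would also have to resolve styles from the styles part, plus tables, lists, images, and fonts, which is where most of the effort goes.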