HTML to RTF converter for .NET

Posted on 2024-08-17 15:14:47

Comments (4)

半边脸i 2024-08-24 15:14:47

For what it's worth, and in no particular order.

A while ago I wanted to export to RTF and then import from RTF, with the RTF in question being manipulated by MS Word in between.

The first problem is that RTF is not an open standard. It is an internal MS standard, so they alter it as and when they like and do not generally worry about compatibility. Currently the versions of RTF run from 1.3 to 1.9, and they are all different. Internally they use twips for measurement, just for good measure.
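For anyone who has not met twips before: one twip is 1/20 of a point, so there are 1440 to the inch. A small sketch of the conversions is below. The `Twips` class and its method names are made up for illustration and do not come from any library.

```csharp
using System;

// Hypothetical helper (not part of any library) for converting to RTF's twips.
public static class Twips
{
    // One twip is 1/20 of a point; 72 points make an inch, hence 1440 twips per inch.
    public const int PerPoint = 20;
    public const int PerInch = 1440;

    public static int FromPoints(double points) =>
        (int)Math.Round(points * PerPoint, MidpointRounding.AwayFromZero);

    public static int FromInches(double inches) =>
        (int)Math.Round(inches * PerInch, MidpointRounding.AwayFromZero);

    public static int FromMillimetres(double mm) =>
        FromInches(mm / 25.4);

    public static double ToPoints(int twips) => (double)twips / PerPoint;
}
```

So a one-inch left margin would be written as \margl1440. Note that not every RTF value is in twips: font sizes (\fs) are given in half-points, so \fs24 means 12 pt.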

I bought the O'Reilly pocket book on the subject, which helped, and read a lot of the MS documentation, which is good, but there is a lot of it, and lots for each version.

Because of the way RTF is coded, using regex to manipulate it is incredibly hard work and needs careful handling and concentration to test and get working. I used a Mac editor that had built-in regex, so I could steadily test each section and build it into the code.

Because of the number of versions there is also a lot of incompatibility between them, but there is a lot of commonality too, and in the end it was reasonably hard/easy to get where I wanted (after about a week's reading and a week's coding), producing a really simple version.

I never found a commercial solution, but I had to have a free one because of budget, so that cut a lot out. If you do choose one, take great care to make sure it does what you want and has support.

I'm not coming from where you are (HTML/XML/XHTML); I was converting CSV formats into RTF.

I am not sure whether I would advise DIY or buy. Probably DIY on balance, but your own circumstances will dictate that.

Edit: One thing: going from content to RTF is easier than vice versa.

BTW, I'm not criticising MS for the RTF versions; hey, it's theirs and proprietary, so they can do what they like.
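To give a feel for why the content-to-RTF direction mentioned above is the easier one, here is a minimal sketch that writes rows of already-parsed fields (for example from a CSV file) as tab-separated RTF paragraphs. It is not the code described in this answer: the `SimpleRtfWriter` class, its method names and the choice of Arial are all assumptions for illustration, and it deliberately ignores tables, styles and anything version-specific.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical minimal writer: plain fields in, a very small RTF document out.
public static class SimpleRtfWriter
{
    // Escapes plain text so it can sit safely inside an RTF group.
    public static string Escape(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c == '\\' || c == '{' || c == '}')
                sb.Append('\\').Append(c);      // RTF's three special characters
            else if (c == '\t')
                sb.Append("\\tab ");
            else if (c == '\n')
                sb.Append("\\line ");           // line break inside a paragraph
            else if (c == '\r')
                continue;                       // bare CR carries no meaning in RTF
            else if (c < 0x80)
                sb.Append(c);
            else
            {
                // \uN takes a signed 16-bit decimal; '?' is the fallback character.
                short code = unchecked((short)c);
                sb.Append("\\u")
                  .Append(code.ToString(CultureInfo.InvariantCulture))
                  .Append('?');
            }
        }
        return sb.ToString();
    }

    // Writes each row as one paragraph, with fields separated by tabs.
    public static string FromRows(IEnumerable<string[]> rows)
    {
        var sb = new StringBuilder();
        sb.Append(@"{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}");
        foreach (string[] row in rows)
        {
            for (int i = 0; i < row.Length; i++)
            {
                if (i > 0) sb.Append(@"\tab ");
                sb.Append(Escape(row[i]));
            }
            sb.Append(@"\par ");
        }
        sb.Append('}');
        return sb.ToString();
    }
}
```

Writing is a matter of escaping three characters and emitting a fixed header. Reading RTF back means coping with nested groups, destinations and control words that differ between versions, which is where the regex pain comes from.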

厌味 2024-08-24 15:14:47

I would recommend doing it yourself, as the task is not really that complex. Firstly, the easiest way to convert one XML format into another XML format is with XSLT. Converting XML documents in C# is super easy.

Here is a good MSDN blog post to get you started. Mike even mentions that it was easier to do this by hand than to deal with a third party.

link

Actually, I already answered this question here. Guess that makes this a duplicate.
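As a rough illustration of how little C# an XSLT transform needs, here is a sketch using System.Xml.Xsl.XslCompiledTransform. Since RTF is not XML, the stylesheet sets the output method to text. The stylesheet is a toy that only understands p, b and i; it is not the one from the blog post, and the `XsltDemo` class and `ToRtf` method are names invented for this example. It also assumes namespace-free input: real XHTML elements live in the XHTML namespace, so the match patterns would need a prefix.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical demo: a toy stylesheet that turns a tiny XML fragment into RTF text.
public static class XsltDemo
{
    // Curly braces are literal in XSLT text nodes, so RTF groups can be written directly.
    const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <xsl:template match='/'>{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}<xsl:apply-templates/>}</xsl:template>
  <xsl:template match='p'><xsl:apply-templates/>\par </xsl:template>
  <xsl:template match='b'>{\b <xsl:apply-templates/>}</xsl:template>
  <xsl:template match='i'>{\i <xsl:apply-templates/>}</xsl:template>
</xsl:stylesheet>";

    public static string ToRtf(string xml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(Stylesheet)))
            transform.Load(xsl);

        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // No stylesheet parameters are needed, so the argument list is null.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling XsltDemo.ToRtf("<body><p>Hello <b>world</b></p></body>") should produce a one-paragraph RTF document with "world" in bold. The toy does not escape backslashes or braces that appear in the input text, which a real stylesheet would have to handle.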

青衫负雪 2024-08-24 15:14:47

I just came across this WYSIWYG rich text editor (RTE) for the web that also has an HTML to RTF converter, Cute Editor for .NET. Does anyone have any experience with this component? My main experience with web-based RTEs has been CKEditor (FCKeditor) and TinyMCE, but as far as I can tell CKEditor and TinyMCE do not have HTML to RTF converters built in.

莫言歌 2024-08-24 15:14:47

Since I was required to implement some mail-merge capabilities with rich-text formatting in a Web application, I thought it'd be nice to share my experiences.

Basically, I explored two alternatives:

  • using the Google Docs API to leverage Google Docs capabilities
  • using XSLT, as shown in this essay

The Google Docs API works well. The problem is that when you upload an HTML document with page breaks, like this:

<p style="page-break-before:always;display:none;"/>

and ask Google to convert the doc to RTF, you lose all the breaks, which does not fit my requirements. However, if page breaks aren't an issue for you, you might check this solution out.

The XSLT solution works... sort of.

It works if you reference the MSXML3 COM object directly, bypassing the System.Xml classes. Otherwise I couldn't make it work. Moreover, it seems to honor only basic formatting and tags, disregarding text color, size and the like. However, it honors page breaks. :-)

Here's a quick library I wrote, using tidy.net to force HTML to XHTML conversion. Hope it helps.

using System;
using System.IO;
using System.Text;

namespace ADDS.Mailmerge
{
    // Converts HTML to RTF: Tidy.NET cleans the input into XHTML,
    // then MSXML applies an XSL stylesheet to the result.
    public class XHTML2RTF
    {
        MSXML2.FreeThreadedDOMDocument _xslDoc;
        MSXML2.FreeThreadedDOMDocument _xmlDoc;
        MSXML2.IXSLProcessor _xslProcessor;
        MSXML2.XSLTemplate _xslTemplate;
        static XHTML2RTF instance = null;
        static readonly object padlock = new object();
        // Guards the shared document and processor, which hold per-call state.
        readonly object _transformLock = new object();

        XHTML2RTF()
        {
            _xslDoc = new MSXML2.FreeThreadedDOMDocument();
            // XSLData.xhtml2rtf is a resource file
            // containing XSL for transformation
            // I got XSL from here:
            // http://www.codeproject.com/KB/HTML/XHTML2RTF.aspx
            _xslDoc.loadXML(XSLData.xhtml2rtf);
            _xmlDoc = new MSXML2.FreeThreadedDOMDocument();
            _xslTemplate = new MSXML2.XSLTemplate();
            _xslTemplate.stylesheet = _xslDoc;
            _xslProcessor = _xslTemplate.createProcessor();
        }

        public string ConvertToRTF(string xhtmlData)
        {
            try
            {
                // Step 1: let Tidy turn the (possibly sloppy) HTML into well-formed XHTML.
                string sXhtml;
                TidyNet.Tidy tidy = new TidyNet.Tidy();
                tidy.Options.XmlOut = true;
                tidy.Options.Xhtml = true;
                using (MemoryStream input = new MemoryStream(Encoding.UTF8.GetBytes(xhtmlData)))
                using (MemoryStream output = new MemoryStream())
                {
                    TidyNet.TidyMessageCollection messages = new TidyNet.TidyMessageCollection();
                    tidy.Parse(input, output, messages);
                    sXhtml = Encoding.UTF8.GetString(output.ToArray());
                }

                // Step 2: run the XSL transformation. The singleton is shared,
                // so only one caller may use the processor at a time.
                lock (_transformLock)
                {
                    if (!_xmlDoc.loadXML(sXhtml))
                    {
                        throw new InvalidOperationException(
                            "Tidy output could not be parsed: " + _xmlDoc.parseError.reason);
                    }
                    _xslProcessor.input = _xmlDoc;
                    _xslProcessor.transform();
                    return _xslProcessor.output.ToString();
                }
            }
            catch (Exception exc)
            {
                throw new Exception("Error in xhtml conversion. ", exc);
            }
        }

        // Lazily created, thread-safe singleton.
        public static XHTML2RTF Instance
        {
            get
            {
                lock (padlock)
                {
                    if (instance == null)
                    {
                        instance = new XHTML2RTF();
                    }
                    return instance;
                }
            }
        }
    }
}