How do I convert RTF text to Markdown-syntax plain text in Cocoa?
I need to be able to convert RTF or HTML to Markdown-syntaxed plain text for uploading to my server. I need to achieve this in Cocoa/Obj-C 2.0. Does anyone know how to do this?
Thanks so much, Alex.
Edited Thu 4:53 PM
Umm. In answer to Yuji's comment, I'm trying to make an NSStatusItem droplet that accepts text. It doesn't matter what format the text is in, but I need to be able to format it either as plain text or as plain text formatted with Markdown. I guess since I don't know what kind of text I'll be receiving...
Comments (3)
Here are the formats pandoc parses and writes:
Unfortunately RTF isn't one of the formats it parses. It is a Haskell program, so it isn't convenient to get it without installing the Haskell Platform. From a parsed document, it can write a sort of 'plain' sub-Markdown, or standard Markdown, or its own enriched Markdown, as well as a pile of other formats. The internal ('native') representation is much richer than the standard Markdown spec requires, so less information will be lost, and you will be able to recover the HTML for your Markdown, or make a PDF via LaTeX, etc. It is fairly easy to hack at it for special purposes.
I don't know if any of them are stable, but there is an increasing number of bindings to the Pandoc libraries from other languages. A search of GitHub suggests that the most relevant-looking one for hooking up with Obj-C is the plain C libpandoc. Ruby seems to have the most activity (I guess because it's GitHub), with pandoku, pandoc-ruby, rails-pandoc and so forth.
Oooph, this is going to be tricky. As Yuji said, you can express a lot more in HTML/RTF than in markdown. That being the case...
I'd convert the content into an NSAttributedString. You can easily construct an NSAttributedString from RTF data; HTML will be much more difficult. Once you do that, however, it'll be a matter of inspecting all the attributes on the string and applying the equivalent Markdown to a plaintext version of the content.
Researching a bit more:
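To make the "inspect the attributes and apply the equivalent Markdown" step a bit more concrete, here is a rough, language-neutral sketch of that logic. It is written in Python only for brevity; it is not Cocoa code and not something from the original answer. It assumes the attributed string has already been flattened into a list of (text, attributes) runs, which is roughly what you would get by enumerating attribute runs on an NSAttributedString. The attribute keys "bold", "italic" and "link" are hypothetical stand-ins for whatever you derive from the font traits and the link attribute of each run.

```python
from typing import Iterable, Mapping, Tuple

# One run = a stretch of text sharing the same attributes.
# The keys "bold", "italic" and "link" are made up for this sketch.
Run = Tuple[str, Mapping[str, object]]


def wrap(text: str, marker: str) -> str:
    """Wrap the non-whitespace core of text in marker.

    Leading/trailing whitespace stays outside the markers, because
    Markdown emphasis does not render when the markers touch spaces.
    """
    core = text.strip()
    if not core:
        return text  # whitespace-only run: nothing to emphasise
    start = text.index(core)
    lead, trail = text[:start], text[start + len(core):]
    return f"{lead}{marker}{core}{marker}{trail}"


def runs_to_markdown(runs: Iterable[Run]) -> str:
    """Turn attribute runs into a single Markdown string."""
    out = []
    for text, attrs in runs:
        piece = text
        if attrs.get("italic"):
            piece = wrap(piece, "*")
        if attrs.get("bold"):
            piece = wrap(piece, "**")
        link = attrs.get("link")
        if link:
            piece = f"[{piece}]({link})"
        out.append(piece)
    return "".join(out)


sample = [
    ("Hello ", {}),
    ("bold ", {"bold": True}),
    ("and ", {}),
    ("linked", {"link": "http://example.com", "italic": True}),
    (".", {}),
]
print(runs_to_markdown(sample))
# prints: Hello **bold** and [*linked*](http://example.com).
```

This only covers bold, italic and links. It does not escape characters that are already special in Markdown, and it ignores paragraph-level attributes such as headings and lists, which is where most of the real work (and the information loss Yuji mentioned) would be.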
There's an online form that does just this: MarkItDown