Lightweight rich-text XML format?

Posted 2024-07-04 15:12:35 · 558 words · 9 views · 0 comments

I am writing a basic word processing application and am trying to settle on a native "internal" format, the one that my code parses in order to render to the screen. I'd like this to be XML so that I can, in the future, just write XSLT to convert it to ODF or XHTML or whatever.

When searching for existing standards to use, the only one that looks promising is ODF. But that looks like massive overkill for what I need. All I need is paragraph tags, font selection, font size & decoration...that's pretty much it. It would take me a long time to implement even a minimal ODF renderer, and I'm not sure it's worth the trouble.

Right now I'm thinking of making my own XML format, but that's not really good practice. Better to use a standard, especially since then I can probably find the XSLTs I might need in the future already written.

Or should I just bite the bullet and implement ODF?

EDIT: Regarding the Answer

I knew about XSL-FO before, but due to the weight of the spec hadn't really considered it. But you're right, a subset would give me everything I need to work with and room to grow. Thanks so much for the reminder.

Plus, by including a rendering library like FOP or RenderX, I get PDF generation for free. Not bad...

Comments (5)

若有似无的小暗淡 2024-07-11 15:12:35

As you are sure about needing to represent the presentational side of things, it may be worth looking at the XSL-FO W3C Recommendation. This is a full-blown page description language and the (deeply unfashionable) other half of the better-known XSLT.

Clearly the whole thing is anything but "lightweight", but you could incorporate just a very limited subset, which could even be as small as (to match your spec of "paragraph tags, font selection, font size & decoration") fo:block and the common font properties, something like:

<yourcontainer xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <fo:block font-family="Arial, sans-serif" font-weight="bold"
        font-size="16pt">Example Heading</fo:block>
    <fo:block font-family="Times, serif"
        font-size="12pt">Paragraph text here etc etc...</fo:block>
</yourcontainer>

This would perhaps have a few advantages over just rolling your own. There's an open specification to work from, and all that implies. It reuses CSS properties as XML attributes (in a similar manner to SVG), so many of the formatting details will seem somewhat familiar. You'd have an upgrade path if you later decided that, say, intelligent paging was a must-have feature - including more sections of the spec as they become relevant to your application.
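Because the font properties are plain CSS names, mapping this subset to XHTML is mostly a matter of copying attributes into a `style` attribute. Here is a minimal sketch of that idea in Python, standing in for the XSLT the question has in mind. The function name `fo_blocks_to_xhtml` and the `FONT_PROPS` list are my own assumptions for illustration, and it only looks at `fo:block` elements directly under the container.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"

# Assumed subset of XSL-FO font properties; each one is also a valid CSS property.
FONT_PROPS = ("font-family", "font-size", "font-weight", "font-style", "text-decoration")


def fo_blocks_to_xhtml(source: str) -> str:
    """Turn each top-level fo:block into a <p> with an inline CSS style."""
    root = ET.fromstring(source)
    body = ET.Element("body")
    for block in root.findall(f"{{{FO_NS}}}block"):
        p = ET.SubElement(body, "p")
        # Copy only the properties that are present on the block.
        style = "; ".join(
            f"{name}: {block.get(name)}"
            for name in FONT_PROPS
            if block.get(name) is not None
        )
        if style:
            p.set("style", style)
        # Flatten any nested inline content to plain text in this sketch.
        p.text = "".join(block.itertext())
    return ET.tostring(body, encoding="unicode")
```

Feeding it the two-block example above should give two `<p>` elements whose `style` attributes carry the same font settings. Nested inline formatting is deliberately flattened here; handling it would need a recursive walk.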

There's one other thing you might get from investigating XSL-FO - seeing how even just-doing-paragraphs-and-fonts can be horrendously complicated. Trying to do text layout and line breaking 'The Right Way' for various different languages and use cases seems very daunting to me.

浮萍、无处依 2024-07-11 15:12:35

XML is an external format, not internal.

What's wrong with XHTML? It's simple and it's ubiquitous (at least HTML is). Your implementation would be easy to debug, and your users will be eternally grateful.

穿越时光隧道 2024-07-11 15:12:35

Well, right... But since I need to be able to convert to XML anyway, why hold both my document tree and the DOM tree in memory, when there's nothing preventing me from working right off the DOM tree?

Particularly since one unique feature of my program is that everything is always saved as you type, and I don't want to run a whole conversion to XML every time I hit a key. Easier just to tie input and output directly to my in-memory DOM tree.
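To make the "work right off the DOM tree" idea concrete, here is a rough sketch using Python's `xml.dom.minidom`. The helper names (`new_document`, `add_paragraph`, `type_char`, `save`) are hypothetical, and the `fo:block` container layout is borrowed from the XSL-FO answer rather than from my actual program. Each keystroke mutates a text node in place, and saving is just a serialization of the tree that already exists.

```python
from xml.dom import minidom

FO_NS = "http://www.w3.org/1999/XSL/Format"


def new_document():
    """Create an empty in-memory document with the fo prefix declared."""
    return minidom.parseString(f'<yourcontainer xmlns:fo="{FO_NS}"/>')


def add_paragraph(doc, **font_props):
    """Append an empty fo:block; keyword names use '_' in place of '-'."""
    block = doc.createElementNS(FO_NS, "fo:block")
    for name, value in font_props.items():
        block.setAttribute(name.replace("_", "-"), value)
    # Keep one text node per block so typing has something to append to.
    block.appendChild(doc.createTextNode(""))
    doc.documentElement.appendChild(block)
    return block


def type_char(block, ch):
    """Handle a keystroke by editing the DOM text node directly."""
    block.firstChild.data += ch


def save(doc):
    """Serialize the live tree; there is no separate document model to convert."""
    return doc.documentElement.toxml()
```

For example, calling `add_paragraph(doc, font_size="12pt")` and then `type_char` for each typed character leaves the tree ready to be written out by `save(doc)` at any point. A real editor would also need cursor positions and deletions, which this sketch ignores.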

Edit:
Oh, and the only problem with XHTML is that I do want to support basic pagination. Though I guess there's nothing stopping me from using some additional tags for that...

淤浪 2024-07-11 15:12:35

I like DocBook, but it doesn't really fit. It strives to be presentation-independent, the intention being that you would use XSLT to render it to a presentation format.

In a word processor, the user is editing presentation along with the content. For example, the user doesn't necessarily want to mark a "keyword"; they want to make some text bold.

A DocBook editor would be a very nice thing (I'm not sure a good one exists), but it's not really what I'm doing.
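As an illustration of what "make some text bold" looks like in a presentational format, here is a small sketch that wraps a character range in an `fo:inline` with `font-weight="bold"`. The function name `make_bold` is hypothetical, and the sketch assumes the block holds only plain text with no existing inline children.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def make_bold(block, start, end):
    """Wrap block text[start:end] in <fo:inline font-weight="bold">."""
    if len(block):
        raise ValueError("sketch only handles blocks without child elements")
    text = block.text or ""
    # Text before the selection stays as the block's leading text.
    block.text = text[:start]
    inline = ET.SubElement(block, f"{{{FO_NS}}}inline", {"font-weight": "bold"})
    inline.text = text[start:end]
    # Text after the selection becomes the tail of the inline element.
    inline.tail = text[end:]
    return inline
```

There is no semantic label anywhere in the result, only the formatting the user asked for, which is the difference from DocBook's approach.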

假面具 2024-07-11 15:12:35

If it's only for word processing, then perhaps DocBook might be a little lighter than ODF?

However, the wiki entry states:

DocBook is a semantic markup language for technical documentation. It was originally intended for writing technical documents related to computer hardware and software but it can be used for any other sort of documentation.

So it might not be so suitable for a general-purpose word-processor?

The advantage of using DocBook would be the fact that a number of DocBook -> other format converters should be available? Hope this helps.
