Building an interpreter for a document format

Posted 2024-08-10 02:41:25

I'm about to start developing my own document format (like PDF, XPS, DOC, RTF...), and I'd like to know where I can find tutorials or how-tos. I don't want ready-made code: this is a project I want to learn to build myself, not one where I lean on someone else's work.

PS: I want it to look like an XML file:

[Command Argument="Define it" Argument2="Something"]

It's like PDF, but this syntax will be interpreted by a program that I will build in C#, just as your browser interprets HTML ;)

Keep in mind that my question is about the program that will interpret this code, but a tutorial on interpreting XML would be a good place to start ;)

Comments (5)

梦里人 2024-08-17 02:41:25

I assume you're doing this for the sake of learning how to do it. If that's the case, it is a worthwhile venture and I understand.

You'll want to start out by learning LL parsers and grammars. That will help you interpret the document that has been read from a file into a document object model (DOM). From there you can create routines to manipulate or render that document tree.
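As a rough illustration of that idea, here is a minimal hand-written LL(1) recursive-descent parser for the bracket syntax from the question. The grammar is an assumption made up for this sketch (`document := element*`, `element := '[' NAME attribute* ']'`, `attribute := NAME '=' STRING`), and the `Element` and `DocParser` names are hypothetical. It builds a flat list of nodes rather than a full tree, and it does not handle nesting or escape sequences.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DOM node: one [Command ...] element with its attributes.
public sealed class Element
{
    public string Name { get; }
    public Dictionary<string, string> Attributes { get; } = new Dictionary<string, string>();
    public Element(string name) { Name = name; }
}

// Recursive descent: one method per grammar rule, and a single character
// of lookahead is enough to decide which rule applies next.
public sealed class DocParser
{
    private readonly string _src;
    private int _pos;

    public DocParser(string source) { _src = source; }

    // document := element*
    public List<Element> ParseDocument()
    {
        var elements = new List<Element>();
        SkipWhitespace();
        while (_pos < _src.Length)
        {
            elements.Add(ParseElement());
            SkipWhitespace();
        }
        return elements;
    }

    // element := '[' NAME attribute* ']'
    private Element ParseElement()
    {
        Expect('[');
        var element = new Element(ParseName());
        SkipWhitespace();
        while (_pos < _src.Length && _src[_pos] != ']')
        {
            // attribute := NAME '=' STRING
            string key = ParseName();
            SkipWhitespace();
            Expect('=');
            SkipWhitespace();
            element.Attributes[key] = ParseString();
            SkipWhitespace();
        }
        Expect(']');
        return element;
    }

    // NAME := letter (letter | digit)*
    private string ParseName()
    {
        int start = _pos;
        if (_pos >= _src.Length || !char.IsLetter(_src[_pos]))
            throw new FormatException($"Expected a name at position {_pos}");
        while (_pos < _src.Length && char.IsLetterOrDigit(_src[_pos])) _pos++;
        return _src.Substring(start, _pos - start);
    }

    // STRING := '"' (any character except '"')* '"'   (no escapes in this sketch)
    private string ParseString()
    {
        Expect('"');
        int start = _pos;
        while (_pos < _src.Length && _src[_pos] != '"') _pos++;
        string value = _src.Substring(start, _pos - start);
        Expect('"');
        return value;
    }

    // Consumes the expected character or reports where parsing failed.
    private void Expect(char c)
    {
        if (_pos >= _src.Length || _src[_pos] != c)
            throw new FormatException($"Expected '{c}' at position {_pos}");
        _pos++;
    }

    private void SkipWhitespace()
    {
        while (_pos < _src.Length && char.IsWhiteSpace(_src[_pos])) _pos++;
    }
}

public static class Program
{
    public static void Main()
    {
        var parser = new DocParser("[Command Argument=\"Define it\" Argument2=\"Something\"]");
        foreach (Element e in parser.ParseDocument())
        {
            Console.WriteLine(e.Name);
            foreach (var pair in e.Attributes)
                Console.WriteLine($"  {pair.Key} = {pair.Value}");
        }
    }
}
```

Each grammar rule maps to exactly one method, which is what makes this style easy to extend: supporting nested elements, for example, would mean adding a rule and a matching method rather than rewriting the parser.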

Good luck!

迷爱 2024-08-17 02:41:25

I'm confused as to what you're asking, but if you need your own format like an XML file, why not just use XML to describe the format?

Edit: Okay, I think I understand now. If you're doing this for fun and for learning (which is great), then there are lots of approaches to take. In fact, it may even be better to skip the research at first: try to come up with a solution on your own, see whether it works, work out what would make it better, and so on.

心头的小情儿 2024-08-17 02:41:25

Sounds like a good learning project and you've got some good pointers here already. I would just add that you should remember that there is a difference between a document file language and a document format.

Consider OOXML: it is a document format built on top of XML (XML being what I'd describe as the file language). If your purpose is to learn about building your own document format, then I'd highly recommend starting with XML so that you don't have to reinvent a language parser. This will let you focus on the concerns around building the format.

That said, good on you if you want to play around with creating your own language; just wanted to make sure you realized that they are different beasts.
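To make the XML-first route concrete, here is a minimal sketch that reads a made-up format layered on XML using `XDocument` from `System.Xml.Linq`. The `<Document>` root, the `Command` element, and the `XmlFormatDemo` and `Describe` names are assumptions for illustration only; the point is that the XML parser handles the file language, so your code only has to deal with what the elements mean.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlFormatDemo
{
    // Hypothetical format: a <Document> root whose children are commands,
    // with the arguments stored as XML attributes.
    public const string Sample =
        "<Document>" +
        "<Command Argument=\"Define it\" Argument2=\"Something\" />" +
        "</Document>";

    // Returns one "Name(key=value, ...)" line per child element of the root.
    public static string[] Describe(string xml)
    {
        // XDocument.Parse does the tokenizing and well-formedness checks.
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Elements()
            .Select(e => e.Name.LocalName + "(" +
                string.Join(", ", e.Attributes()
                    .Select(a => a.Name.LocalName + "=" + a.Value)) + ")")
            .ToArray();
    }

    public static void Main()
    {
        // Prints: Command(Argument=Define it, Argument2=Something)
        foreach (string line in Describe(Sample))
            Console.WriteLine(line);
    }
}
```

Malformed input makes `XDocument.Parse` throw an `XmlException`, so error reporting for the file language also comes for free with this approach.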

Here are some links that will help you get started using XML in C#:

镜花水月 2024-08-17 02:41:25

Far be it from me to forbid you from re-inventing the wheel for the sake of learning something new. Good for you for trying this out. However, if you are going to ask questions about how to do it you are going to need to specify your questions a little more.
Are you looking for help on:

  • Designing your framework / format
  • Planning your time / Estimating deadlines
  • Working with XML
  • Working with C#
  • Building a web-based C# application
  • Building a PC-based C# application
  • Other aspects of development entirely

There are many people here who want to help -- but the best answers are given to focused questions (not necessarily specific, but always focused.)

鼻尖触碰 2024-08-17 02:41:25

There are a couple of ways to approach this. One way would be to define the format of the file first, then use a parser generator to create C# code that can read that format. Doing a Google search on "c# parser generator" will get you links to a number of different libraries you can use.

Alternatively, you could code your own parser, from scratch. This will be more work than using a parser generation tool, but might be more educational in the end.

The define-a-grammar approach may be total overkill for a simple format. Another way to approach the problem is to design the object tree that you'll use in-app first, then write serialization and de-serialization routines to save and load the contents from a file. The serialization interface in C# is pretty flexible, and you can serialize to binary or XML files easily.
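A minimal sketch of the object-tree-first approach, using `XmlSerializer` from `System.Xml.Serialization` for the XML case. The `Document`, `Paragraph`, and `DocumentStore` types are hypothetical and exist only for this example; a real format would have a richer tree. It serializes to a string to stay self-contained, but the same calls accept a `StreamWriter` or `StreamReader` over a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical object tree. XmlSerializer needs public types with a
// parameterless constructor and serializes their public read/write members.
public class Paragraph
{
    [XmlAttribute] public string Style { get; set; } = "Normal";
    [XmlText] public string Text { get; set; } = "";
}

public class Document
{
    public string Title { get; set; } = "";
    public List<Paragraph> Paragraphs { get; set; } = new List<Paragraph>();
}

public static class DocumentStore
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Document));

    // Object tree -> XML text.
    public static string Save(Document doc)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, doc);
            return writer.ToString();
        }
    }

    // XML text -> object tree.
    public static Document Load(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Document)Serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new Document { Title = "Hello" };
        doc.Paragraphs.Add(new Paragraph { Style = "Heading", Text = "First paragraph" });

        string xml = DocumentStore.Save(doc);
        Console.WriteLine(xml);

        // Round trip: the loaded copy carries the same data as the original.
        Document copy = DocumentStore.Load(xml);
        Console.WriteLine(copy.Paragraphs[0].Text);
    }
}
```

With this design the in-memory classes are the format definition: adding a property to `Paragraph` extends the file format without any parsing code changing.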

I think it should be relatively straightforward to create your own serializer to create a file formatted however you like, but MSDN is not being my friend today, so I can't find the relevant documentation.
