What are the design considerations for using a custom markup/structured language vs. XML?
I want a textual interface for some structured data that I want to put into a MySQL table. Currently it is in text form, using the notation below.
I'm trying to understand why I would use XML, where my fields would sit inside XML tags, instead of a "custom markup/structure" that uses /**/, -, and | to denote tables and fields.
I have code that will put this into MySQL and extract it. I just feel like a bit of a hack for using this notation. Later the structured data file will be used for importing and exporting data, kind of like Internet Explorer when you export your bookmarks.
/*Table*/
-
Field 1 | Field 2 | Field 3
-
Field 1 | Field 2 | Field 3
What are the design considerations for using a custom markup language vs XML?
You should use XML because:
Why invent your own? There are over a dozen lightweight markup languages.
EDIT: @Luc M's answer is very good. In general, you (almost) always want to use an existing parser if one is available. Why reinvent the wheel? If you want a simple format, go with CSV, YAML, or JSON. But there's nothing wrong with XML, and there are lots and lots of solid parsers available for it. Most employers care about getting quality software quickly and cheaply, and writing parsers seldom helps that cause.
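To make the "use an existing parser" point concrete, here is a minimal sketch of the question's table exported and re-imported as JSON with Python's standard `json` module. The key names (`table`, `rows`, `field1`, ...) and the cell values are made up for illustration; the question only shows "Table" and "Field 1" to "Field 3".

```python
import json

# Hypothetical layout: the key names and cell values are assumptions,
# not something taken from the question.
table = {
    "table": "Table",
    "rows": [
        {"field1": "a", "field2": "b | with a pipe", "field3": "c"},
        {"field1": "d", "field2": "e", "field3": "f"},
    ],
}


def export_table(data: dict) -> str:
    """Serialize the table to JSON text; quoting and escaping are the library's job."""
    return json.dumps(data, indent=2, ensure_ascii=False)


def import_table(text: str) -> dict:
    """Parse JSON text back into plain dicts and lists."""
    return json.loads(text)


text = export_table(table)
restored = import_table(text)
```

Note that a cell containing `|` survives the round trip without any special handling, which the pipe-delimited notation would have to solve with its own escaping rule.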
What are the considerations?
The favorable things you will get with a do-it-yourself solution:
Parse time: This is only potentially something that you'll get. It's going to be hard for you to beat an optimized parser like RapidXML for reading data. However, your parser will be able to parse directly into your data structures, whereas with a lightweight language-based solution, you must walk the data structure it emits to generate your real data.
Note that it is still possible that a pre-made solution will beat yours, simply because writing an optimized parser is hard. Though there's always Boost.Spirit to help you.
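A rough sketch of that difference, in Python rather than C++ for brevity. It assumes the notation means: `/*name*/` opens a table, a lone `-` separates records, and `|` separates fields; the question does not spell this out. The XML element names (`tables`, `table`, `row`, `field`) and the `Table` class are likewise invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    rows: list = field(default_factory=list)


def parse_custom(text: str) -> list:
    """Hand-rolled parser: builds Table objects directly, line by line."""
    tables = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "-":  # blank line or record separator
            continue
        if line.startswith("/*") and line.endswith("*/"):
            tables.append(Table(name=line[2:-2].strip()))
        elif tables:
            tables[-1].rows.append([cell.strip() for cell in line.split("|")])
        else:
            raise ValueError(f"row before any table header: {line!r}")
    return tables


def parse_xml(text: str) -> list:
    """Library parser: builds a generic tree first, then we walk it."""
    root = ET.fromstring(text)
    tables = []
    for t in root.iter("table"):
        rows = [[f.text or "" for f in r.findall("field")] for r in t.findall("row")]
        tables.append(Table(name=t.get("name", ""), rows=rows))
    return tables


CUSTOM = """\
/*Table*/
-
Field 1 | Field 2 | Field 3
-
Field 1 | Field 2 | Field 3
"""

XML = """\
<tables>
  <table name="Table">
    <row><field>Field 1</field><field>Field 2</field><field>Field 3</field></row>
    <row><field>Field 1</field><field>Field 2</field><field>Field 3</field></row>
  </table>
</tables>
"""

custom_tables = parse_custom(CUSTOM)
xml_tables = parse_xml(XML)
```

Both paths end up with the same `Table` objects. The hand-rolled one skips the intermediate tree, but it is also the one you have to test and maintain yourself.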
That's really all I can think of for advantages for a do-it-yourself solution. If this were data that you were going to get from the user, there could be advantages in error reporting with a self-made solution. But you're talking about data that you will both generate and consume; there is no expectation of hand editing, so error reporting isn't going to be a significant concern.
The things you get from an XML or other lightweight language solution are pretty much covered by the others.
3 reasons:
(a) the XML spec has been carefully written; there are no ambiguities about what is and isn't allowed. Home-grown specs are never as thorough (I've seen hundreds of them, believe me) so you will forever be arguing about whether a particular message is valid or not.
(b) there's a wide choice of conformant and performant XML parsers around - you will never have to worry about writing and testing your own parser. (Parsers for home-grown languages, in my experience, are usually tested on about 5 test messages before going into production, with inevitable consequences.)
(c) there's a whole ecosystem around XML - authoring tools, validators, programming language APIs, security, canonicalization, you name it; plus the skills and knowledge to make it all work.
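As a small illustration of (a) and (b), here is a sketch using Python's standard `xml.etree.ElementTree`. The element names `row` and `field` and the sample value are assumptions made for the example. A conformant parser either accepts a document or rejects it, and the serializer takes care of escaping characters that would otherwise collide with the markup.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Build a record whose value contains characters that are special in XML
# (and a pipe, which would be special in the question's notation).
row = ET.Element("row")
ET.SubElement(row, "field").text = "a | b < c & d"
serialized = ET.tostring(row, encoding="unicode")

# Parsing the serialized text gives the original value back.
round_tripped = ET.fromstring(serialized).find("field").text
```

With this in place, `is_well_formed("<row><field>unclosed</row>")` is rejected by the parser rather than being left open to debate.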
Having said that, for very simple data there may be other formats that work equally well, for example Java property files. But I would steer clear of CSV - there are a zillion different flavours and none of them are properly specified.
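The CSV caveat can be seen with Python's standard `csv` module: the same line of text yields different records depending on which flavour the reader assumes. The sample line and the helper `read_row` are made up for this sketch.

```python
import csv
import io

# One line of "CSV" as a semicolon-delimited exporter might write it.
line = 'Field 1;"Field 2, with comma";Field 3\n'


def read_row(text: str, **fmt) -> list:
    """Parse the first record of text with the given csv format parameters."""
    return next(csv.reader(io.StringIO(text), **fmt))


as_comma = read_row(line)                     # reader assumes comma-delimited
as_semicolon = read_row(line, delimiter=";")  # reader assumes semicolon-delimited
```

Neither reading is an error as far as the reader is concerned, so a mismatch between writer and reader shows up as silently wrong data rather than a parse failure.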