What are the weaknesses of XML?
Reading StackOverflow and listening to the podcasts by Joel Spolsky and Jeff Atwood, I have started to believe that many developers hate using XML, or at least try to avoid it as much as possible for storing or exchanging data.
On the other hand, I enjoy using XML a lot for several reasons:
- XML serialization is implemented in most modern languages and is extremely easy to use,
- Although slower than binary serialization, XML serialization is very useful when the same data is used from several programming languages, or when the data is meant to be read and understood by a human, even if only for debugging (JSON, for example, is more difficult to understand),
- XML supports Unicode and, when used properly, there are no problems with different encodings, characters, etc.,
- There are plenty of tools that make it easy to work with XML data. XSLT is one example, making it easy to present and transform data. XPath is another, making it easy to search for data,
- XML can be stored in some SQL servers, which covers scenarios where data that is too complicated to be easily stored in SQL tables must be saved and manipulated; JSON or binary data, for example, cannot be manipulated through SQL directly (except by manipulating strings, which is crazy in most situations),
- XML does not require any applications to be installed. If I want my app to use a database, I must install a database server first. If I want my app to use XML, I don't have to install anything,
- XML is much more explicit and extensible than, for example, Windows Registry or INI files,
- In most cases, there are no CR-LF problems, thanks to the level of abstraction provided by XML.
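As a small illustration of the XPath point in the list above, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset. The `library` document and its element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
xml_text = """
<library>
  <book lang="en"><title>Dune</title></book>
  <book lang="fr"><title>Candide</title></book>
  <book lang="en"><title>Emma</title></book>
</library>
"""

root = ET.fromstring(xml_text)

# Select the titles of all books whose lang attribute is "en".
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]
print(english_titles)
```

A full XPath 1.0 engine would allow richer queries; the standard library subset is enough for simple attribute filters like this one.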
So, taking into account all the benefits of using XML, why do so many developers hate using it? IMHO, the only problem with it is that:
- XML is too verbose and requires much more space than most other data formats, especially when binary content has to be Base64-encoded.
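To put a rough number on the Base64 part of that complaint, here is a minimal sketch. It assumes a made-up 3000-byte binary payload and measures only the encoding itself, not the surrounding XML tags.

```python
import base64

# Hypothetical binary payload: 3000 zero bytes stand in for real binary data.
payload = bytes(3000)

# Base64 maps every 3 input bytes to 4 output characters.
encoded = base64.b64encode(payload)

overhead = len(encoded) / len(payload)
print(len(payload), len(encoded), round(overhead, 2))
```

So binary data embedded in XML this way grows by about a third before any markup is added.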
Of course, there are many scenarios where XML doesn't fit at all. Storing the questions and answers of SO in an XML file on the server side would be absolutely wrong. Likewise, for storing an AVI video or a bunch of JPG images, XML is the worst thing to use.
But what about other scenarios? What are the weaknesses of XML?
To the people who considered that this question is not a real question:
Unlike questions such as the still-open "Significant new inventions in computing since 1980", my question is very clear: it asks people to explain what weaknesses they experience when using XML and why they dislike it. It does not invite a discussion of, for example, whether XML is good or bad. Nor does it require extended discussion; the answers received so far are short and precise and provide the information I wanted.
But it is a wiki, since there cannot be a single good answer to this question.
According to SO, "not a real question" is a question where "It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, or rhetorical and cannot be reasonably answered in its current form."
- What is being asked here: I think the question itself is very clear, and the several paragraphs of text above make it even clearer,
- This question is ambiguous, vague, incomplete: again, there is nothing ambiguous, vague, or incomplete about it,
- or rhetorical: it is not: the answer to my question is not something obvious,
- and cannot be reasonably answered: several people already gave great answers to the question, showing that the question can be answered reasonably.
It also seems quite obvious how to rate the answers and determine the accepted one. If an answer gives good reasons for what is wrong with XML, chances are it will be voted up and then accepted.
Some weaknesses:
I'm not the right person to ask, as I am a big fan of XML myself. However, I can tell you one of the main complaints that I have heard:
It is hard to work with. Here, "hard" means that you need to know an API and that you have to write a fair amount of code to parse your XML. While I wouldn't say it's really all that hard, I can only agree that a format made to describe objects is easier to access from a language that supports dynamically created objects.
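A minimal sketch of what that complaint looks like in practice, using Python's standard library. The `person` record and its field names are invented for the example; the point is only the difference in how much glue code each format needs.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record in both formats.
xml_text = "<person><name>Ada</name><langs><lang>C</lang><lang>Python</lang></langs></person>"
json_text = '{"name": "Ada", "langs": ["C", "Python"]}'

# XML: walk the tree through the parser API and rebuild the structure by hand.
root = ET.fromstring(xml_text)
from_xml = {
    "name": root.findtext("name"),
    "langs": [node.text for node in root.findall("langs/lang")],
}

# JSON: the parser already returns native dicts and lists.
from_json = json.loads(json_text)

print(from_xml == from_json)
```

The XML side also requires knowing the document's shape in advance (which children are lists, which are scalars), whereas the JSON structure carries that information itself.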
I think that, in general, the reaction is simply because XML is overused.
However, if there is one thing about XML that I hate with a passion, it is namespaces. The productivity lost to namespace problems is horrific.
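One typical way namespaces bite, sketched with Python's standard `xml.etree.ElementTree`. The tiny Atom-style document is made up for the example; only the namespace URI is the real Atom one.

```python
import xml.etree.ElementTree as ET

# A default namespace on the root silently applies to every child element.
xml_text = '<feed xmlns="http://www.w3.org/2005/Atom"><title>Example</title></feed>'
root = ET.fromstring(xml_text)

# The obvious lookup finds nothing, because the element's real name
# is "{http://www.w3.org/2005/Atom}title", not "title".
naive = root.find("title")

# The lookup only succeeds once the namespace is spelled out via a prefix map.
ns = {"atom": "http://www.w3.org/2005/Atom"}
qualified = root.find("atom:title", ns)

print(naive, qualified.text)
```

Nothing fails loudly here: the naive query just returns `None`, which is a large part of why this kind of bug costs so much time.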
XML descends from SGML, the great-granddaddy of markup languages. The purpose of SGML and by extension XML is to annotate text. XML does this well and has a wide range of tools that increase its facility for a variety of applications.
The problem, as I see it, is that XML is frequently used, not to annotate text, but to represent structured data, which is a subtle but important difference. In practical terms, structured data needs to be concise for a variety of reasons. Performance is an obvious one, especially when bandwidth is limited. This is probably one of the main reasons why JSON is so popular for web applications. Concise data structure representation on the wire means better scalability.
Unfortunately, JSON is not very readable without extra whitespace padding, which is almost always omitted. On the other hand, if you have ever tried editing a large XML file using a command-line editor, it can be very awkward as well.
Personally, I find that YAML strikes a nice balance between the two extremes. Compare the following (copied from yaml.org with minor changes).
YAML:
XML:
They both represent the same data, but the YAML is over 30% smaller and arguably more readable. Which would you prefer to have to modify with a text editor? There are many libraries available to parse and emit YAML (e.g. SnakeYAML for Java developers).
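For a rough feel of the size difference, here is a minimal sketch that compares a small hand-written record in both notations. The `server` record is made up for this illustration and is not the yaml.org sample; the exact saving depends entirely on the data and on how the XML is laid out.

```python
import xml.etree.ElementTree as ET

# Made-up record, written by hand in both notations.
yaml_text = """\
server:
  host: example.org
  port: 8080
  tags:
    - web
    - staging
"""

xml_text = """\
<server>
  <host>example.org</host>
  <port>8080</port>
  <tags>
    <tag>web</tag>
    <tag>staging</tag>
  </tags>
</server>
"""

# Sanity check that the XML is well-formed and holds the same host value.
host = ET.fromstring(xml_text).findtext("host")

# Fraction of characters saved by the YAML form relative to the XML form.
saving = 1 - len(yaml_text) / len(xml_text)
print(host, len(yaml_text), len(xml_text), f"{saving:.0%}")
```

Most of the difference comes from closing tags, which repeat every element name a second time.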
As with everything, the right tool for the right job is the best rule to follow.
My favorite nasty problem is with XML serialization formats that use attributes - like XAML.
This works:
This doesn't:
XAML deserialization assigns property values as they're read from the XML stream. So in the second example, when the SelectedItem property is assigned, the control's ItemsSource hasn't been set yet, and the SelectedItem property is being assigned to an item that the control doesn't yet know exists.
If you're using Visual Studio to create your XAML files, everything will be cool, because Visual Studio maintains the ordering of attributes. But modify your XAML in some XML tool that believes the XML recommendation when it says that the ordering of attributes is not significant, and boy are you in a world of hurt.
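The mechanism can be imitated outside of XAML. The sketch below is not real XAML or WPF: `ListBoxModel` is a hypothetical stand-in whose setters run in the order the attributes appear, and the comma-separated `ItemsSource` value is an invention for the example. It relies on Python's `xml.etree.ElementTree` keeping attributes in document order, which is the behavior of current CPython versions.

```python
import xml.etree.ElementTree as ET


class ListBoxModel:
    """Hypothetical control whose property setters run in assignment order."""

    def __init__(self):
        self.items = []
        self.selected = None

    def set_attribute(self, name, value):
        if name == "ItemsSource":
            self.items = value.split(",")
        elif name == "SelectedItem":
            # The selection only sticks if the item is already known.
            self.selected = value if value in self.items else None


def load(xml_text):
    element = ET.fromstring(xml_text)
    model = ListBoxModel()
    # Apply attributes one by one, in the order they appear in the document.
    for name, value in element.attrib.items():
        model.set_attribute(name, value)
    return model


works = load('<ListBox ItemsSource="a,b,c" SelectedItem="b"/>')
broken = load('<ListBox SelectedItem="b" ItemsSource="a,b,c"/>')

print(works.selected, broken.selected)
```

Both documents are equivalent as far as XML is concerned, yet the second one loses the selection, which is exactly the trap when a generic XML tool reorders attributes.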