What are the weaknesses of XML?

Posted on 2024-09-16 08:26:35

Reading Stack Overflow and listening to the podcasts by Joel Spolsky and Jeff Atwood, I am starting to believe that many developers hate using XML, or at least try to avoid it as much as possible for storing or exchanging data.

On the other hand, I enjoy using XML a lot for several reasons:

  • XML serialization is implemented in most modern languages and is extremely easy to use,
  • Although slower than binary serialization, XML serialization is very useful when the same data is used from several programming languages, or when it is meant to be read and understood by a human, even just for debugging (JSON, for example, is more difficult to understand),
  • XML supports Unicode, and when used properly, there are no problems with different encodings, characters, etc.,
  • There are plenty of tools which make it easy to work with XML data. XSLT is one example, making it easy to present and transform data. XPath is another, making it easy to search for data,
  • XML can be stored in some SQL servers, which enables scenarios where data that is too complicated to be easily stored in SQL tables must be saved and manipulated; JSON or binary data, for example, cannot be manipulated through SQL directly (except by manipulating strings, which is crazy in most situations),
  • XML does not require any applications to be installed. If I want my app to use a database, I must install a database server first. If I want my app to use XML, I don't have to install anything,
  • XML is much more explicit and extensible than, for example, the Windows Registry or INI files,
  • In most cases, there are no CR-LF problems, thanks to the level of abstraction provided by XML.
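As a rough illustration of the XPath point above, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited XPath subset. The library/book document is made up for the example and is not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring("""
<library>
  <book year="1999"><title>First</title></book>
  <book year="2008"><title>Second</title></book>
</library>
""")

# ElementTree supports a limited XPath subset, including attribute predicates.
titles = [t.text for t in doc.findall("./book[@year='2008']/title")]
print(titles)  # ['Second']
```

A full XPath engine offers far more (axes, functions), but even this subset shows how a query replaces hand-written tree walking.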

So, taking into account all the benefits of using XML, why do so many developers hate using it? IMHO, the only problem with it is that:

  • XML is too verbose and requires much more space than most other forms of data, especially when it comes to Base64 encoding.
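To put a number on the Base64 part of that complaint, here is a small sketch. The 3000-byte random payload is just a stand-in for binary data such as an image; the roughly one-third growth comes from Base64 itself (4 output characters per 3 input bytes), before any XML markup is added around it.

```python
import base64
import os

payload = os.urandom(3000)            # stand-in for binary data such as an image
encoded = base64.b64encode(payload)   # what would be placed inside an XML element

overhead = len(encoded) / len(payload) - 1
print(f"{len(payload)} bytes -> {len(encoded)} bytes ({overhead:.0%} larger)")
# 3000 bytes -> 4000 bytes (33% larger)
```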

Of course, there are many scenarios where XML doesn't fit at all. Storing the questions and answers of SO in an XML file on the server side would be absolutely wrong. Likewise, for storing an AVI video or a bunch of JPG images, XML is the worst thing to use.

But what about other scenarios? What are the weaknesses of XML?


To those who consider that this is not a real question:

Contrary to questions like the non-closed "Significant new inventions in computing since 1980", my question is very clear and explicitly invites people to explain what weaknesses they experience when using XML and why they dislike it. It does not invite discussion of, for example, whether XML is good or bad. Nor does it require extended discussion; the answers received so far are short and precise and provide the information I wanted.

But it is a wiki, since there cannot be a single good answer to this question.

According to SO, "not a real question" is a question where "It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, or rhetorical and cannot be reasonably answered in its current form."

  • What is being asked here: I think the question itself is very clear, and the several paragraphs of text above make it even clearer,
  • This question is ambiguous, vague, incomplete: again, there is nothing ambiguous, vague, or incomplete about it,
  • or rhetorical: it is not; the answer to my question is not obvious,
  • and cannot be reasonably answered: several people have already given great answers, showing that the question can be answered reasonably.

It also seems quite obvious how to rate the answers and determine the accepted one. If an answer gives good reasons for what's wrong with XML, there is a good chance that it will be voted up, then accepted.

Comments (6)

半边脸i 2024-09-23 08:26:35
<xml>
    <noise>
        The
    </noise>
    <adjective>
        main
    </adjective>
    <noun>
        weakness
    </noun>
    <noise>
        of
    </noise>
    <subject>
        XML
    </subject>
    <noise>
        ,
    </noise>
    <whocares>
        in my opinion
    </whocares>
    <noise>
        ,
    </noise>
    <wildgeneralisation>
        is its verbosity
    </wildgeneralisation>
    <noise>
        .
    </noise>
</xml>
美胚控场 2024-09-23 08:26:35

Some weaknesses:

  • It is somewhat difficult to associate XML files with external resources, which is why the new Office document formats use a zip envelope that bundles a skeleton XML file together with resource files. The other option, Base64 encoding, is very verbose and doesn't allow good random access, which brings one to the next point:
  • Random access is difficult. Neither of the two traditional modes of reading an XML file (constructing a DOM or forward-only SAX-style reading) allows for truly random access.
  • Concurrent write access to different parts of the file is difficult, which is why using XML in Windows executable manifests is error-prone.
  • What encoding does an XML file use? Strictly speaking, you guess the encoding first, then read the file and verify that the guess was right.
  • It is difficult to version portions of a file. Therefore, if you want to provide granular versioning, you need to split your data. This is not just a file format issue, but also a consequence of tools generally providing per-file semantics: version control tools, sync tools like Dropbox, etc.
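The random-access point can be sketched with Python's standard xml.etree.ElementTree.iterparse, a forward-only reader in the same spirit as SAX. The items/item document and the find_item helper are hypothetical, made up for this illustration; the point is only that reaching the last record means parsing every record before it.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document with 1000 records, built in memory for the demo.
xml_bytes = b"<items>" + b"".join(
    b"<item id='%d'/>" % i for i in range(1000)
) + b"</items>"


def find_item(data: bytes, wanted_id: str) -> int:
    """Return how many <item> elements had to be parsed to reach the match."""
    seen = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag != "item":
            continue
        seen += 1
        if elem.get("id") == wanted_id:
            return seen
        elem.clear()  # free memory for records we have already passed
    raise KeyError(wanted_id)


print(find_item(xml_bytes, "0"))    # 1
print(find_item(xml_bytes, "999"))  # 1000
```

There is no way to seek straight to record 999 without an external index, because element boundaries are only known after the preceding text has been parsed.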
空宴 2024-09-23 08:26:35

I'm not the right person to ask, as I am a big fan of xml myself. However, I can tell you one of the main complaints that I have heard:

It is hard to work with. Here, hard means that you have to know an API and write a relatively large amount of code to parse your XML. While I wouldn't say that it's really all that hard, I can only agree that a format made to describe objects is easier to consume from a language that supports dynamically created objects.
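A small sketch of that complaint, using only Python's standard library. The person record is made up for the example. With XML you walk the tree through an API and convert types yourself; with JSON the parser returns native objects directly.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, expressed in both formats.
xml_text = "<person><name>Ada</name><age>36</age></person>"
json_text = '{"name": "Ada", "age": 36}'

# XML: walk the tree through the API and convert types by hand.
root = ET.fromstring(xml_text)
person_from_xml = {
    "name": root.findtext("name"),
    "age": int(root.findtext("age")),
}

# JSON: the parser hands back native dicts, strings, and numbers directly.
person_from_json = json.loads(json_text)

print(person_from_xml == person_from_json)  # True
```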

_失温 2024-09-23 08:26:35

I think in general the reaction is simply because XML is overused.

However, if there is one thing I hate about XML with a passion, it is namespaces. The lost productivity around namespace problems is horrific.
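One common form of the namespace problem, sketched with Python's standard xml.etree.ElementTree (the Atom feed snippet is just a convenient example, not something from this answer): a default namespace silently changes every element's real name, so the obvious lookup finds nothing.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom"><title>News</title></feed>'
)

# The obvious lookup fails: the element's real name is {namespace}title.
print(doc.find("title"))  # None

# The lookup only works once the namespace is spelled out via a prefix map.
ns = {"atom": "http://www.w3.org/2005/Atom"}
print(doc.find("atom:title", ns).text)  # News
```

No error is raised in the first case, which is a large part of why these bugs cost so much time.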

柒七 2024-09-23 08:26:35

XML descends from SGML, the great-granddaddy of markup languages. The purpose of SGML and by extension XML is to annotate text. XML does this well and has a wide range of tools that increase its facility for a variety of applications.

The problem, as I see it, is that XML is frequently used, not to annotate text, but to represent structured data, which is a subtle but important difference. In practical terms, structured data needs to be concise for a variety of reasons. Performance is an obvious one, especially when bandwidth is limited. This is probably one of the main reasons why JSON is so popular for web applications. Concise data structure representation on the wire means better scalability.
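As a rough check of the conciseness claim, the sketch below serializes one record both ways with Python's standard library and compares lengths. The record reuses one product from the invoice example further down; the exact numbers depend on the data and element names, so treat this as an illustration rather than a benchmark.

```python
import json
import xml.etree.ElementTree as ET

record = {"sku": "BL394D", "quantity": 4, "description": "Basketball", "price": 450.0}

# Compact JSON, as it would typically travel over the wire.
as_json = json.dumps(record, separators=(",", ":"))

# Element-per-field XML for the same record.
product = ET.Element("product")
for key, value in record.items():
    ET.SubElement(product, key).text = str(value)
as_xml = ET.tostring(product, encoding="unicode")

print(len(as_json), len(as_xml))
```

The gap comes mostly from XML repeating each field name in a closing tag.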

Unfortunately, JSON is not very readable without extra whitespace padding, which is almost always omitted. On the other hand, if you have ever tried editing a large XML file using a command-line editor, it can be very awkward as well.

Personally, I find that YAML strikes a nice balance between the two extremes. Compare the following (copied from yaml.org with minor changes).

YAML:

invoice:
  number: 34843
  date: 2001-01-23
  billto: &id001
    given: Chris
    family: Dumars
    address:
      lines: |
        458 Walkman Dr.
        Suite #292
      city: Royal Oak
      state: MI
      postal: 48046
  shipto: *id001
  product:
  - sku: BL394D
    quantity: 4
    description: Basketball
    price: 450.00
  - sku: BL4438H
    quantity: 1
    description: Super Hoop
    price: 2392.00
  tax: 251.42
  total: 4443.52
  comments: >
    Late afternoon is best.
    Backup contact is Nancy
    Billsmer @ 338-4338.

XML:

<invoice>
   <number>34843</number>
   <date>2001-01-23</date>
   <billto id="id001">
      <given>Chris</given>
      <family>Dumars</family>
      <address>
        <lines>
          458 Walkman Dr.
          Suite #292
        </lines>
        <city>Royal Oak</city>
        <state>MI</state>
        <postal>48046</postal>
      </address>
   </billto>
   <shipto xref="id001" />
   <products>
      <product>
        <sku>BL394D</sku>
        <quantity>4</quantity>
        <description>Basketball</description>
        <price>450.00</price>
      </product>
      <product>
        <sku>BL4438H</sku>
        <quantity>1</quantity>
        <description>Super Hoop</description>
        <price>2392.00</price>
      </product>
   </products>
   <tax>251.42</tax>
   <total>4443.52</total>
   <comments>
    Late afternoon is best. Backup contact is Nancy Billsmer @ 338-4338
   </comments>
</invoice>

They both represent the same data, but the YAML is over 30% smaller and arguably more readable. Which would you prefer to have to modify with a text editor? There are many libraries available to parse and emit YAML (e.g. SnakeYAML for Java developers).

As with everything, the right tool for the right job is the best rule to follow.

唠甜嗑 2024-09-23 08:26:35

My favorite nasty problem is with XML serialization formats that use attributes - like XAML.

This works:

<ListBox ItemsSource="{Binding Items}" SelectedItem="{Binding CurrentSelection}"/>

This doesn't:

<ListBox SelectedItem="{Binding CurrentSelection}" ItemsSource="{Binding Items}"/>

XAML deserialization assigns property values as they're read from the XML stream. So in the second example, when the SelectedItem property is assigned, the control's ItemsSource hasn't been set yet, and SelectedItem is being assigned an item that isn't yet known to exist.

If you're using Visual Studio to create your XAML files, everything will be cool, because Visual Studio maintains the ordering of attributes. But modify your XAML in some XML tool that believes the XML recommendation when it says that the ordering of attributes is not significant, and boy, are you in a world of hurt.
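The failure mode can be imitated outside XAML. The sketch below is a simulation in Python, not the real WPF ListBox or XAML loader: the ListBox class, set_property, load, and the comma-separated item list are all hypothetical. It assumes a recent Python 3 (3.8 or later), where ElementTree exposes attributes in document order, and assigns each property as it is encountered.

```python
import xml.etree.ElementTree as ET


class ListBox:
    """Hypothetical stand-in for a control; not the real WPF ListBox."""

    def __init__(self):
        self.items_source = []
        self.selected_item = None

    def set_property(self, name, value):
        if name == "ItemsSource":
            self.items_source = value.split(",")
        elif name == "SelectedItem":
            # The selection only sticks if the item is already known.
            self.selected_item = value if value in self.items_source else None


def load(xml_text):
    elem = ET.fromstring(xml_text)
    box = ListBox()
    # Assign properties in document order, as the answer describes.
    for name, value in elem.attrib.items():
        box.set_property(name, value)
    return box


good = load('<ListBox ItemsSource="a,b,c" SelectedItem="b"/>')
bad = load('<ListBox SelectedItem="b" ItemsSource="a,b,c"/>')
print(good.selected_item, bad.selected_item)  # b None
```

Both inputs are equivalent as far as XML is concerned, yet the loaded objects differ, which is exactly why a tool that reorders attributes can break such a format.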
