What are the pros and cons of an XML parser written completely in Perl?

Posted on 2024-11-25 12:53:41

What are the pros and cons of an XML parser written completely in Perl, compared to one that only provides an interface to a parser such as GNOME's LibXML?

Comments (3)

故人如初 2024-12-02 12:53:41

Pro: More portable and easier to install. You don't need to compile pure Perl modules against C libraries, they run anywhere the perl binary runs, and you don't need a C compiler or the development libraries to link against if you want to build on an arbitrary machine.

Con: Pure Perl (or any dynamic language, for that matter) is slow compared to heavily optimised C.
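The installation trade-off can be checked on a given machine before committing to either kind of parser. The following is only a minimal sketch: the helper name `have_perl_module` is hypothetical, and the module list is just the parsers mentioned in this thread.

```shell
#!/bin/sh
# have_perl_module is a hypothetical helper, not part of any distribution.
# It succeeds only if perl can load the named module on this machine.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# XML::LibXML and XML::Parser need compiled C libraries (libxml2, expat);
# XML::SAX::PurePerl needs nothing beyond perl itself.
for mod in XML::LibXML XML::Parser XML::SAX::PurePerl; do
    if have_perl_module "$mod"; then
        echo "$mod: available"
    else
        echo "$mod: not installed"
    fi
done
```

On a machine without a C toolchain, a missing C-backed module has to come from a prebuilt package, while a missing pure Perl module can simply be copied into place.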

恬淡成诗 2024-12-02 12:53:41

There are actually a few, XML::SAX::PurePerl and XML::Parser::Lite for example, but both are slow and incomplete: the last time I checked, XML::SAX::PurePerl still had some bugs, although you probably would not run into them in real life, and XML::Parser::Lite is only supposed to parse the subset of XML used by SOAP.

In any case, most systems come with either expat or libxml2, so in practice relying on external libraries doesn't seem to be much of a problem. Even on Windows, expat is included in both ActiveState Perl and Strawberry Perl.

An XML parser is actually quite complex (you need to parse XML, but also DTDs, deal with entities...), so there is no need to re-invent this specific wheel, apart from entertainment^Wlearning purposes.

奢望 2024-12-02 12:53:41

As mirod pointed out, incompleteness is sometimes among the cons.

If you have to validate XML against an XSD schema, things may get worse.
There are modules that try to address this problem, such as XML::Compile or XML::Pastor, but you have to be tolerant about speed.

To give you some numbers, I'll talk about a Perl program I wrote to validate XML data that has to be schema-compliant. My program uses XML::Parser and MooseX::Types, and can take up to 10 seconds to validate a 5 MiB XML file.

On the other hand,

xmllint --schema /path_to/schema.xsd data.xml

does the same task in a fraction of a second.
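For scripting that check, a small wrapper around the same xmllint call might look like the sketch below. It is illustrative only: the function name `validate_xml` and the tiny sample schema and documents are made up for the demonstration and have nothing to do with the 5 MiB file mentioned above.

```shell
#!/bin/sh
# validate_xml is a hypothetical wrapper around the xmllint call shown above.
# It returns xmllint's exit status: 0 means the document is schema-valid.
validate_xml() {
    schema=$1
    data=$2
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found (it ships with libxml2)" >&2
        return 127
    fi
    # --noout suppresses the dump of the parsed document;
    # diagnostics go to stderr and are discarded here.
    xmllint --noout --schema "$schema" "$data" 2>/dev/null
}

# Made-up sample files, created only for this demonstration.
workdir=$(mktemp -d)
cat > "$workdir/schema.xsd" <<'EOF'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="count" type="xs:integer"/>
</xs:schema>
EOF
printf '<count>42</count>\n' > "$workdir/good.xml"
printf '<count>forty-two</count>\n' > "$workdir/bad.xml"

for f in good.xml bad.xml; do
    if validate_xml "$workdir/schema.xsd" "$workdir/$f"; then
        echo "$f: valid"
    else
        echo "$f: rejected or not checked"
    fi
done
```

Running the xmllint line under `time` is a simple way to compare it with a Perl-level validator on your own data.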

I got some speed (and memory) back by using VTD-XML in the parsing phase, but I still have to validate the data with Perl, because VTD-XML isn't a validating parser (yet).
