Should I use XPath or just DOM?

Posted 2024-10-20 04:48:10

I have a bunch of hierarchical data stored in an XML file. I am wrapping that up behind hand-crafted classes using TinyXML. Given an XML fragment that describes a source signature as a set of (frequency, level) pairs a bit like this:

<source>
  <sig><freq>1000</freq><level>100</level></sig>
  <sig><freq>1200</freq><level>110</level></sig>
</source>

I am extracting the pairs with this:

std::vector< std::pair<double, double> > signature() const
{
    std::vector< std::pair<double, double> > sig;
    for (const TiXmlElement* sig_el = node()->FirstChildElement ("sig");
        sig_el;
        sig_el = sig_el->NextSiblingElement("sig"))
    {
        const double level = boost::lexical_cast<double> (sig_el->FirstChildElement("level")->GetText());
        const double freq =  boost::lexical_cast<double> (sig_el->FirstChildElement("freq")->GetText());
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}

where node() is pointing at the <source> node.

Question: would I get a neater, more elegant, more maintainable or in any other way better piece of code using an XPath library instead?

Update: I have tried it using TinyXPath in two ways. Neither of them actually works, which is obviously a big point against them. Am I doing something fundamentally wrong? If this is what it is going to look like with XPath, I don't think it is getting me anything.

std::vector< std::pair<double, double> > signature2() const
{
    std::vector< std::pair<double, double> > sig;
    TinyXPath::xpath_processor source_proc (node(), "sig");
    const unsigned n_nodes = source_proc.u_compute_xpath_node_set();
    for (unsigned i = 0; i != n_nodes; ++i)
    {
        TiXmlNode* s = source_proc.XNp_get_xpath_node (i);
        const double level = TinyXPath::xpath_processor(s, "level/text()").d_compute_xpath();
        const double freq =  TinyXPath::xpath_processor(s, "freq/text()").d_compute_xpath();
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}

std::vector< std::pair<double, double> > signature3() const
{
    std::vector< std::pair<double, double> > sig;
    int i = 1;
    while (TiXmlNode* s = TinyXPath::xpath_processor (node(), 
        ("sig[" + boost::lexical_cast<std::string>(i++) + "]/*").c_str()).
        XNp_get_xpath_node(0))
    {
        const double level = TinyXPath::xpath_processor(s, "level/text()").d_compute_xpath();
        const double freq =  TinyXPath::xpath_processor(s, "freq/text()").d_compute_xpath();
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}

As a secondary issue, if so, which XPath library should I be using?

梦罢 2024-10-27 04:48:10

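In general I tend to prefer XPath-based solutions for their concision and versatility but, honestly, in your case I do not think XPath will bring much to your signature() function.

Here is why:

Code elegance
Your code is nice and compact, and an XPath expression would not make it any better.

Memory footprint
Unless your input XML configuration file is huge (something of an oxymoron) and DOM parsing would entail a large memory footprint (and there is no evidence that XPath would be a decisive cure for that), I would stick with DOM.

Execution speed
On such a simple XML tree, execution speed should be comparable.
If there is a difference, it would probably favor TinyXml, because the freq and level tags sit together under a given node.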

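Libraries and external references
This is the decisive point.
The leading XPath engine in the C++ world is XQilla.
It supports XQuery (and therefore both XPath 1.0 and 2.0) and is backed by Oracle, because it is developed by the group responsible for the Berkeley DB products (including, precisely, Berkeley DB XML, which uses XQilla).
The problem for C++ developers wishing to use XQilla is that they face several alternatives:

1. Use Xerces 2 and XQilla 2.1, which forces casts into your code.
2. Use XQilla 2.2+ with Xerces 3 (no casts needed here).
3. Use TinyXPath, which integrates nicely with TinyXml but has a number of limitations (no namespace support, for instance).
4. Mix Xerces and TinyXml.

In summary, in your case switching to XPath just for the sake of it would bring little benefit, if any.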

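Yet XPath is a very powerful tool in today's developer toolbox, and nobody can afford to ignore it.
If you just want to practice on a simple example, yours is as good as any. In that case I would keep the points above in mind and probably use TinyXPath.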

凉城凉梦凉人心 2024-10-27 04:48:10

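You need XPath if you need the flexibility to change, at runtime, which values are extracted.

But you do not need XPath if you are unlikely to need that flexibility; if recompiling to extend what you extract is not a problem and things do not change often; if users never need to update the expressions; or if what you have simply works for you. Plenty of applications do without it.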
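
To make the runtime-flexibility point concrete, here is a minimal sketch in which the path to extract is plain data rather than compiled-in traversal code. Everything in it is hypothetical: Node is a hand-rolled stand-in for a parsed element (not TinyXML's API), and select_path only follows '/'-separated child element names, which is a tiny subset of what a real XPath engine does.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML element (NOT TinyXML's API).
struct Node {
    std::string name;
    std::string text;
    std::vector<Node> children;
};

// Toy evaluator: follows a '/'-separated list of child element names,
// e.g. "sig/freq". This is not XPath, only its simplest child-step form.
std::vector<const Node*> select_path(const Node& root, const std::string& path)
{
    std::vector<const Node*> current(1, &root);
    std::istringstream steps(path);
    std::string step;
    while (std::getline(steps, step, '/')) {
        std::vector<const Node*> next;
        for (const Node* n : current)
            for (const Node& child : n->children)
                if (child.name == step)
                    next.push_back(&child);
        current.swap(next);
    }
    return current;
}

// Builds the <source> fragment from the question by hand.
Node make_source()
{
    Node source{"source", "", {}};
    source.children.push_back(
        Node{"sig", "", {Node{"freq", "1000", {}}, Node{"level", "100", {}}}});
    source.children.push_back(
        Node{"sig", "", {Node{"freq", "1200", {}}, Node{"level", "110", {}}}});
    return source;
}
```

Calling select_path(make_source(), "sig/level") walks to both level elements, and the string "sig/level" could just as well come from a config file or user input. With the hard-coded DOM loop, changing what is extracted means editing and recompiling the function.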

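As for whether it is more readable: yes, it certainly can be. But if you are only pulling out a few values, I would question the need to pull in another library.

I would certainly document what you currently have a little better, since people unfamiliar with TinyXML or XML libraries in general may not be sure what it is doing, though it is not hard to understand as it stands.

I am not sure what sort of overhead XPath adds, but I suspect it adds some. Most people probably would not notice any difference, and it may not concern you, but be aware of it in case it matters to you.

If you do want to use an XPath library, all I can say is that I have used the one that ships with Xerces-C++, and it was not too hard to learn. I have used TinyXML before, and someone here has mentioned TinyXPath. I have no experience with it, but it is available.

I also found this link useful when I was first learning XPath expressions:
https://www.w3schools.com/xml/xpath_intro.asp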

星星的轨迹 2024-10-27 04:48:10

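XPath was made for this, so of course your code will be "better" if you use it.

I can't recommend a specific C++ XPath library, but even though using one is the right decision most of the time, do a cost/benefit analysis before adding one. Maybe YAGNI.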

蛮可爱 2024-10-27 04:48:10

This XPath expression:

/*/sig[$pN]/*

selects all child elements (just the freq and level pair) of the $pN-th sig child of the top element of the XML document.

The string $pN should be substituted with a specific positive integer, for example:

/*/sig[2]/*

selects these two elements:

<freq>1200</freq><level>110</level>

Using an XPath expression such as this is obviously much shorter and more understandable than the provided C++ code.

Another advantage is that the same XPath expression can be used from a C# or Java or ... program, without having to modify it in any way -- thus adhering to XPath results in a very high degree of portability.
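
The substitution of $pN can be sketched in C++ as a small string-building helper. The name sig_children_xpath is hypothetical, and the resulting string would still have to be handed to whichever XPath engine you choose; XPath positions are 1-based, which the assertion documents.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds "/*/sig[n]/*" for a 1-based position n,
// i.e. the expression above with $pN replaced by a concrete integer.
std::string sig_children_xpath(unsigned n)
{
    assert(n >= 1 && "XPath positions are 1-based");
    return "/*/sig[" + std::to_string(n) + "]/*";
}
```

For example, sig_children_xpath(2) yields the second expression shown above, /*/sig[2]/*.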
