Should I use XPath or just DOM?
I have a bunch of hierarchical data stored in an XML file. I am wrapping that up behind hand-crafted classes using TinyXML. Given an XML fragment that describes a source signature as a set of (frequency, level) pairs a bit like this:
<source>
<sig><freq>1000</freq><level>100</level></sig>
<sig><freq>1200</freq><level>110</level></sig>
</source>
I am extracting the pairs with this:
std::vector< std::pair<double, double> > signature() const
{
    std::vector< std::pair<double, double> > sig;
    for (const TiXmlElement* sig_el = node()->FirstChildElement ("sig");
         sig_el;
         sig_el = sig_el->NextSiblingElement("sig"))
    {
        const double level = boost::lexical_cast<double> (sig_el->FirstChildElement("level")->GetText());
        const double freq = boost::lexical_cast<double> (sig_el->FirstChildElement("freq")->GetText());
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}
where node() is pointing at the <source> node.
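One robustness note on the loop above: in TinyXML, FirstChildElement returns a null pointer when no matching child exists, and GetText returns a null pointer when the element has no text, so a malformed <sig> entry would hand a null pointer to the conversion. The following is a minimal sketch of a null-tolerant conversion, assuming C++17 is available; the helper name to_double is hypothetical and is not part of TinyXML or Boost.

```cpp
#include <cstdlib>
#include <optional>

// Hypothetical helper: converts a possibly-null C string to a double.
// Returns std::nullopt for a null pointer, an empty string, or input
// that is not entirely a number (e.g. "12abc").
std::optional<double> to_double(const char* text)
{
    if (text == nullptr || *text == '\0')
        return std::nullopt;

    char* end = nullptr;
    const double value = std::strtod(text, &end);

    // Reject input where nothing was parsed or where junk follows the number.
    if (end == text || *end != '\0')
        return std::nullopt;

    return value;
}
```

In the loop, the result of GetText() could be passed through such a helper and the <sig> entry skipped (or an error raised) when either value is missing.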
Question: would I get a neater, more elegant, more maintainable or in any other way better piece of code using an XPath library instead?
Update: I have tried it using TinyXPath two ways. Neither of them actually works, which is obviously a big point against them. Am I doing something fundamentally wrong? If this is what it is going to look like with XPath, I don't think it is getting me anything.
std::vector< std::pair<double, double> > signature2() const
{
    std::vector< std::pair<double, double> > sig;
    TinyXPath::xpath_processor source_proc (node(), "sig");
    const unsigned n_nodes = source_proc.u_compute_xpath_node_set();
    for (unsigned i = 0; i != n_nodes; ++i)
    {
        TiXmlNode* s = source_proc.XNp_get_xpath_node (i);
        const double level = TinyXPath::xpath_processor(s, "level/text()").d_compute_xpath();
        const double freq = TinyXPath::xpath_processor(s, "freq/text()").d_compute_xpath();
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}

std::vector< std::pair<double, double> > signature3() const
{
    std::vector< std::pair<double, double> > sig;
    int i = 1;
    while (TiXmlNode* s = TinyXPath::xpath_processor (node(),
               ("sig[" + boost::lexical_cast<std::string>(i++) + "]/*").c_str()).
               XNp_get_xpath_node(0))
    {
        const double level = TinyXPath::xpath_processor(s, "level/text()").d_compute_xpath();
        const double freq = TinyXPath::xpath_processor(s, "freq/text()").d_compute_xpath();
        sig.push_back (std::make_pair (freq, level));
    }
    return sig;
}
As a secondary issue, if so, which XPath library should I be using?
Comments (4)
In general I tend to prefer XPath-based solutions for their concision and versatility but, honestly, in your case, I don't think using XPath will bring a lot to your signature.
Here is why:
Code elegance
Your code is nice and compact and it will not get any better with an XPath expression.
Memory footprint
Unless your input XML configuration file is huge (a kind of oxymoron) and the DOM parsing would entail a large memory footprint, for which there is no proof that using XPath would be a decisive cure, I would stick with DOM.
Execution Speed
On such a simple XML tree, execution speed should be comparable.
If there were a difference, it would probably be to TinyXml's advantage because the freq and level tags are collocated under a given node.
Libraries and external references
That's the decisive point.
The leading XPath engine in the C++ world is XQilla.
It supports XQuery (therefore both XPath 1.0 and 2.0) and is backed by Oracle because it's developed by the group responsible for Berkeley DB products (including precisely Berkeley DB XML – which uses XQilla).
The problem for C++ developers wishing to use XQilla is that they have several alternatives, for which there are however a number of limitations (no support for namespaces, for instance).
In summary, in your case, switching to XPath just for the sake of it would bring little benefit, if any.
Yet, XPath is a very powerful tool in today's developer toolbox and no one can ignore it.
If you just wish to practice on a simple example, yours is as good as any. Then, I'd keep in mind the points above and probably use TinyXPath anyway.
You need XPath if you need the flexibility to make runtime changes to the values extracted.
But if you're unlikely to need this kind of flexibility, or if a recompile to expand what you're extracting isn't a problem and things don't change too often, or if users never need to update the expressions, or if what you have works fine for you, then you don't need XPath. There are lots of applications that don't use it.
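To make the runtime-flexibility point concrete, here is a toy sketch in which the query is an ordinary string that could come from a config file or user input instead of being compiled in. Everything in it is hypothetical: Node, select_path and make_source are illustrative names, the tree is a stand-in for a parsed document rather than a TinyXML type, and the matcher handles only plain child steps such as "sig/freq", which is far less than real XPath.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a parsed XML element (hypothetical; not a TinyXML type).
struct Node
{
    std::string name;
    std::string text;
    std::vector<Node> children;
};

// Follows a '/'-separated list of child element names, starting at root.
// Only plain child steps are supported; this is not an XPath engine.
std::vector<const Node*> select_path(const Node& root, const std::string& path)
{
    std::vector<const Node*> current(1, &root);
    std::istringstream steps(path);
    std::string step;
    while (std::getline(steps, step, '/'))
    {
        // Collect every child of the current nodes whose name matches this step.
        std::vector<const Node*> next;
        for (const Node* n : current)
            for (const Node& child : n->children)
                if (child.name == step)
                    next.push_back(&child);
        current.swap(next);
    }
    return current;
}

// Builds the <source> fragment from the question as a toy tree.
Node make_source()
{
    Node source{"source", "", {}};
    source.children.push_back(
        Node{"sig", "", {Node{"freq", "1000", {}}, Node{"level", "100", {}}}});
    source.children.push_back(
        Node{"sig", "", {Node{"freq", "1200", {}}, Node{"level", "110", {}}}});
    return source;
}
```

With a tree from make_source(), select_path(tree, "sig/freq") and select_path(tree, "sig/level") return the matching elements without any change to the compiled code, which is the kind of flexibility a real XPath library provides in a much more general form.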
As to whether it's more readable, well yes it sure can be. But if you're just pulling out a few values I'd question the need to pull in another library.
I would certainly document what you currently have a bit better as those not familiar with tinyxml or xml libraries may not be sure what it's doing but it's not hard to understand as it is.
I'm not sure what sort of overhead XPath adds, but I suspect it may add some. For most, I guess they won't notice any difference at all and it may not be a concern to you or most people, but be aware of it in case it's something you're concerned about.
If you do want to use an xpath library then all I can say is that I've used the one that came with Xerces-C++ and it wasn't too hard to learn. I have used TinyXML before and someone here has mentioned TinyXPath. I have no experience with it but it's available.
I also found this link useful when first learning about XPath expressions.
https://www.w3schools.com/xml/xpath_intro.asp
XPath was made for this, so of course your code will be "better" if you use it.
I can't recommend a specific C++ XPath library, but even though using one will be the correct decision most of the time, do a cost/benefit analysis before adding one. Maybe YAGNI.
This XPath expression:
selects all child elements (just the pair freq and level) of the $pN-th sig child of the top element of the XML document.
The string $pN should be substituted with a specific positive integer, for example:
selects these two elements:
Using an XPath expression such as this is obviously much shorter and more understandable than the provided C++ code.
Another advantage is that the same XPath expression can be used from a C# or Java or ... program, without having to modify it in any way -- thus adhering to XPath results in very high degree of portability.
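As a rough illustration of the $pN substitution described above, the sketch below builds the expression string for a given position. It assumes the expression has the form /*/sig[$pN]/* ("all children of the $pN-th sig child of the top element"), which is reconstructed from the prose description rather than quoted from the answer; the helper name sig_children_xpath is hypothetical.

```cpp
#include <string>

// Hypothetical helper: substitutes a 1-based position for $pN in an
// expression assumed to have the form /*/sig[$pN]/*.
// XPath positions start at 1, so n = 1 refers to the first <sig>.
std::string sig_children_xpath(unsigned n)
{
    return "/*/sig[" + std::to_string(n) + "]/*";
}
```

The resulting string is what would be handed to whichever XPath engine is in use, and the same string could be passed unchanged to an engine in another language.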