XML structure for fastest lookup
Is one or the other of the following structures faster, when it comes to looking up a certain "resource" in the xml below?
Sample 1.
<root>
<resource key="res_test_1" value="test"/>
<resource key="res_test_2" value="test 2"/>
<resource key="res_test_3" value="test 3"/>
</root>
Sample 2.
<root>
<res_test_1>test</res_test_1>
<res_test_2>test 2</res_test_2>
<res_test_3>test 3</res_test_3>
</root>
The "keys" are always valid XML element names.
I'm asking because this set of resource keys/values will be part of an XML file that will be processed by XSL, replacing certain "keys" in the XML with the values from the resource part of the same XML file, and I would like to structure the resource part as optimally as possible for the lookups that will be needed.
I'm using C# and the XslCompiledTransform object for running the transform.
My instinct says that the object model might get to the keys faster when they are the actual element names, but I can find no advice on this kind of question. Perhaps it's unimportant to think about this issue, since the whole XML document will be in memory during the transform.
Edit (adding more info from here and down):
As I've already indicated, this question might be theoretical (focusing on a few milliseconds is not relevant), but the reason for posting it was to get an opinion on exactly what I'm asking: is one way faster than the other (of the two samples laid out) when it comes to locating data in an XML structure? Is one or the other the preferred way, for any reason?
As I see it, the first sample involves more "work" for a processor to locate and return the value when asked for it.
This is a sample XPath for Sample 1:
/root/resource[@key="res_test_2"]/@value
Corresponding XPath for Sample 2:
/root/res_test_2
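To make the two lookups concrete, here is a minimal sketch in Python using the standard `xml.etree.ElementTree` module. It only illustrates the two document shapes and says nothing about how `XslCompiledTransform` evaluates XPath internally; the helper names `lookup_sample_1` and `lookup_sample_2` are made up for this example.

```python
import xml.etree.ElementTree as ET

SAMPLE_1 = """<root>
<resource key="res_test_1" value="test"/>
<resource key="res_test_2" value="test 2"/>
<resource key="res_test_3" value="test 3"/>
</root>"""

SAMPLE_2 = """<root>
<res_test_1>test</res_test_1>
<res_test_2>test 2</res_test_2>
<res_test_3>test 3</res_test_3>
</root>"""


def lookup_sample_1(root, key):
    # Counterpart of /root/resource[@key="..."]/@value, relative to <root>.
    node = root.find(f"resource[@key='{key}']")
    return None if node is None else node.get("value")


def lookup_sample_2(root, key):
    # Counterpart of /root/<key>, relative to <root>.
    node = root.find(key)
    return None if node is None else node.text


root_1 = ET.fromstring(SAMPLE_1)
root_2 = ET.fromstring(SAMPLE_2)
value_1 = lookup_sample_1(root_1, "res_test_2")
value_2 = lookup_sample_2(root_2, "res_test_2")
print(value_1, value_2)
```

In this particular object model both lookups walk the children of `root` until one matches, so neither shape avoids a scan; Sample 1 just adds an attribute comparison per child.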
Also, the structure of sample 2 requires less space, which will improve load time, as indicated by one of the answers below. A good point, at least for very large documents.
Come to think of it, an obvious downside of sample 2 is that an XSD schema would not be of much use, since this part of the XML would have dynamic element names. That might be what the advice to put all values in attributes (see answer below) was about.
I made these XPath samples since they are easy to demonstrate. A similar lookup will be needed in the XSL transform that I wrote about earlier, but the focus of this question should be the structure of the document, as a more generic question.
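For repeated lookups during a transform, the per-lookup cost can be made largely independent of the document shape by indexing the resources once. XSLT 1.0 provides `xsl:key` and the `key()` function for this purpose; the Python sketch below is only an analogy that uses a dictionary, and the function names `build_index_sample_1` and `build_index_sample_2` are hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE_1 = """<root>
<resource key="res_test_1" value="test"/>
<resource key="res_test_2" value="test 2"/>
<resource key="res_test_3" value="test 3"/>
</root>"""

SAMPLE_2 = """<root>
<res_test_1>test</res_test_1>
<res_test_2>test 2</res_test_2>
<res_test_3>test 3</res_test_3>
</root>"""


def build_index_sample_1(root):
    # One pass over <resource> elements: key attribute -> value attribute.
    return {node.get("key"): node.get("value") for node in root.findall("resource")}


def build_index_sample_2(root):
    # One pass over the children of <root>: element name -> text content.
    return {node.tag: node.text for node in root}


index_1 = build_index_sample_1(ET.fromstring(SAMPLE_1))
index_2 = build_index_sample_2(ET.fromstring(SAMPLE_2))

# After the single indexing pass, each lookup is a dictionary access
# regardless of which XML shape the data came from.
print(index_1["res_test_2"], index_2["res_test_2"])
```

Once such an index exists, the choice between the two shapes mostly affects the one-time cost of building it, not the individual lookups.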
Thanks,
Andreas
Comments (2)
A short while ago I asked something about XSLT performance and I got the following answer:
Using attributes instead of elements improves the performance. When performing XPath matches, attributes are faster because they are loosely typed. This makes validation of the schema easier.
(See this question)
Between sample1 and sample2, the only difference is that you are converting an element to an attribute. Reading a child attribute costs about the same effort as reading a child element.
Example:
The XPath for reading "something" from the first example is
/root/child/@id/.
and the XPath for reading it from the second example is
/root/child/id/.
which is not a big difference. But if you look at the size, example2 is slightly bigger. Now assume that you have a huge list of such nodes: the example2 file would then be bulkier than example1.
So the example2 data weighs more.
Coming back to your samples: if you look at the structure, sample1 is longer than sample2.
Assume that the same files hold a huge amount of data with the respective hierarchy.
If you try to read sample1 and sample2 using C# code, the code would take more time to load sample1 (due to its size). Compared to that, the processing speed (I mean the process of reading nodes) would be negligible.
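The size argument can be checked with a few lines of Python. This is only a sketch that counts bytes for the three resources from the question; the helpers `sample_1_line` and `sample_2_line` and the 30-character key are made up for illustration.

```python
def sample_1_line(key, value):
    # Attribute form: fixed markup overhead, the key is written once.
    return f'<resource key="{key}" value="{value}"/>'


def sample_2_line(key, value):
    # Element form: little fixed overhead, but the key is written twice
    # (opening and closing tag).
    return f"<{key}>{value}</{key}>"


def total_size(line_builder, resources):
    return sum(len(line_builder(k, v).encode("utf-8")) for k, v in resources)


resources = [("res_test_1", "test"), ("res_test_2", "test 2"), ("res_test_3", "test 3")]
size_1 = total_size(sample_1_line, resources)
size_2 = total_size(sample_2_line, resources)
print(size_1, size_2)

# Hypothetical long key, to show that the comparison depends on key length.
long_resources = [("k" * 30, "test")]
long_size_1 = total_size(sample_1_line, long_resources)
long_size_2 = total_size(sample_2_line, long_resources)
print(long_size_1, long_size_2)
```

With the short keys from the question, the sample1 form is larger. Because the element form repeats the key in the closing tag, the balance can tip the other way once keys get long enough.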
@OP, as you know, Sample1 certainly goes one level down compared to Sample2. But as I mentioned earlier, I have observed that this won't make much difference to the parser, and I have already explained the effect of size on reading the file. There is something else I would like to let you know.
Using attributes should be a wise choice. It's not a rule, but we usually use attributes for metadata.
Example:
If you look at the above sample XML, the attribute "ID" is used as metadata about the data of the child node; the Id isn't data, it is just a sub-message.
Take another example:
The above example is nothing but HTML code :) where the attribute Class has the value "style1". This class name is then used in a CSS file to add properties and styles to the text under the tag.