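Can I safely query a DOM document with XPath expressions from multiple threads?
I plan to use a dom4j DOM Document as a static cache in an application where multiple threads can query the document. Given that the document itself will never change, is it safe to query it from multiple threads?
I wrote the following code to test it, but I am not sure that it actually proves the operation is safe.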
    package test.concurrent_dom;

    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.DocumentHelper;
    import org.dom4j.Element;
    import org.dom4j.Node;

    /**
     * Hello world!
     *
     */
    public class App extends Thread
    {
        private static final String xml =
            "<Session>"
            + "<child1 attribute1=\"attribute1value\" attribute2=\"attribute2value\">"
            + "ChildText1</child1>"
            + "<child2 attribute1=\"attribute1value\" attribute2=\"attribute2value\">"
            + "ChildText2</child2>"
            + "<child3 attribute1=\"attribute1value\" attribute2=\"attribute2value\">"
            + "ChildText3</child3>"
            + "</Session>";

        private static Document document;
        private static Element root;

        public static void main( String[] args ) throws DocumentException
        {
            document = DocumentHelper.parseText(xml);
            root = document.getRootElement();

            Thread t1 = new Thread(){
                public void run(){
                    while(true){
                        try {
                            sleep(3);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        Node n1 = root.selectSingleNode("/Session/child1");
                        if(!n1.getText().equals("ChildText1")){
                            System.out.println("WRONG!");
                        }
                    }
                }
            };

            Thread t2 = new Thread(){
                public void run(){
                    while(true){
                        try {
                            sleep(3);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        Node n1 = root.selectSingleNode("/Session/child2");
                        if(!n1.getText().equals("ChildText2")){
                            System.out.println("WRONG!");
                        }
                    }
                }
            };

            Thread t3 = new Thread(){
                public void run(){
                    while(true){
                        try {
                            sleep(3);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        Node n1 = root.selectSingleNode("/Session/child3");
                        if(!n1.getText().equals("ChildText3")){
                            System.out.println("WRONG!");
                        }
                    }
                }
            };

            t1.start();
            t2.start();
            t3.start();
            System.out.println( "Hello World!" );
        }
    }
Comments (2)
See http://xerces.apache.org/xerces2-j/faq-dom.html.
Without seeing the implementation, it's impossible to know whether selectSingleNode uses any shared state when reading the DOM. I think it's safest to assume that it is not thread-safe.
An alternative is to use your own XPath processor, such as Jaxen, which is thread-safe.
The Jaxen Jira has various fixes for thread-safety issues, which is evidence that Jaxen is designed to be thread-safe. This is one I came across by chance.
There is also confirmation from one of the authors that Jaxen is thread-safe.
As well as being thread-safe, Jaxen is model-agnostic: it works with many models (W3C DOM, XOM, Dom4J, JDOM), and custom models can be plugged in by implementing a couple of interfaces.
I would imagine that simple accessors and iterators on the W3C DOM are thread-safe, but this is just a hunch, not a concrete fact. If you want to be 100% sure, use a DOM that is designed for thread safety, for example dom4j.
Some resources to get started:
- An example of using Jaxen.
- Jaxen FAQ and homepage
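If you do take the cautious view that the query path may not be thread-safe, one simple option (not the Jaxen route suggested above, just a fallback) is to serialize every read behind a lock. The sketch below is a minimal illustration under that assumption. `GuardedDocument` and its `query` method are hypothetical names, and the type parameter stands in for whatever document type you cache, such as an org.dom4j.Document.

```java
import java.util.function.Function;

/** Hypothetical wrapper: serializes all reads of a shared object that may not be thread-safe. */
final class GuardedDocument<D> {
    private final D document;                 // e.g. an org.dom4j.Document (assumption)
    private final Object lock = new Object();

    GuardedDocument(D document) {
        this.document = document;
    }

    /** Runs the reader while holding the lock, so only one thread touches the document at a time. */
    <R> R query(Function<? super D, ? extends R> reader) {
        synchronized (lock) {
            return reader.apply(document);
        }
    }
}
```

With dom4j the call might look like `cache.query(doc -> doc.selectSingleNode("/Session/child1").getText())`. Return a plain value such as a String from the reader, not a Node, because anything done with a Node after the lock is released is unguarded again. The cost is that readers no longer run in parallel.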
I am not actually familiar with dom4j, but if you cannot be sure that it handles read-only data properly, I am not sure how good it is.
I will make the working assumption that the executable part of your runnables (the part after the sleep) takes less than one microsecond, and that in your test run these parts happened consecutively, not concurrently. Thus your test does not really prove anything.
For a more robust test, I added primitive conflict detection.
If the modified test works as predicted (this is uncompiled and untested), it will keep generating new threads, and each new thread will try to read the document. The threads will report if they potentially conflict in time with another thread, and they will report if they read a wrong value. The test will keep generating new threads until your system runs out of resources, and then it will crash.
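A minimal sketch of a tighter concurrent read check along these lines, written for illustration and not the answerer's code: it drops the sleep, releases a fixed number of threads together through a start gate, and counts wrong values and runtime exceptions. Unlike the test described above, it uses a bounded thread count and does not spawn threads until the system runs out of resources. `ConcurrentReadCheck` and `countFailures` are hypothetical names, and the query is passed in as a `Supplier` so that the sketch itself needs only the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ConcurrentReadCheck {

    /**
     * Runs the query from `threads` threads, `iterations` times each, and returns
     * how many calls threw or returned something other than `expected`.
     */
    static int countFailures(int threads, int iterations,
                             Supplier<String> query, String expected) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startGate = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicInteger failures = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    startGate.await();                    // wait for the start signal
                    for (int i = 0; i < iterations; i++) {
                        try {
                            if (!expected.equals(query.get())) {
                                failures.incrementAndGet();   // wrong value read
                            }
                        } catch (RuntimeException e) {
                            failures.incrementAndGet();       // e.g. NPE from inconsistent state
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }

        startGate.countDown();                            // release all workers together
        try {
            done.await();                                 // wait until every worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return failures.get();
    }

    public static void main(String[] args) {
        // Stand-in query. With dom4j it would be something like
        // () -> root.selectSingleNode("/Session/child1").getText()
        Supplier<String> query = () -> "ChildText1";
        System.out.println("failures: " + countFailures(8, 100_000, query, "ChildText1"));
    }
}
```

A clean run of a check like this still does not prove thread safety. It only makes a race more likely to show up than the original test with its sleeps does.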