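Find a keyword in nodes and get the node name in the DOM
I want to search a DOM for a specific keyword, and when it is found, I want to know which Node in the tree it is from.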
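import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.w3c.dom.Node;

// Counts the segments that contain the keyword (declared here so the snippet is self-contained)
static int total = 0;

static void search(String segment, String keyword) {
    if (segment == null)
        return;
    Pattern p = Pattern.compile(keyword, Pattern.CASE_INSENSITIVE);
    Matcher matcher = p.matcher(segment);
    // find() reports whether the keyword occurs anywhere in the segment;
    // hitEnd() is only meaningful after a match attempt, so it is not used here
    if (matcher.find()) {
        total++;
        // what to do here to get the node?
    }
}

public static void traverse(Node node) {
    if (node == null || node.getNodeName() == null)
        return;
    search(node.getNodeValue(), "java");
    traverse(node.getFirstChild());   // depth-first: children first
    System.out.println(node.getNodeValue() != null &&
            node.getNodeValue().trim().length() == 0 ? "" : node);
    traverse(node.getNextSibling());  // then the remaining siblings
}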
Comments (1)
Consider using XPath (API):
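The code that originally accompanied this suggestion is not preserved here, so the following is only a minimal sketch of how such a lookup might be written with the standard javax.xml.xpath API. The class name XPathKeywordSearch, the helper parse, and the sample XML are assumptions made for illustration; the expression and the $term variable are the ones discussed in this answer.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; not part of the original answer.
public class XPathKeywordSearch {

    /** Returns the first element whose text contains term, or null if there is none. */
    public static Node find(Document doc, String term) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $term from a Java value instead of concatenating it into the expression
        xpath.setXPathVariableResolver(
                (QName name) -> "term".equals(name.getLocalPart()) ? term : null);
        return (Node) xpath.evaluate("//*[contains(text(),$term)]", doc, XPathConstants.NODE);
    }

    /** Illustrative helper: parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch
        Document doc = parse(
                "<books><book><title>Learning java</title></book><note>misc</note></books>");
        Node hit = find(doc, "java");
        System.out.println(hit == null ? "no match" : hit.getNodeName()); // prints "title"
    }
}
```

Because the result is a Node, its name, parent, and attributes are all available to the caller, which is what the question asks for. Note that contains() is case-sensitive, unlike the CASE_INSENSITIVE pattern in the question.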
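If you want to do more complex matching via regular expressions, you can provide your own function resolver (see XPath.setXPathFunctionResolver: http://docs.oracle.com/javase/7/docs/api/javax/xml/xpath/XPath.html#setXPathFunctionResolver%28javax.xml.xpath.XPathFunctionResolver%29).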
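A custom function resolver is not sketched here. As a simpler alternative (a different technique from the one named above), the regular expression can be applied in Java after XPath has selected every text node. The class name RegexTextSearch, the helper parse, and the sample XML below are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; an alternative to a custom XPath function, not the answer's own code.
public class RegexTextSearch {

    /** Returns the names of the parent elements of every text node matching the regex. */
    public static List<String> findParents(Document doc, String regex) throws Exception {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        // XPath only selects the text nodes; the regex test itself runs in Java
        NodeList texts = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < texts.getLength(); i++) {
            Node text = texts.item(i);
            if (pattern.matcher(text.getNodeValue()).find()) {
                // The parent of a matching text node is the element it came from
                names.add(text.getParentNode().getNodeName());
            }
        }
        return names;
    }

    /** Illustrative helper: parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch
        Document doc = parse("<doc><p>I like JAVA 8</p><q>nothing here</q><r>javascript</r></doc>");
        // "java\\b" requires a word boundary after "java", so "javascript" is not matched
        System.out.println(findParents(doc, "java\\b")); // prints [p]
    }
}
```

This keeps the case-insensitive matching from the question while still reporting which element each match came from.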
Breakdown of the XPath expression //*[contains(text(),$term)]:
//* : the asterisk selects any element; the double-slash means any parent
[contains(text(),$term)] is a predicate that matches the text
text() is a function that gets the element's text
$term is a variable; this can be used to resolve the term "java" via the variable resolver; a resolver is preferred to string concatenation to prevent injection attacks (similar to SQL injection issues)
contains(arg1,arg2) is a function that returns true if arg1 contains arg2
XPathConstants.NODE tells the API to select a single node; you could use NODESET to get all matches as a NodeList.
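The NODESET variant mentioned above might look like the following sketch, which collects every matching element rather than only the first. The class name XPathAllMatches, the helper parse, and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the NODESET variant, not the answer's own code.
public class XPathAllMatches {

    /** Returns every element whose text contains term, as a NodeList. */
    public static NodeList findAll(Document doc, String term) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // $term is supplied through the variable resolver, as described in the breakdown
        xpath.setXPathVariableResolver(
                name -> "term".equals(name.getLocalPart()) ? term : null);
        return (NodeList) xpath.evaluate(
                "//*[contains(text(),$term)]", doc, XPathConstants.NODESET);
    }

    /** Illustrative helper: parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch
        Document doc = parse("<lib><a>java basics</a><b>python</b><c>more java</c></lib>");
        NodeList hits = findAll(doc, "java");
        for (int i = 0; i < hits.getLength(); i++) {
            System.out.println(hits.item(i).getNodeName()); // prints "a", then "c"
        }
    }
}
```

One caveat: in XPath 1.0, passing text() to contains() converts the node-set to the string value of its first node, so only an element's first text child is tested. If an element mixes text and child elements, an expression such as //*[text()[contains(.,$term)]] tests each text child instead.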