Finding a keyword in a node and getting the node's name in the DOM

Posted on 2024-12-17 09:25:31

I want to search a DOM for a specific keyword, and when it is found, I want to know which Node in the tree it is from.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.w3c.dom.Node;

// running count of text segments that contain the keyword
static int total = 0;

static void search(String segment, String keyword) {

    if (segment == null)
        return;

    Pattern p = Pattern.compile(keyword, Pattern.CASE_INSENSITIVE);
    Matcher matcher = p.matcher(segment);

    // find() reports whether the keyword occurs anywhere in the segment
    if (matcher.find()) {
        total++;
        // what to do here to get the node?
    }
}

// depth-first walk: visit the node, then its children, then its siblings
public static void traverse(Node node) {
    if (node == null || node.getNodeName() == null)
        return;

    search(node.getNodeValue(), "java");

    traverse(node.getFirstChild());

    System.out.println(node.getNodeValue() != null &&
                       node.getNodeValue().trim().length() == 0 ? "" : node);
    traverse(node.getNextSibling());
}

Comments (1)

跨年 2024-12-24 09:25:31

Consider using XPath (API):

import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// the XML & search term
String xml = "<foo>" + "<bar>" + "xml java xpath" + "</bar>" + "</foo>";
InputSource src = new InputSource(new StringReader(xml));
final String term = "java";
// search expression and term variable resolver
String expression = "//*[contains(text(),$term)]";
final QName termVariableName = new QName("term");
class TermResolver implements XPathVariableResolver {
  @Override
  public Object resolveVariable(QName variableName) {
    return termVariableName.equals(variableName) ? term : null;
  }
}
// perform the search (evaluate throws XPathExpressionException)
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setXPathVariableResolver(new TermResolver());
Node node = (Node) xpath.evaluate(expression, src, XPathConstants.NODE);

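If you want to do more complex matching via regular expressions, you can provide your own function resolver (see XPath.setXPathFunctionResolver: http://docs.oracle.com/javase/7/docs/api/javax/xml/xpath/XPath.html#setXPathFunctionResolver%28javax.xml.xpath.XPathFunctionResolver%29).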
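As a rough sketch of what such a function resolver could look like (not a definitive implementation): the class name RegexXPathDemo, the namespace URI urn:example:regex, the prefix re and the function name matches are all made up for this example. It assumes the default XPathFactory configuration, where extension functions are allowed, and it passes string(text()) so that the function receives a plain string.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.regex.Pattern;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RegexXPathDemo {
    // Hypothetical namespace URI and function name, invented for this example.
    static final String NS = "urn:example:regex";
    static final QName MATCHES = new QName(NS, "matches");

    /** Returns the first element whose first text node matches the regex, or null. */
    static Node findFirst(String xml, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        // Custom function: takes one argument, returns Boolean.TRUE on a regex hit.
        XPathFunction matches = fnArgs ->
                Boolean.valueOf(pattern.matcher(String.valueOf(fnArgs.get(0))).find());

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Custom functions must live in a namespace, so bind the prefix "re" to NS.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "re".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) {
                return NS.equals(uri) ? "re" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                        ? Collections.singletonList("re").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        // Hand out the function only for the expected name and argument count.
        xpath.setXPathFunctionResolver(
                (name, arity) -> MATCHES.equals(name) && arity == 1 ? matches : null);
        try {
            return (Node) xpath.evaluate("//*[re:matches(string(text()))]",
                    new InputSource(new StringReader(xml)), XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node hit = findFirst("<foo><bar>xml JAVA xpath</bar></foo>", "\\bjava\\b");
        System.out.println(hit == null ? "no match" : hit.getNodeName());
    }
}
```

Because the regex uses word boundaries, this sketch should skip text such as "javascript", which a plain contains() test would accept.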

Breakdown of the XPath expression //*[contains(text(),$term)]:

  • //* the asterisk matches any element; the leading double slash searches at any depth in the document
  • [contains(text(),$term)] is a predicate that keeps only the elements whose text matches
  • text() selects the element's child text nodes; when it is passed to contains(), XPath 1.0 uses only the first of them
  • $term is a variable; the variable resolver supplies the value "java" for it; a resolver is preferred to string concatenation to prevent injection attacks (similar to SQL injection issues)
  • contains(arg1,arg2) is a function that returns true if arg1 contains arg2; the comparison is case-sensitive

XPathConstants.NODE tells the API to select a single node; you could use NODESET to get all matches as a NodeList.
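A minimal sketch of the NODESET variant, which also prints each matching node's name (what the question asks for). The class name AllMatchesDemo, the method name and the sample XML are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AllMatchesDemo {
    /** Returns the names of all elements whose first text node contains the term. */
    static List<String> matchingNodeNames(String xml, String term) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Supply the value of $term through a variable resolver.
        xpath.setXPathVariableResolver(
                name -> "term".equals(name.getLocalPart()) ? term : null);
        List<String> names = new ArrayList<>();
        try {
            // NODESET asks for every match instead of a single node.
            NodeList nodes = (NodeList) xpath.evaluate("//*[contains(text(),$term)]",
                    new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<foo><bar>xml java xpath</bar><baz>plain java</baz><qux>none</qux></foo>";
        System.out.println(matchingNodeNames(xml, "java"));
    }
}
```

Each item in the NodeList is a regular org.w3c.dom.Node, so getParentNode() and the other DOM methods are available if more context about the match is needed.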
