Converting org.w3c.dom.NodeList to a Clojure ISeq

Posted on 2024-11-05 03:30:09

I am trying to get a handle on the new defprotocol, reify, etc.

I have an org.w3c.dom.NodeList returned from an XPath call and I would like to "convert" it to an ISeq.

In Scala, I implemented an implicit conversion method:

implicit def nodeList2Traversable(nodeList: NodeList): Traversable[Node] = {
  new Traversable[Node] {
    def foreach[A](process: (Node) => A) {
      for (index <- 0 until nodeList.getLength) {
        process(nodeList.item(index))
      }
    }
  }
}

NodeList includes methods int getLength() and Node item(int index).

How do I do the equivalent in Clojure? I expect that I will need to use defprotocol. What functions do I need to define to create a seq?

If I do a simple, naive conversion to a list using loop and recur, I will end up with a non-lazy structure.

Comments (2)

冷…雨湿花 2024-11-12 03:30:09

Most of Clojure's sequence-processing functions return lazy seqs, including the map and range functions:

(defn node-list-seq [^org.w3c.dom.NodeList node-list]
  (map (fn [index] (.item node-list index))
       (range (.getLength node-list))))

Note that the type hint for NodeList above isn't necessary, but it improves performance by letting the compiler avoid reflection on the .getLength and .item calls.

Now you can use that function like so:

(map #(.getLocalName %) (node-list-seq your-node-list))
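
To try the function without an XPath result at hand, here is a minimal sketch that parses a small XML string and walks the children of the root element. The helper parse-xml-string, the sample XML, and the var sample-names are hypothetical names introduced only for this illustration. It uses .getNodeName instead of .getLocalName because the parser here is not configured as namespace-aware.

```clojure
(import [javax.xml.parsers DocumentBuilderFactory]
        [java.io ByteArrayInputStream]
        [org.w3c.dom Document Node NodeList])

(defn node-list-seq
  "Returns a lazy seq of the nodes in node-list, in index order."
  [^NodeList node-list]
  (map (fn [index] (.item node-list index))
       (range (.getLength node-list))))

;; Hypothetical helper (not part of the answer above):
;; parses an XML string into a DOM Document.
(defn parse-xml-string [^String xml]
  (let [builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))]
    (.parse builder (ByteArrayInputStream. (.getBytes xml "UTF-8")))))

;; The sample document has no whitespace between tags,
;; so the root has exactly three element children.
(def sample-names
  (let [^Document doc (parse-xml-string "<root><a/><b/><c/></root>")
        children (.getChildNodes (.getDocumentElement doc))]
    (map (fn [^Node node] (.getNodeName node))
         (node-list-seq children))))
```

Nothing is read from the NodeList until sample-names is consumed, for example by printing it or calling doall on it.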

空城缀染半城烟沙 2024-11-12 03:30:09

Use a for comprehension; these yield lazy sequences.

Here's the code for you. I've taken the time to make it runnable on the command line; you only need to replace the name of the parsed XML file.

Caveat 1: avoid def-ing your variables. Use local variables instead.

Caveat 2: this is the Java API for XML, so the objects are mutable; since you have a lazy sequence, if the DOM tree changes while you're iterating, you might run into unpleasant race conditions.

Caveat 3: even though the sequence is lazy, the whole DOM tree is already in memory anyway. (I'm not entirely sure about this, though: I think the API tries to defer building the tree in memory until needed, but there are no guarantees.) So if you run into trouble with big XML documents, try to avoid the DOM approach.

(require '[clojure.java.io :as io])
(import [javax.xml.parsers DocumentBuilderFactory])
(import [org.xml.sax InputSource])

(def dbf (DocumentBuilderFactory/newInstance))
(doto dbf
  (.setValidating false)
  (.setNamespaceAware true)
  (.setIgnoringElementContentWhitespace true))
(def builder (.newDocumentBuilder dbf))
(def doc (.parse builder (InputSource. (io/reader "C:/workspace/myproject/pom.xml"))))

(defn lazy-child-list [element]
  (let [nodelist (.getChildNodes element)
        len (.getLength nodelist)]
    (for [i (range len)]
      (.item nodelist i))))

;; To print the children of an element
(-> doc
    (.getDocumentElement)
    (lazy-child-list)
    (println))

;; Prints clojure.lang.LazySeq
(-> doc
    (.getDocumentElement)
    (lazy-child-list)
    (class)
    (println))
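
Caveats 1 and 2 can be sketched together. The version below keeps the factory, builder, and document in let bindings instead of top-level defs, and realizes the lazy sequence with vec before using it. The function name child-names and the XML-string input are assumptions made for this illustration (the code above reads from a file instead), and the type hints are optional additions to avoid reflection. Because the NodeList returned by getChildNodes is live, copying it into a vector is one way to get a stable snapshot if the tree might change later.

```clojure
(import [javax.xml.parsers DocumentBuilder DocumentBuilderFactory]
        [java.io ByteArrayInputStream]
        [org.w3c.dom Node])

(defn lazy-child-list
  "Returns a lazy seq of the child nodes of element."
  [^Node element]
  (let [nodelist (.getChildNodes element)
        len (.getLength nodelist)]
    (for [i (range len)]
      (.item nodelist i))))

;; Hypothetical function for illustration: everything is a local
;; binding, nothing is def-ed at the top level.
(defn child-names [^String xml]
  (let [^DocumentBuilderFactory dbf (doto (DocumentBuilderFactory/newInstance)
                                      (.setValidating false)
                                      (.setNamespaceAware true))
        ^DocumentBuilder builder (.newDocumentBuilder dbf)
        doc (.parse builder (ByteArrayInputStream. (.getBytes xml "UTF-8")))
        ;; vec forces the whole lazy seq now, so later changes to the
        ;; DOM tree cannot shift indices while we are still iterating.
        children (vec (lazy-child-list (.getDocumentElement doc)))]
    (mapv (fn [^Node node] (.getNodeName node)) children)))
```

Calling (child-names "<root><a/><b/></root>") should return a vector of the two child element names. The trade-off is that the snapshot gives up laziness, which matters little here since the DOM tree is in memory already.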