In Clojure, when should records or vectors be used to represent a tree of heterogeneous node types?

Posted 2024-09-30 20:52:45

Which is the more idiomatic Clojure practice for representing a tree made up of different node types?

A. Building trees out of several different record types, defined with deftype or defrecord:

(defrecord node_a [left right])
(defrecord node_b [left right])
(defrecord leaf [])

(def my-tree (node_a. (node_b. (leaf.) (leaf.)) (leaf.)))

B. Building trees out of vectors, with keywords designating the node types:

(def my-tree [:node-a [:node-b :leaf :leaf] :leaf])

Most Clojure code that I see seems to favor general-purpose data structures (vectors, maps, etc.) over datatypes or records. Hiccup, to take one example, represents HTML very nicely using the vector + keyword approach.

When should we prefer one style over the other?

Comments (2)

清泪尽 2024-10-07 20:52:45

You can put as many elements into a vector as you want. A record has a fixed number of fields. If you want to constrain your nodes to have exactly N sub-nodes, records might be a good fit, e.g. when making a binary tree, where a node has only a Left and a Right. But for something like HTML or XML, you probably want to support arbitrary numbers of sub-nodes.
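A minimal sketch of that contrast, using hypothetical names (BinaryNode, Leaf, children) that are not from the question, and assuming Clojure 1.3 or later for the generated ->Name factory functions:

```clojure
;; Hypothetical record types: a BinaryNode always has exactly two fields.
(defrecord BinaryNode [left right])
(defrecord Leaf [])

(def binary-tree
  (->BinaryNode (->BinaryNode (->Leaf) (->Leaf)) (->Leaf)))

;; A Hiccup-style vector node: tag first, then any number of children.
(def markup-tree
  [:div [:p "one"] [:p "two"] [:p "three"]])

(defn children
  "Children of a vector node: everything after the tag."
  [node]
  (rest node))

(count (children markup-tree)) ; => 3
```

The record fixes the shape of every node up front, while the vector form leaves the number of children entirely to whoever builds the tree.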

Using vectors and keywords means that "extending" the set of supported node types is as simple as putting a new keyword into the vector. [:frob "foo"] is OK in Hiccup even if its author never heard of frobbing. Using records, you'd potentially have to define a new record for every node type. But then you get the benefit of catching typos and verifying subnodes. [:strnog "some bold text?"] isn't going to be caught by Hiccup, but (Strnog. "foo") would be a compile-time error.
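With vectors, a comparable typo check has to happen at runtime. The following is only a sketch under assumed names (known-tags and valid-tree? are hypothetical, and Hiccup itself does nothing like this), using the tags from the question:

```clojure
;; Hypothetical whitelist of branch tags; leaves are the bare keyword :leaf.
(def known-tags #{:node-a :node-b})

(defn valid-tree?
  "True when every branch starts with a known tag and every leaf is :leaf."
  [node]
  (if (vector? node)
    (and (contains? known-tags (first node))
         (every? valid-tree? (rest node)))
    (= :leaf node)))

(valid-tree? [:node-a [:node-b :leaf :leaf] :leaf]) ; => true
(valid-tree? [:node-a [:nod-b :leaf :leaf] :leaf])  ; => false (typo in :nod-b)
```

The check only runs when someone calls it, which is the trade-off being described: flexibility by default, validation by opt-in.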

Vectors being one of Clojure's basic data types, you can use Clojure's built-in functions to manipulate them. Want to extend your tree? Just conj onto it, or update-in, or whatever. You can build up your tree incrementally this way. With records, you're probably stuck with constructor calls, or else you have to write a ton of wrapper functions for the constructors.
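For instance, growing the vector tree from the question with core functions might look like this (a sketch; the NodeA record at the end is a hypothetical name, and ->NodeA assumes Clojure 1.3 or later):

```clojure
(def tree [:node-a [:node-b :leaf :leaf] :leaf])

;; Append a child to the root.
(conj tree :leaf)
;; => [:node-a [:node-b :leaf :leaf] :leaf :leaf]

;; Replace the first child of :node-b (path: index 1, then index 1).
(assoc-in tree [1 1] [:node-a :leaf :leaf])
;; => [:node-a [:node-b [:node-a :leaf :leaf] :leaf] :leaf]

;; Grow the :node-b subtree in place.
(update-in tree [1] conj :leaf)
;; => [:node-a [:node-b :leaf :leaf :leaf] :leaf]

;; Records are associative too, so an existing node can be updated by field.
(defrecord NodeA [left right])
(assoc (->NodeA :leaf :leaf) :left [:node-b :leaf :leaf])
```

It is worth noting that assoc and update-in also work on record fields, so the constructor burden applies mainly to creating new nodes rather than to editing existing ones.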

Seems like this partly boils down to an argument of dynamic vs. static. Personally, I would go the dynamic (vector + keyword) route unless there was a specific need for the benefits of using records. It's probably easier to code that way, and it's more flexible for the user, at the cost of being easier for the user to end up making a mess. But Clojure users are likely used to having to handle dangerous weapons on a regular basis. Clojure being largely a dynamic language, staying dynamic is often the right thing to do.

想念有你 2024-10-07 20:52:45

This is a good question. I think both are appropriate for different kinds of problems. Nested vectors are a good solution if each node can contain a variable set of information - in particular templating systems are going to work well. Records are a good solution for a smallish number of fixed node types where nesting is far more constrained.

We do a lot of work with heterogeneous trees of records. Each node represents one of a handful of well-known types, each with a different set of known, fixed keys. Records are better in this case because you can pick the data out of a node by key, which is O(1) (really a very fast Java method call), rather than O(n) (where you have to search through the node contents), and the data is generally easier to access as well.
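To illustrate access by key, here is a small sketch. The Join record and its field names are assumptions made up for this example, not the answerer's actual types, and ->Join assumes Clojure 1.3 or later:

```clojure
;; Hypothetical node type with named, fixed fields.
(defrecord Join [left right condition])

(def join-rec (->Join :orders :customers [:= :customer-id :id]))

;; Keyword lookup on a declared field: the name says what you are getting.
(:condition join-rec)

;; The same node as a vector: the caller must know the condition is at index 3.
(def join-vec [:join :orders :customers [:= :customer-id :id]])
(nth join-vec 3)
```

Both lookups return the same condition here; the difference is that the record version documents the node's shape and does not depend on positional conventions.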

Records in Clojure 1.2 are, IMHO, not quite "finished", but it's pretty easy to build the missing pieces yourself. We have a defrecord2 that adds constructor functions (new-foo), field validation, print support, pprint support, tree walk/edit support via zippers, etc.
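The defrecord2 macro itself is not shown in this thread, so the following is not its implementation. It is only a hand-written sketch of what a validating constructor function in the new-foo style could look like, with a hypothetical Sort node and assuming Clojure 1.3 or later for ->Sort:

```clojure
;; Hypothetical execution-plan node.
(defrecord Sort [input order])

(defn new-sort
  "Constructor function that validates its fields before building the record."
  [input order]
  {:pre [(some? input)
         (contains? #{:asc :desc} order)]}
  (->Sort input order))

(new-sort :orders :asc)
;; (new-sort :orders :sideways) fails the :pre check with an AssertionError
```

A plain function like this can also be passed to map or partial, which is awkward to do with the interop-style constructor call (Sort. ...).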

An example of where we use this is to represent ASTs or execution plans where nodes might be things like Join, Sort, etc.

Vectors are going to be better for creating stuff like strings where an arbitrary number of things can be put in each node. If you can stuff 1+ <p>s inside a <div>, then you can't create a record that contains a :p field - that just doesn't make any sense. That's a case where vectors are far more flexible and idiomatic.
