如何初始化和“修改” Scala 中的循环持久数据结构?

发布于 2024-11-02 04:20:25 字数 567 浏览 7 评论 0原文

我搜索并找到了有关该主题的一些信息,但答案要么令人困惑,要么不适用。

我有这样的事情:

class Thing (val name:String, val refs:IndexedSeq[Ref])
class Ref (val name:String, val thing:Thing)

现在,我想说,加载一个文件,解析它并从中填充这个数据结构。它是不可变的和循环的,我们该如何做到这一点呢?

另外,假设我确实填充了此数据结构,现在我想修改它,例如更改 rootThing.refs(3).name,如何做到这一点?


感谢您在这里发布的想法。在这一点上,我在想,如果一个人真的想要像这样的持久数据结构,就应该跳出框框思考,并考虑客户端代码需要提出哪些问题。因此,不要考虑对象和字段,而应该考虑查询、索引等。首先,我在考虑以下方面: 是否存在双向多图持久数据结构?

I have searched and found some info on this topic but the answers are either confusing or not applicable.

I have something like this:

class Thing (val name:String, val refs:IndexedSeq[Ref])
class Ref (val name:String, val thing:Thing)

Now, I want to say, load in a file, parse it and populate this data structure from it. It being immutable and cyclic, how might one do so?

Also, let's say I do get this data structure populated, now I want to modify it, like change rootThing.refs(3).name, how might one do that?


Thanks for the ideas posted here. At this point, I'm thinking that if one really wants persistent data structures for something like this, to think outside the box and consider what questions client code will need to ask. So instead of thinking of objects and fields, think of queries, indexes and such. To start with, I'm thinking in terms of:
Is there a bidirectional multimap persistent data structure?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

ペ泪落弦音 2024-11-09 04:20:25

如果您准备修改它以引入一定程度的惰性,则可以初始化这种形式的循环数据结构,

scala> class Thing (val name:String, refs0: => IndexedSeq[Ref]) { lazy val refs = refs0 } ; class Ref (val name:String, thing0: => Thing) { lazy val thing = thing0 }
defined class Thing
defined class Ref

scala> val names = Vector("foo", "bar", "baz")                                                                                                                       
names: scala.collection.immutable.Vector[java.lang.String] = Vector(foo, bar, baz)

scala> val rootThing : Thing = new Thing("root", names.map { new Ref(_, rootThing) })
rootThing: Thing = Thing@1f7dab1

scala> rootThing.refs(1).name
res0: String = bar

但是,您不能使其持久化:作为循环,任何更改都可以通过结构的每个元素可见,因此版本之间没有共享的机会。

You can initialize a cyclic data structure of this form if you're prepared to modify it to introduce a degree of laziness,

scala> class Thing (val name:String, refs0: => IndexedSeq[Ref]) { lazy val refs = refs0 } ; class Ref (val name:String, thing0: => Thing) { lazy val thing = thing0 }
defined class Thing
defined class Ref

scala> val names = Vector("foo", "bar", "baz")                                                                                                                       
names: scala.collection.immutable.Vector[java.lang.String] = Vector(foo, bar, baz)

scala> val rootThing : Thing = new Thing("root", names.map { new Ref(_, rootThing) })
rootThing: Thing = Thing@1f7dab1

scala> rootThing.refs(1).name
res0: String = bar

However, you can't make it persistent: being cyclic, any change is visible via every element of the structure, so there are no opportunities for sharing between versions.

意犹 2024-11-09 04:20:25

对于单个循环引用,您可以使用惰性:

lazy val t: Thing = new Thing("thing", Vector(new Ref("ref", t)))

但是显然,对于多对多连接,这会变得复杂。

我不知道是否存在通用的纯功能循环图数据结构。对于非循环图,这很容易,因为您可以对其进行拓扑排序,然后逐步初始化它。

也许使用间接是您的一个选择,比如通过标识符而不是实际的 scala 对象引用来引用对象?

case class ThingByID(id: Int, name: String, refs: IndexedSeq[RefByID])
case class RefByID(name: String, thingID: Int)

然后,您可以在加载文件后通过 ID 将事物收集到不可变映射中(例如 collection.immutable.IntMap),并在来自引用时查找它们。

编辑

关于lazy val t的第一种情况,迈尔斯是正确的。事实上,您需要像他的回答中那样的名称参数。

class Thing(val name: String, val refs: IndexedSeq[Ref])
class Ref(val name: String, _thing: => Thing) { def thing = _thing }

val t: Thing = new Thing("thing", Vector(new Ref("ref", t)))

For a single cyclic reference, you can use lazy:

lazy val t: Thing = new Thing("thing", Vector(new Ref("ref", t)))

However obviously this gets complicated with many-to-many connections.

I don't know if a general purpose purely functional cyclic graph data structure exists. With acyclic graphs this would be easy as you could topologically sort it and then initialize it step by step.

Maybe using an indirection is an option for you, say to refer to objects through an identifier instead of the actual scala object reference?

case class ThingByID(id: Int, name: String, refs: IndexedSeq[RefByID])
case class RefByID(name: String, thingID: Int)

Then you could after loading your file collect the things by their ID into an immutable map (e.g. collection.immutable.IntMap) and look them up when coming from a ref.

EDIT

Miles is right about the first case of the lazy val t. Indeed you need by-name parameters as in his answer.

class Thing(val name: String, val refs: IndexedSeq[Ref])
class Ref(val name: String, _thing: => Thing) { def thing = _thing }

val t: Thing = new Thing("thing", Vector(new Ref("ref", t)))
苏辞 2024-11-09 04:20:25

有一种替代方法需要重新考虑如何表示对象关联:不是将对象之间的关联存储在对象本身内部(通常在 OO 代码中完成),而是将它们添加到单独的层中作为映射。

由于对象图中的所有节点在创建关联时都已存在,因此可以使用映射轻松创建不可变的双向关联。

scala> class Thing (val name:String)
defined class Thing

scala> class Ref (val name:String)
defined class Ref

scala> new Thing("Thing1")
res0: Thing = Thing@5c2bae98

scala> new Ref("Ref1")
res1: Ref = Ref@7656acfa       

scala> val thing2Ref = Map(res0 -> res1)
thing2Ref: scala.collection.immutable.Map[Thing,Ref] = Map(Thing@5c2bae98 -> Ref
@7656acfa)

scala> val ref2Thing = Map(res1 -> res0)
ref2Thing: scala.collection.immutable.Map[Ref,Thing] = Map(Ref@7656acfa -> Thing
@5c2bae98)

如果您仔细想想,这种方法类似于关系数据库的工作方式。表之间的多值关联并不存储在行本身中,而是存储在单独的索引中。即使不存在关联索引并且使用表扫描解析查询,它也会使用主键索引来查找结果的所有候选行。

这种风格的优点是可以添加或更改关联而不影响对象本身,并且可以针对不同的受众/用例采用不同的关联。缺点是关联映射在关联开始的实例上无法直接访问;它必须被传递&单独提供。

There's an alternative approach which requires rethinking how object associations are represented: instead of storing associations between objects inside the objects themselves (as typically done in OO code) add them afterwards in a separate layer as Maps.

Because all of the nodes in the object graph exist by the time associations are created, immutable bidrectional associations can be easily created using Maps.

scala> class Thing (val name:String)
defined class Thing

scala> class Ref (val name:String)
defined class Ref

scala> new Thing("Thing1")
res0: Thing = Thing@5c2bae98

scala> new Ref("Ref1")
res1: Ref = Ref@7656acfa       

scala> val thing2Ref = Map(res0 -> res1)
thing2Ref: scala.collection.immutable.Map[Thing,Ref] = Map(Thing@5c2bae98 -> Ref
@7656acfa)

scala> val ref2Thing = Map(res1 -> res0)
ref2Thing: scala.collection.immutable.Map[Ref,Thing] = Map(Ref@7656acfa -> Thing
@5c2bae98)

If you think about it, this approach is similar to how relational databases work. Multi-valued associations between tables are not stored in the rows themselves, but in separate indexes. Even when association indexes are not present and so a query is resolved using a table scan, it is using the primary key indexes to locate all candidate rows for the result.

An advantage of this style is that associations can be added or changed without affecting the objects themselves, and different associations can be employed for different audiences/use-cases. A disadvantage is that an association map is not directly accessible on instances where the associations begins; it has to be passed around & provided separately.

月下凄凉 2024-11-09 04:20:25

不可变数据结构可以完全由其构造函数初始化,或者您可以接受在更改其属性时不断复制结构的需要。因此,为了回答问题的第一部分,您可以通过定义接受数据中所有信息的构造函数将数据加载到不可变数据结构中,或者确保您了解循环子图:

循环数据结构不一定完全是我认为是循环的。如果您想象单个实例/状态所保存的指针网络,则可能有一个子图,其中包含彼此指向的父级和子级,但没有其他循环结构。在这种情况下,复制实例 1 以延迟创建具有不同父节点(例如)的实例 2 将需要复制父节点和子节点,因为它们形成循环子图。但是,除父级之外的子级中保存的引用可以继续是对与第一个实例相同的不可变结构的引用。

例如,我的“房屋”类引用了“门”、“窗”和“屋顶”。门有颜色和对房屋的 toHouse 引用,窗户有大小,屋顶有坡度。所以我创建了 bobsHouse,有一个绿色的门、一个大窗户和一个平坦的屋顶。事实上,由于所有这些都是不可变的,因此理论上只有一个大窗户 - 所有具有大窗户的房屋都有相同的窗户。第二个实例 janesHouse 与 bobsHouse 类似,但具有山形屋顶。因此,如果我说 janesHouse = bobsHouse.withRoof(gabled),那么我应该得到一个新的 House 实例,带有一个新的(也是绿色的)门和一个新的(山墙式的)屋顶,但具有相同的窗口。

因此,如果 janesHouse 被延迟评估,则仅当引用 Door 或 Roof 时才需要创建一个新 House。如果请求 janesHouse.Window,它根本不需要创建一个新的 House - 只需要 bosHouse。

tl;dr:您可以拥有持久(惰性)循环数据结构,但前提是您可以在其中找到非循环子图,即它不是链。

Immutable data structures can be initialised entirely by their constructor, or you can accept a need to keep copying the structure as you change its properties. So to answer the first part of the question, you load data into the immutable data structure by defining a constructor that accepts all the information in your datum, or ensure you're aware of the cyclic subgraphs:

Cyclic data structures aren't necessarily entirely cyclic, I think. If you imagine the network of pointers that a single instance/state holds, you could have a subgraph containing a parent and child that point to each other, but no other cyclic structures. In this scenario, copying instance 1 to lazily create instance 2 with a different parent node (for example) would necessitate copying the parent and child nodes, as they form a cyclic subgraph. But the references held within the child other than the parent can continue to be references to the same immutable structures as the first instance.

For example, my class House has a reference to a Door, a Window and a Roof. A Door has a colour and a toHouse reference to the House, a Window has a size and a Roof has a pitch. So I create bobsHouse with a green Door, a large Window and a flat Roof. In fact, since all of these are immutable, there is theoretically only one large Window - all houses with large Windows have the same Window. A second instance, janesHouse, is just like bobsHouse, but with the gabled Roof. So if I say janesHouse = bobsHouse.withRoof(gabled), then I should get a new instance of House, with a new (also green) Door, and a new (gabled) Roof, but with the same Window.

So if janesHouse is evaluated lazily, it need only create a new House if the Door or Roof are referenced. If janesHouse.Window is requested, it need not create a new House at all - only bobsHouse is needed.

tl;dr: You can have persistent (lazy) cyclic data structures, but only if you can find non-cyclic subgraphs in it, i.e. it's not a chain.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文