Haskell 可变映射/树
我正在 Haskell 中寻找可变(平衡)树/映射/哈希表或如何在函数内模拟它的方法。即,当我多次调用同一函数时,结构会被保留。到目前为止,我已经尝试过 Data.HashTable(还可以,但有点慢)并尝试过 Data.Array.Judy,但我无法使其与 GHC 6.10.4 一起使用。还有其他选择吗?
I am looking for a mutable (balanced) tree/map/hash table in Haskell or a way how to simulate it inside a function. I.e. when I call the same function several times, the structure is preserved. So far I have tried Data.HashTable (which is OK, but somewhat slow) and tried Data.Array.Judy but I was unable to make it work with GHC 6.10.4. Are there any other options?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
如果你想要可变状态,你可以拥有它。只需继续传递更新后的地图,或将其保留在状态单子中(结果是相同的事情)。
你可以像这样使用它。 (实际上,您可能还想添加一种从缓存中清除项目的方法。)
您可以不安全摆脱线程状态的要求,需要您自担风险通过一切需要它的事情。
If you want mutable state, you can have it. Just keep passing the updated map around, or keep it in a state monad (which turns out to be the same thing).
You can use this like so. (In practice, you might want to add a way to clear items from the cache, too.)
At your own risk, you can unsafely escape from the requirement of threading state through everything that needs it.
基于@Ramsey 的答案,我还建议您重新构思您的函数来获取地图并返回修改后的地图。然后使用良好的数据进行编码.Map,修改起来非常高效。这是一个模式:
很容易抽象这个模式并使 mapFuncWithMap 对以这种方式使用映射的函数具有通用性。
Building on @Ramsey's answer, I also suggest you reconceive your function to take a map and return a modified one. Then code using good ol' Data.Map, which is pretty efficient at modifications. Here is a pattern:
It is easy to abstract this pattern and make mapFuncWithMap generic over functions that use maps in this way.
尽管您要求可变类型,但我建议您使用不可变数据结构,并将连续版本作为参数传递给函数。
关于使用哪种数据结构,
有一个 肯特郡红黑树的实现
如果你有整数键,
Data.IntMap
效率极高。如果您有字符串键,则
bytestring-trie
包来自Hackage 看起来非常好。如果幸运的话,您可以将表数据结构作为额外参数传递给每个需要它的函数。但是,如果您的表需要广泛分布,您可能希望使用 state monad 其中状态是表的内容。
如果你想记忆,你可以尝试 Conal Elliott 博客中的一些惰性记忆技巧,但是一旦你超越了整数参数,惰性记忆就变得非常模糊——我不建议你作为初学者尝试。也许您可以发布有关您试图解决的更广泛问题的问题?通常,对于 Haskell 和可变性,问题是如何将突变或更新包含在某种范围内。
在没有任何全局可变变量的情况下学习编程并不是那么容易。
Although you ask for a mutable type, let me suggest that you use an immutable data structure and that you pass successive versions to your functions as an argument.
Regarding which data structure to use,
There is an implementation of red-black trees at Kent
If you have integer keys,
Data.IntMap
is extremely efficient.If you have string keys, the
bytestring-trie
package from Hackage looks very good.If you're lucky, you can pass your table data structure as an extra parameter to every function that needs it. If, however, your table needs to be widely distributed, you may wish to use a state monad where the state is the contents of your table.
If you are trying to memoize, you can try some of the lazy memoization tricks from Conal Elliott's blog, but as soon as you go beyond integer arguments, lazy memoization becomes very murky—not something I would recommend you try as a beginner. Maybe you can post a question about the broader problem you are trying to solve? Often with Haskell and mutability the issue is how to contain the mutation or updates within some kind of scope.
It's not so easy learning to program without any global mutable variables.
如果我正确地阅读了您的评论,那么您将拥有一个可能需要计算约 500k 总值的结构。计算成本很高,因此您希望它们只完成一次,并且在后续访问中,您只需要值而不需要重新计算。
在这种情况下,请利用 Haskell 的惰性来发挥你的优势! ~500k 并不是那么大:只需构建所有答案的地图,然后根据需要获取即可。第一次获取将强制计算,后续获取相同答案将重用相同结果,如果您从未获取特定计算 - 它永远不会发生!
您可以在文件 PointCloud.hs。该文件使用 Debug.Trace 来记录计算实际完成的时间:
If I read your comments right, then you have a structure with possibly ~500k total values to compute. The computations are expensive, so you want them done only once, and on subsequent accesses, you just want the value without recomputation.
In this case, use Haskell's laziness to your advantage! ~500k is not so big: Just build a map of all the answers, and then fetch as needed. The first fetch will force computation, subsequent fetches of the same answer will reuse the same result, and if you never fetch a particular computation - it never happens!
You can find a small implementation of this idea using 3D point distances as the computation in the file PointCloud.hs. That file uses
Debug.Trace
to log when the computation actually gets done:对纯函数字典(例如
Data.Map
)的可变引用。A mutable reference to a purely functional dictionary like
Data.Map
.