在函数式编程上下文中使用不可变列表获取恒定长度检索时间常数
我目前面临的问题是必须根据给定列表的长度进行计算。必须迭代列表的所有元素才能知道其大小,这是一个很大的性能损失,因为我使用的是相当大的列表。
解决该问题的建议方法是什么?
我想我总是可以将大小值与列表一起携带,这样我就可以事先知道它的大小,而不必在调用站点计算它,但这似乎是一种脆弱的方法。我还可以定义自己的列表类型,其中每个节点都有其列表大小作为属性,但这样我就会失去编程语言的标准列表库提供的杠杆作用。
你们在日常生活中是如何处理这个问题的呢?
我目前正在使用 F#。我知道我可以使用.NET 的可变(数组)列表,这可以解决问题。不过,我对纯粹的不可变函数方法更感兴趣。
I am currently facing the problem of having to make my calculations based on the length of a given list. Having to iterate over all the elements of the list to know its size is a big performance penalty as I'm using rather big lists.
What are the suggested approaches to the problem?
I guess I could always carry a size value together with the list so I know beforehand its size without having to compute it at the call site but that seems a brittle approach. I could also define a own type of list where each node has as property its the lists' size but then I'd lose the leverage provided by my programming language's libraries for standard lists.
How do you guys handle this on your daily routine?
I am currently using F#. I am aware I can use .NET's mutable (array) lists, which would solve the problem. I am way more interested, though, in the purely immutable functional approach.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
内置的 F# 列表类型没有任何长度缓存,并且无法以某种巧妙的方式添加它,因此您需要定义自己的类型。我认为为现有的 F#
list
类型编写包装器可能是最好的选择。这样,您可以避免显式转换 - 当您包装列表时,它实际上不会复制它(如在 svick 的实现中),但包装器可以轻松缓存
Length
属性:最好的工作方式使用包装器的方法是使用
LengthList.ofList
从标准 F# 列表创建LengthList
并使用LengthList.toList
(或仅使用LengthList.toList
>列表) 属性在使用标准List
模块中的任何函数之前。但是,这取决于代码的复杂性 - 如果您只需要几个地方的长度,那么将其单独保存并使用元组
list<'T> 可能会更容易。 * int
。The built-in F# list type doesn't have any caching of the length and there is no way to add that in some clever way, so you'll need to define your own type. I think that writing a wrapper for the existing F#
list
type is probably the best option.This way, you can avoid explicit conversions - when you wrap the list, it will not actually copy it (as in svick's implementation), but the wrapper can easily cache the
Length
property:The best way to work with the wrapper is to use
LengthList.ofList
for creatingLengthList
from a standard F# list and to useLengthList.toList
(or just theList
) property before using any functions from the standardList
module.However, it depends on the complexity of your code - if you only need length in a couple of places, then it may be easier to keep it separately and use a tuple
list<'T> * int
.我们不这样做,因为这在日常生活中不是问题。这听起来像是一个在有限领域可能存在的问题。
如果您最近创建了列表,那么您可能已经完成了 O(N) 工作,因此遍历列表来获取其长度可能并不是什么大问题。
如果您正在创建一些非常大的列表,并且这些列表不会“改变”太多(显然永远不会改变,但我的意思是更改对您的域/算法中使用的列表头的引用集),那么可能有意义只需在引用列表头*长度元组的一侧放一本字典,并在询问长度时查阅字典(在需要时进行真正的工作来遍历它们,但缓存结果以供将来询问相同的列表) 。
最后,如果您确实正在处理一些需要不断更新正在运行的列表并不断查询长度的算法,那么创建您自己的类似列表的数据类型(是的,您还需要编写映射/过滤器和任何其他的)。
(一般来说,我认为 99.99% 的时间最好使用内置数据结构。在 0.01% 的时间里,您正在开发需要高度优化的算法或代码,那么几乎总是您需要放弃内置数据结构(对于大多数情况来说已经足够了)并使用旨在解决您正在处理的确切问题的自定义数据结构。请参阅维基百科或冈崎的“纯功能数据”。在这种情况下,结构”用于思想和灵感。但很少去那种情况。)
We don't, because this isn't a problem in daily routine. It sounds like a problem perhaps in limited domains.
If you created the lists recently, then you've probably already done O(N) work, so walking the list to get its length is probably not a big deal.
If you're making a few very large lists that are then not 'changing' much (obviously never changing, but I mean changing set of references to heads of lists that are used in your domain/algorithm), then it may make sense to just have a dictionary off to the side of reference-to-list-head*length tuples, and consult the dictionary when asking for lengths (doing the real work to walk them when needed, but caching results for future asks about the same list).
Finally, if you really are dealing with some algorithm that needs to be constantly updating the lists in play and constantly consulting the lengths, then create your own list-like data type (yes, you'll also need to write map/filter and any others).
(Very generally, I think it is typically best to use the built-in data structures 99.99% of the time. In the 0.01% of the time where you are developing an algorithm or bit of code that needs to be very highly optimized, then almost always you need to abandon built-in data structures (which are good enough for most cases) and use a custom data structure designed to solve the exact problem you are working on. Look to wikipedia or Okasaki's "'Purely Functional Data Structures" for ideas and inspriation in that case. But rarely go to that case.)
我不明白为什么携带长度是一种脆弱的方法。尝试这样的事情(Haskell):
等等。如果这是在库或模块中,您可以通过不导出构造函数
NList
来使其安全(我认为)。也有可能强制 GHC 记忆
length
,但我不确定如何或何时。I don't see why carying the length around is a brittle approach. Try something like this (Haskell):
and so on. If this is in a library or module, you can make it safe (I think) by not exporting the constructor
NList
.It may also be possible to coerce GHC into memoizing
length
, but I'm not sure how or when.在 F# 中,大多数
List
函数都有等效的Seq
函数。这意味着,您可以实现自己的不可变链表,其中包含每个节点的长度。像这样的东西:In F#, most
List
functions have an equivalentSeq
functions. That means, you can just implement your own immutable linked list that carries the length with each node. Something like this: