Common Lisp:在非常大的列表上使用这个过滤函数有什么缺点?
我想从列表 'b 中过滤掉列表 'a 的所有元素并返回过滤后的 'b。这是我的功能:
(defun filter (a b)
"Filters out all items in a from b"
(if (= 0 (length a)) b
(filter (remove (first a) a) (remove (first a) b))))
我是 lisp 新手,不知道 'remove 是如何工作的,这个过滤器会在什么时间运行?
I want to filter out all elements of list 'a from list 'b and return the filtered 'b. This is my function:
(defun filter (a b)
"Filters out all items in a from b"
(if (= 0 (length a)) b
(filter (remove (first a) a) (remove (first a) b))))
I'm new to lisp and don't know how 'remove does its thing, what kind of time will this filter run in?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
有两种方法可以找到答案:
你可以用数据测试它
你可以分析你的源代码
让我们看一下源代码。
列表是由链接的cons单元构建的
长度需要遍历列表一次
对于每个递归调用 FILTER 计算 a 的长度。糟糕!
(使用 ENDP 代替。)
REMOVE 需要遍历列表一次
对于每次计算 REMOVE 两次的递归调用:糟糕!
(不要在
a
上使用 REMOVE,而是使用 REST 进行递归。)对 FILTER 的调用不一定是优化的尾部调用。
在某些实现中可能会这样,在某些实现中您需要告诉编译器
在某些实现中,您想要优化尾部调用
没有可用的尾调用优化。如果没有,那么你会得到一个堆栈
在足够长的列表上溢出。
(如果适用,请使用 DO、DOLIST、DOTIMES、LOOP、REDUCE、MAPC、MAPL、MAPCAR、MAPLIST、MAPCAN 或 MAPCON 等循环结构代替递归。)
总结:这是非常幼稚的代码,性能很差表现。
Common Lisp 提供了这个内置功能:SET-DIFFERENCE 应该做你想做的事。
http://www.lispworks.com/documentation/HyperSpec/Body /f_set_di.htm#set-difference
There are two ways to find out:
you could test it with data
you could analyze your source code
Let's look at the source code.
lists are built of linked cons cells
length needs to walk once through a list
for EVERY recursive call of FILTER you compute the length of a. BAD!
(Use ENDP instead.)
REMOVE needs to walk once through a list
for every recursive call you compute REMOVE twice: BAD!
(Instead of using REMOVE on
a
, recurse with the REST.)the call to FILTER will not necessarily be an optimized tail call.
In some implementations it might, in some you need to tell the compiler
that you want to optimize for tail calls, in some implementations
no tail call optimization is available. If not, then you get a stack
overflow on long enough lists.
(Use looping constructs like DO, DOLIST, DOTIMES, LOOP, REDUCE, MAPC, MAPL, MAPCAR, MAPLIST, MAPCAN, or MAPCON instead of recursion, when applicable.)
Summary: that's very naive code with poor performance.
Common Lisp provides this built in: SET-DIFFERENCE should do what you want.
http://www.lispworks.com/documentation/HyperSpec/Body/f_set_di.htm#set-difference
Common Lisp 不支持 尾部调用优化(根据标准),您可能只是用一个糟糕的调用堆栈耗尽内存(取决于实现)。
Common Lisp does not support tail-call optimization (as per the standard) and you might just run out of memory with an abysmal call-stack (depending on the implementation).
我不会编写这个函数,因为 Rainer Joswig 说,该标准已经提供了
SET-DIFFERENCE
。尽管如此,如果我必须提供该函数的实现,我会使用这个实现:这样做有几个优点,最重要的一个是它只遍历
b
一次;如果a
很长,那么使用哈希表来跟踪a
中的元素可能会表现得更好。此外,使用
MAP
和REMOVE-IF
等通用序列函数意味着该函数可以与字符串、向量以及列表一起使用,这甚至比标准SET-DIFFERENCE
函数。这种方法的主要缺点是,如果您想使用:TEST
参数扩展函数,该参数允许用户提供除默认EQL
之外的等式谓词,因为 CL 哈希-tables 仅适用于少量预定义的相等谓词(EQ
、EQL
、EQUAL
和EQUALP
准确地说)。I would not write this function, becuase, as Rainer Joswig says, the standard already provides
SET-DIFFERENCE
. Nonetheless, if I had to provide an implementation of the function, this is the one I would use:Doing it this way provides a couple of advantages, the most important one being that it only traverses
b
once; using a hash table to keep track of what elements are ina
is likely to perform much better ifa
is long.Also, using the generic sequence functions like
MAP
andREMOVE-IF
mean that this function can be used with strings and vectors as well as lists, which is an advantage even over the standardSET-DIFFERENCE
function. The main downside of this approach is if you want extend the function with a:TEST
argument that allows the user to provide an equality predicate other than the defaultEQL
, since CL hash-tables only work with a small number of pre-defined equality predicates (EQ
,EQL
,EQUAL
andEQUALP
to be precise).