在 J 中就地修改列表的元素,可以吗?
我一直在玩 J 中的 Lookandsay (OEIS A005150) 实现。我制作了两个版本,都非常简单,使用 while.
类型控制结构。一个重复,另一个循环。因为我有强迫症,所以我开始对版本进行比较计时。
看起来和说的是序列 1 11 21 1211 111221 ,一个一,两个一,等等。
对于列表的早期元素(最多大约 20 个),循环版本获胜,但只有很小的数量。 30 左右的计时会导致递归版本获胜,其数量足够大,如果堆栈空间足以支持递归版本,则可能会首选递归版本。我研究了原因,我相信这与处理中间结果有关。序列中的第 30 个数字有 5808 个数字。 (第32个数字,9898位,第34个,16774。)
当你用递归做问题时,你可以在递归调用中保存中间结果,最后的拆栈构建结果,这样对结果的处理就最少。
在列表版本中,您需要一个变量来保存结果。每次循环迭代都会导致您需要向结果添加两个元素。
在我看来,问题在于我无法在 J 中找到任何方法来修改现有数组而不完全重新分配它。所以我想说的是,
try. o =. o,e,(0&{y) catch. o =. e,(0&{y) end.
将一个元素放入 o 中,而当我们开始时,o 可能没有值。这可能明显慢于
o =. i.0
.
.
.
o =. (,o),e,(0&{y)
关键是结果得到错误的形状而没有散线,或者看起来是这样。它以某种方式继承了 i.0 的形状。
但即使像 } amend 这样的函数也不会修改列表,它们会返回一个对其进行了修改的列表,如果您想保存该列表,则需要对其进行分配。随着分配列表的大小增加(当您从头到尾遍历数字以生成下一个数字时),分配似乎需要越来越多的时间。这个分配实际上是我能看到的唯一能让元素 32(9898 位)在递归版本中花费更少时间,而元素 20(408 位)在循环版本中花费更少时间的事情。
递归版本使用以下方式构建返回:
e,(0&{y),(,lookandsay e }. y)
上面的行既是函数的返回行,也是递归的返回行,因此当调用到达字符串末尾并且所有内容都解除堆栈时,整个返回向量会立即构建。
在 APL 中,我认为可以说以下内容:
a[1+rho a] <- new element
但是当我在 NARS2000 中尝试此操作时,我发现它会导致索引错误。我无法访问任何其他 APL,我可能记得 APL Plus 中的这个习惯用法,我怀疑它在 APL\360 或 APL\1130 中是否以这种方式工作。我可能完全记错了。
我在 J 中找不到办法做到这一点。可能没有办法做到这一点,但下一个想法是预先分配一个可以保存结果的数组,并更改单个条目。我也看不出有什么办法可以做到这一点 - 也就是说,J 似乎不支持 APL 习惯用法:
a<- iota 5
a[3] <- -1
这是因为语言纯度而被禁止的副作用之一吗?
解释器是否识别a=. a,foo
或者它的一些变体作为一个东西,它应该在内部快速路径到 a[>:#a]=.foo
吗?
这是递归版本,只是为了好玩。我尝试了很多不同的版本,我相信程序越长,速度就越慢,而且通常越复杂,速度就越慢。一般来说,程序可以被链接起来,这样如果你想要第n个数字,你可以执行lookandsay^: n ] y。我尝试了许多优化,但我遇到的问题是我无法判断我将输出发送到哪个环境。如果我知道我正在将其发送到程序的下一次迭代,我会将其作为数字数组而不是一个大数字发送。
我还怀疑,如果我能弄清楚如何制作代码的默认版本,它会运行得更快,因为我发现当我向代码中添加一些应该使其更短的内容时,它会运行得更长。
lookandsay=: 3 : 0
if. 0 = # ,y do. return. end. NB. return on empty argument
if. 1 ~: #@$ y do. NB. convert rank 0 argument to list of digits
y =. (10&#.^:_1) x: y
f =. 1
assert. 1 = #@$ y NB. the converted argument must be rank 1
else.
NB. yw =. y
f =. 0
end.
NB. e should be a count of the digits that match the leading digit.
e=.+/*./\y=0&{y
if. f do.
o=. e,(0&{y),(,lookandsay e }. y)
assert. e = 0&{ o
10&#. x: o
return.
else.
e,(0&{y),(,lookandsay e }. y)
return.
end.
)
我对产生的数字的特征很感兴趣。我发现,如果你从 1 开始,数字永远不会高于 3。如果你从高于 3 的数字开始,它将作为单例生存,并且你还可以通过从某些东西开始,将一个数字放入生成的数字中比如 888888888 会生成一个数字,其中有一个 9,数字末尾有一个 8。但除了单例之外,没有数字高于 3。
编辑: 我又做了一些测量。我最初编写的程序接受向量或标量,其想法是在内部我将使用向量。我曾考虑过将向量从一层代码传递到另一层,并且我仍然可能使用左参数来控制代码。当我传递顶层向量时,代码的运行速度会大大加快,所以我的猜测是,大部分 CPU 都是通过将很长的数字从向量转换为数字而被消耗掉的。递归例程在递归时总是向下传递一个向量,这可能就是为什么它几乎与循环一样快的原因。
这并没有改变我的问题。
我对此有一个答案,但三个小时内无法发布。然后我会发布它,请不要做大量研究来回答它。
I have been playing with an implementation of lookandsay (OEIS A005150) in J. I have made two versions, both very simple, using while.
type control structures. One recurs, the other loops. Because I am compulsive, I started running comparative timing on the versions.
look and say is the sequence 1 11 21 1211 111221 that s, one one, two ones, etc.
For early elements of the list (up to around 20) the looping version wins, but only by a tiny amount. Timings around 30 cause the recursive version to win, by a large enough amount that the recursive version might be preferred if the stack space were adequate to support it. I looked at why, and I believe that it has to do with handling intermediate results. The 30th number in the sequence has 5808 digits. (32nd number, 9898 digits, 34th, 16774.)
When you are doing the problem with recursion, you can hold the intermediate results in the recursive call, and the unstacking at the end builds the results so that there is minimal handling of the results.
In the list version, you need a variable to hold the result. Every loop iteration causes you to need to add two elements to the result.
The problem, as I see it, is that I can't find any way in J to modify an extant array without completely reassigning it. So I am saying
try. o =. o,e,(0&{y) catch. o =. e,(0&{y) end.
to put an element into o where o might not have a value when we start. That may be notably slower than
o =. i.0
.
.
.
o =. (,o),e,(0&{y)
The point is that the result gets the wrong shape without the ravels, or so it seems. It is inheriting a shape from i.0 somehow.
But even functions like } amend don't modify a list, they return a list that has a modification made to it, and if you want to save the list you need to assign it. As the size of the assigned list increases (as you walk the the number from the beginning to the end making the next number) the assignment seems to take more time and more time. This assignment is really the only thing I can see that would make element 32, 9898 digits, take less time in the recursive version while element 20 (408 digits) takes less time in the loopy version.
The recursive version builds the return with:
e,(0&{y),(,lookandsay e }. y)
The above line is both the return line from the function and the recursion, so the whole return vector gets built at once as the call gets to the end of the string and everything unstacks.
In APL I thought that one could say something on the order of:
a[1+rho a] <- new element
But when I try this in NARS2000 I find that it causes an index error. I don't have access to any other APL, I might be remembering this idiom from APL Plus, I doubt it worked this way in APL\360 or APL\1130. I might be misremembering it completely.
I can find no way to do that in J. It might be that there is no way to do that, but the next thought is to pre-allocate an array that could hold results, and to change individual entries. I see no way to do that either - that is, J does not seem to support the APL idiom:
a<- iota 5
a[3] <- -1
Is this one of those side effect things that is disallowed because of language purity?
Does the interpreter recognize a=. a,foo
or some of its variants as a thing that it should fastpath to a[>:#a]=.foo
internally?
This is the recursive version, just for the heck of it. I have tried a bunch of different versions and I believe that the longer the program, the slower, and generally, the more complex, the slower. Generally, the program can be chained so that if you want the nth number you can do lookandsay^: n ] y. I have tried a number of optimizations, but the problem I have is that I can't tell what environment I am sending my output into. If I could tell that I was sending it to the next iteration of the program I would send it as an array of digits rather than as a big number.
I also suspect that if I could figure out how to make a tacit version of the code, it would run faster, based on my finding that when I add something to the code that should make it shorter, it runs longer.
lookandsay=: 3 : 0
if. 0 = # ,y do. return. end. NB. return on empty argument
if. 1 ~: #@$ y do. NB. convert rank 0 argument to list of digits
y =. (10.^:_1) x: y
f =. 1
assert. 1 = #@$ y NB. the converted argument must be rank 1
else.
NB. yw =. y
f =. 0
end.
NB. e should be a count of the digits that match the leading digit.
e=.+/*./\y=0&{y
if. f do.
o=. e,(0&{y),(,lookandsay e }. y)
assert. e = 0&{ o
10. x: o
return.
else.
e,(0&{y),(,lookandsay e }. y)
return.
end.
)
I was interested in the characteristics of the numbers produced. I found that if you start with a 1, the numerals never get higher than 3. If you start with a numeral higher than 3, it will survive as a singleton, and you can also get a number into the generated numbers by starting with something like 888888888 which will generate a number with one 9 in it and a single 8 at the end of the number. But other than the singletons, no digit gets higher than 3.
Edit:
I did some more measuring. I had originally written the program to accept either a vector or a scalar, the idea being that internally I'd work with a vector. I had thought about passing a vector from one layer of code to the other, and I still might using a left argument to control code. With I pass the top level a vector the code runs enormously faster, so my guess is that most of the cpu is being eaten by converting very long numbers from vectors to digits. The recursive routine always passes down a vector when it recurs which might be why it is almost as fast as the loop.
That does not change my question.
I have an answer for this which I can't post for three hours. I will post it then, please don't do a ton of research to answer it.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
像这样的作业
就地执行。 (有关其他受支持的内容,请参阅 JWiki 文章地点操作)
解释器确定仅更新了 arr 的一小部分,并且不会创建整个新列表来重新分配。
在您的情况下发生的情况并不是数组被重新分配,而是它以小增量增长了很多次,导致内存分配和重新分配。
如果您预先分配(通过为其分配一些大数据块),那么您可以使用
}
对其进行修改,而不会造成太大的损失。assignments like
are executed in place. (See JWiki article for other supported in-place operations)
Interpreter determines that only small portion of
arr
is updated and does not create entire new list to reassign.What happens in your case is not that array is being reassigned, but that it grows many times in small increments, causing memory allocation and reallocation.
If you preallocate (by assigning it some large chunk of data), then you can modify it with
}
without too much penalty.说实话,在我问了这个问题之后,我就失去了这个网站的踪迹。
是的,答案是该语言没有表示“就地更新”的形式,但是如果您使用两种形式
x =: x , most everything
或
x =: most everything } x
那么解释器会将它们识别为特殊的并在除非不能,还有许多其他特殊符号可以被解释器识别,例如:
199(1000&|@^)199
组合运算是模幂运算,它从不计算整个幂运算。就像
199(1000&|^)199
那样 - 它只是以 _ 结尾,没有 @。
所以值得阅读有关特价的文章。
After I asked this question, to be honest, I lost track of this web site.
Yes, the answer is that the language has no form that means "update in place, but if you use two forms
x =: x , most anything
or
x =: most anything } x
then the interpreter recognizes those as special and does update in place unless it can't. There are a number of other specials recognized by the interpreter, like:
199(1000&|@^)199
That combined operation is modular exponentiation. It never calculates the whole exponentiation, as
199(1000&|^)199
would - that just ends as _ without the @.
So it is worth reading the article on specials. I will mark someone else's answer up.
sverre 上面提供的链接( http://www.jsoftware.com/jwiki/ Essays/In-Place%20Operations )显示了支持修改现有数组而不是创建新数组的各种操作。它们包括:
如果您对 Lookandsay 序列的默认版本感兴趣,请参阅此提交 罗塞塔代码:
The link that sverre provided above ( http://www.jsoftware.com/jwiki/Essays/In-Place%20Operations ) shows the various operations that support modifying an existing array rather than creating a new one. They include:
If you are interested in a tacit version of the lookandsay sequence see this submission to RosettaCode: