R: Adding zeroes after old zeroes in a vector?
Imagine I have a vector of ones and zeroes.
I write it compactly as:
1111111100001111111111110000000001111111111100101
I need to get a new vector in which the N ones following each run of zeroes are replaced with new zeroes.
For example, for N = 3:
1111111100001111111111110000000001111111111100101 becomes
1111111100000001111111110000000000001111111100000
I can do it with a for loop, but I've read that this is not good practice. How can I do it then?
Cheers
My vector is actually a zoo series, but I guess that doesn't make any difference.
If I wanted zeroes all the way to the end I would use cumprod.
You can also do this with rle. All you need to do is add n to all the lengths where the value is 0 and subtract n where the value is 1 (being a little careful when there are fewer than n ones in a row). (Using Greg's method to construct the sample.)
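The rle idea described above can be sketched roughly as follows. This is a minimal illustration and not the code originally posted with this answer: the function name zero_after_rle is made up for the example, the sample is built with strsplit (an assumption about how it was constructed), and the input is assumed to contain only 0s and 1s.

```r
# Hypothetical helper: move up to n elements from each run of ones
# into the run of zeroes that precedes it.
zero_after_rle <- function(x, n) {
  r <- rle(x)
  # Runs of ones that have a run of zeroes before them (runs alternate,
  # so every run of ones except a leading one qualifies).
  ones <- which(r$values == 1)
  ones <- ones[ones > 1]
  # Take at most n from each run of ones: this is the "careful" part
  # for runs shorter than n.
  take <- pmin(n, r$lengths[ones])
  r$lengths[ones] <- r$lengths[ones] - take
  r$lengths[ones - 1] <- r$lengths[ones - 1] + take
  inverse.rle(r)
}

x <- as.integer(strsplit("1111111100001111111111110000000001111111111100101", "")[[1]])
paste(zero_after_rle(x, 3), collapse = "")
```

Runs whose length drops to zero simply vanish when inverse.rle rebuilds the vector, so no special clean-up is needed.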
How about just looping through the N instances (assuming N is small):
This simply turns every zero into -1, so that each zero subtracts from the N values that follow it.
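A rough sketch of that idea, assuming a 0/1 input vector. The name add_zeros_lagged is invented for this illustration, and the body is a reconstruction of the description rather than the code originally posted.

```r
# Hypothetical reconstruction: loop over the N lags, not over the elements.
add_zeros_lagged <- function(x, N = 3) {
  z <- x - 1                    # zeroes become -1, ones become 0
  out <- x
  for (i in seq_len(N)) {
    # z delayed by i positions, padded with 0 at the front
    shifted <- c(rep(0, i), z)[seq_along(x)]
    out <- out + shifted        # each earlier zero subtracts 1 here
  }
  pmax(out, 0)                  # anything pushed below zero becomes 0
}

x <- as.integer(strsplit("1111111100001111111111110000000001111111111100101", "")[[1]])
paste(add_zeros_lagged(x, 3), collapse = "")
```

The loop runs N times regardless of the vector length, which is why it only pays off when N is small.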
EDIT:
After reading your description of the data on the R-help mailing list, this is clearly not a case of small N. Hence, you might want to consider a C function for this.
In the file "addZeros.c":
At the command prompt (on Windows, press Win+R and type cmd), run "R CMD SHLIB addZeros.c". If R is not on your path (i.e. you get "unknown command R"), you need to give the full path to R (on my system:
On Windows this should produce a DLL (a .so on Linux), but if you do not already have Rtools you should download and install it (it is a collection of tools, such as Perl and MinGW). Download the newest version from
http://www.murdoch-sutherland.com/Rtools/
The R wrapper function for this would be:
Note that the working directory in R should be the directory containing the DLL (on my system, setwd("C:/Users/eyjo/Documents/Forrit/R/addZeros")) before the addZeros R function is called for the first time (alternatively, just include the full path to the DLL file in dyn.load). It is good practice to keep these files in a sub-directory under the project (e.g. "c"), and then just add "c/" in front of "addZeros" in the file path. To illustrate:
Here addZeros is my original suggestion using only base R, and addZeros2 uses the C function.
Here is one way:
Whether this is better than a loop or not is up to you.
Note that this will not change the first n elements, even if there is a 0 among them.
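For illustration, here is a sketch of one windowed approach with exactly that limitation, using base R's embed. It is an assumption about the general technique, not necessarily the code originally posted here, and the name zero_after_embed is made up.

```r
# Hypothetical sketch: take the minimum over each window of n + 1 values.
zero_after_embed <- function(x, n) {
  if (length(x) <= n) return(x)   # no complete window: nothing changes
  # Each row of m holds x[i], x[i - 1], ..., x[i - n] for i = n + 1, ..., length(x)
  m <- embed(x, n + 1)
  out <- x
  # An element becomes 0 when any value in its window is 0.
  out[(n + 1):length(x)] <- apply(m, 1, min)
  out                             # the first n elements are left untouched
}

zero_after_embed(c(1, 1, 0, 1, 1, 1, 1, 1), 2)
zero_after_embed(c(0, 1, 1, 1, 1), 3)   # elements 2 and 3 stay 1: the caveat above
```

Padding the front of the vector with n ones before calling embed would remove the caveat, at the cost of a slightly longer function.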
Here is another way:
To follow up on my previous comment, if speed is in fact a concern - converting the vector to a string and using regex may well be faster than other solutions. First a function:
Generate data
System time to collapse, run regex, and split back into vector:
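The collapse, regex, split round trip can be sketched like this. It is only an illustration under my own assumptions: zero_after_regex is a made-up name, the pattern is a plain fixed-string replacement applied n times rather than the single regular expression from the original answer, and the input is assumed to be a 0/1 vector.

```r
# Hypothetical sketch: work on the vector as one string of "0"/"1" characters.
zero_after_regex <- function(x, n) {
  s <- paste(x, collapse = "")
  # Each pass turns the first 1 after every zero into a 0, so n passes
  # extend every run of zeroes by up to n positions.
  for (i in seq_len(n)) {
    s <- gsub("01", "00", s, fixed = TRUE)
  }
  as.integer(strsplit(s, "")[[1]])
}

x <- as.integer(strsplit("1111111100001111111111110000000001111111111100101", "")[[1]])
system.time(res <- zero_after_regex(x, 3))
paste(res, collapse = "")
```

Because each pass only ever replaces a 1 that directly follows a 0, runs of ones shorter than n are handled without any special casing.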
I really like the idea of using a "regular expression" for this, so I gave that a vote up. (I wish I had gotten an rle answer in too, and I learned something from the embed and running answers. Neat!) Here's a variation on Chase's answer that I think may address the issues raised:
This seems to produce identical results to Chang's rle method for n = 1,2,3,4,5 on gd047's example input.
Maybe you could write this more cleanly using \K?
I've found a solution myself.
I think it's very easy and not very slow.
I guess if someone could compile it in C++ it would be very fast because it has just one loop.
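For readers following along, a single-pass loop of this kind might look like the sketch below. It is my own minimal illustration, assuming a 0/1 vector, and is not the code originally posted; zero_after_loop is a made-up name.

```r
# Hypothetical sketch: one pass with a countdown that restarts at every zero.
zero_after_loop <- function(x, n) {
  out <- x
  remaining <- 0                  # how many upcoming ones still get zeroed
  for (i in seq_along(x)) {
    if (x[i] == 0) {
      remaining <- n              # a zero resets the countdown
    } else if (remaining > 0) {
      out[i] <- 0                 # a one within n positions of a zero
      remaining <- remaining - 1
    }
  }
  out
}

x <- as.integer(strsplit("1111111100001111111111110000000001111111111100101", "")[[1]])
paste(zero_after_loop(x, 3), collapse = "")
```

The work per element is constant and independent of N, which is what makes a loop like this a natural candidate for a compiled language.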
Using a moving minimum function is very fast, simple, and not dependent on the distribution of spans:
The following simple definition of movmin suffices (the complete function has some functionality superfluous to this case, such as using the van Herk/Gil-Werman algorithm for large N)
Actually you need a window size of 4 because you affect the 3 values following a zero. This matches your f5:
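To make the window-size point concrete, here is a small trailing moving minimum written with pmin over shifted copies of the vector. It is a stand-in sketch, not the movmin definition referred to above: the name movmin_trailing is hypothetical, and it makes no attempt at the van Herk/Gil-Werman algorithm.

```r
# Hypothetical stand-in for movmin: minimum over the current value and the
# (width - 1) values before it.
movmin_trailing <- function(x, width) {
  out <- x
  for (k in seq_len(width - 1)) {
    # x delayed by k positions; Inf padding never wins a minimum
    lagged <- c(rep(Inf, k), x)[seq_along(x)]
    out <- pmin(out, lagged)
  }
  out
}

x <- as.integer(strsplit("1111111100001111111111110000000001111111111100101", "")[[1]])
N <- 3
paste(movmin_trailing(x, N + 1), collapse = "")   # window of N + 1 = 4
```

With a window of only 3, the third one after each zero would survive, which is why the window has to be N + 1.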