Compare adjacent elements of the same vector (avoiding loops)
I managed to write a for loop
to compare letters in the following vector:
bases <- c("G","C","A","T")
test <- sample(bases, replace=T, 20)
test
will return
[1] "T" "G" "T" "G" "C" "A" "A" "G" "A" "C" "A" "T" "T" "T" "T" "C" "A" "G" "G" "C"
With the function Comp() I can check whether a letter matches the next letter:
Comp <- function(data)
{
    output <- vector()
    for(i in 1:(length(data)-1))
    {
        if(data[i]==data[i+1])
        {
            output[i] <- 1
        }
        else
        {
            output[i] <- 0
        }
    }
    return(output)
}
Resulting in:
> Comp(test)
[1] 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 1 0
This works, but it is very slow for large vectors. Therefore I tried lapply():
Comp <- function(x,i) if(x[i]==x[i+1]) 1 else 0
unlist(lapply(test, Comp, test))
Unfortunately it's not working (Error in i + 1 : non-numeric argument to binary operator). I have trouble figuring out how to access the preceding letter in the vector to compare it. Also, the length(data)-1 used to "not compare" the last letter might become a problem.
Thank you all for the help!
Cheers
Lucky
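On the error itself: lapply(test, Comp, test) hands each letter of test to Comp as x and the whole vector as i, so i + 1 tries to add 1 to a character vector. A minimal sketch of one way around that, assuming the apply style is to be kept, is to iterate over positions instead of letters. The helper name comp_at is hypothetical, and the vector is the sample printed in the question, hard-coded for reproducibility.

```r
# The sample vector printed in the question, fixed here for reproducibility.
test <- c("T","G","T","G","C","A","A","G","A","C",
          "A","T","T","T","T","C","A","G","G","C")

# comp_at() is a hypothetical helper: compare position i with position i + 1.
comp_at <- function(i, x) if (x[i] == x[i + 1]) 1 else 0

# Iterate over positions 1 .. n-1 rather than over the letters themselves.
# seq_len() also copes with vectors of length 0 or 1, where
# 1:(length(x) - 1) would count backwards.
idx <- seq_len(max(length(test) - 1, 0))
result <- vapply(idx, comp_at, numeric(1), x = test)
print(result)
```

This only removes the error. It still calls an R function once per element, so it is unlikely to be much faster than the for loop.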
Just "lag" test and use ==, which is vectorized.

Update:
Also, your Comp function is slow because you don't specify the length of output when you initialize it. I suspect you were trying to pre-allocate, but vector() creates a zero-length vector that must be expanded during every iteration of your loop. Your Comp function is significantly faster if you change the call to vector() to vector(length=NROW(data)-1).
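The code that originally accompanied this answer is not shown here, so the following is only a sketch of the two ideas it describes, not the answerer's own code. It assumes "lagging" means comparing elements 1..n-1 against elements 2..n, done here with head() and tail(). The names comp_vec and comp_prealloc are made up for illustration, and the pre-allocated vector uses mode "numeric" rather than the default logical mode.

```r
# The sample vector printed in the question, fixed here for reproducibility.
test <- c("T","G","T","G","C","A","A","G","A","C",
          "A","T","T","T","T","C","A","G","G","C")

# Vectorized version: one call to == over two shifted copies of the vector.
# head(x, -1) drops the last element, tail(x, -1) drops the first.
comp_vec <- function(x) as.integer(head(x, -1) == tail(x, -1))

# Loop version with the output pre-allocated to its final length,
# so it never has to grow inside the loop.
comp_prealloc <- function(data) {
  n <- max(length(data) - 1, 0)
  output <- vector(mode = "numeric", length = n)
  for (i in seq_len(n)) {
    output[i] <- if (data[i] == data[i + 1]) 1 else 0
  }
  output
}

vec_result  <- comp_vec(test)
loop_result <- comp_prealloc(test)
print(vec_result)
print(all(vec_result == loop_result))
```

Because the comparison covers only n-1 pairs, the last letter is never compared with anything, which also addresses the length(data)-1 concern for vectors of length 0 or 1.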
As @Joshua wrote, you should of course use vectorization; it is far more efficient.
...But just for reference, your Comp function can still be optimized a bit. The result of a comparison is TRUE/FALSE, which is a glorified version of 1/0. Also, ensuring the result is integer instead of numeric consumes half the memory.
...and the speed difference:
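The code and timings from this answer are not shown here, so the block below is a hedged sketch of how the comparison might be reproduced, not the original benchmark. The names comp_int and comp_vec, the seed, and the vector size of 1e5 are all assumptions chosen for illustration. Timings depend on the machine, so none are quoted.

```r
set.seed(1)  # arbitrary seed so the run is repeatable
bases <- c("G", "C", "A", "T")
big <- sample(bases, 1e5, replace = TRUE)

# Loop that stores the comparison result directly in a pre-allocated
# integer vector. Assigning TRUE/FALSE into it stores 1L/0L.
comp_int <- function(data) {
  output <- integer(max(length(data) - 1, 0))
  for (i in seq_along(output)) {
    output[i] <- data[i] == data[i + 1]
  }
  output
}

# Vectorized comparison of each element with its successor.
comp_vec <- function(x) as.integer(x[-1] == x[-length(x)])

loop_time <- system.time(loop_result <- comp_int(big))
vec_time  <- system.time(vec_result  <- comp_vec(big))
print(loop_time)
print(vec_time)

# Integer storage (4 bytes per element) versus double storage (8 bytes).
print(object.size(integer(1e6)))
print(object.size(numeric(1e6)))
```

Both functions should return identical integer vectors. The system.time() output shows how much the per-element loop costs relative to the single vectorized call.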
Have a look at this:
It might work faster.