From for loop to apply
I am new to R, so I am not sure how to use apply.
I would like to speed up my function by using apply:
    for (i in 1:ncol(exp)) {
        for (j in 1:length(fe)) {
            tmp = TRUE
            id = strsplit(colnames(exp)[i], "\\.")
            if (id == fe[j]) {
                tmp = FALSE
            }
            if (tmp == TRUE) {
                only = cbind(only, c(names(exp)[i], exp[, i]))
            }
        }
    }
How can I use the apply function to do the above?
EDIT:
Thank you so much for the very good explanation, and sorry for my bad description. You guessed everything right, except that I want to delete the columns that match fe.
    Exp <- data.frame(A.x = 1:10, B.y = 10:1, C.z = 11:20, A.z = 20:11)
    fe <- LETTERS[1:2]
Then the result should contain only the column whose name starts with 'C'. Everything else should be deleted.
1 C.z
2 11
3 12
4 13
5 14
6 15
7 16
8 17
9 18
10 19
11 20
Comments (2)
EDIT: If you only want to delete the columns whose names appear in fe, you can simply do:
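A minimal sketch of one way to do this, assuming the part of each column name before the first dot is what should be matched against fe. The variable names prefix and only are illustrative, and the data is the example from the question:

```r
Exp <- data.frame(A.x = 1:10, B.y = 10:1, C.z = 11:20, A.z = 20:11)
fe <- LETTERS[1:2]

# Part of each column name before the first dot
# (assumption: this is the piece that should be compared with fe)
prefix <- sapply(strsplit(names(Exp), ".", fixed = TRUE), `[`, 1)

# Keep only the columns whose prefix is not in fe;
# drop = FALSE keeps the result a data frame even with one column left
only <- Exp[, !(prefix %in% fe), drop = FALSE]
print(names(only))
```

With the question's data, only the C.z column survives, which matches the result the asker describes.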
This code does exactly what your (updated) for loop does, only far more efficiently. You don't have to loop through fe, because the %in% operator is vectorized. In case the name can appear anywhere between the dots, then:
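As a hedged sketch of the "anywhere between the dots" case: split each name on the dots and drop a column as soon as any piece is in fe. The data frame Exp2 and the names parts, hit and only2 are hypothetical and chosen only for illustration:

```r
# Hypothetical data: the letter B sits after the dot in one of the names
Exp2 <- data.frame(A.x = 1:3, y.B = 3:1, C.z = 4:6)
fe <- LETTERS[1:2]

# All dot-separated pieces of every column name, as a list
parts <- strsplit(names(Exp2), ".", fixed = TRUE)

# TRUE for a column when at least one of its pieces occurs in fe
hit <- vapply(parts, function(p) any(p %in% fe), logical(1))

only2 <- Exp2[, !hit, drop = FALSE]
print(names(only2))
```

vapply() is used instead of sapply() here so that the result is guaranteed to be a logical vector with one value per column.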
Your code does some very funny things, and I have no clue what exactly you're trying to do. For one, strsplit gives a list, so id == fe[j] will always return FALSE unless fe[j] is a list itself, and I doubt it is. So I'd correct your code so that id holds the part before the dot, in case you want to compare with everything that is before the dot, or all the pieces of the split name, if you want to compare with everything in the string. In that case, you should use %in% instead of == as well.

Second, what you get is a character matrix in which the data is multiplied: a column is appended once for every element of fe that it does not match. If all elements in fe are unique, you could as well do:
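To make the strsplit point concrete, here is a small sketch of the two corrections described above. The names col_name, pieces, id_first and id_all are illustrative and do not come from the question:

```r
col_name <- "A.x"
fe <- LETTERS[1:2]

# strsplit() always returns a list, one element per input string
pieces <- strsplit(col_name, ".", fixed = TRUE)
print(class(pieces))

# Option 1: compare only the part before the first dot
id_first <- pieces[[1]][1]

# Option 2: compare every piece of the name, which needs %in% rather than ==
id_all <- unlist(pieces)

print(id_first %in% fe)
print(any(id_all %in% fe))
```

Either id_first or id_all can then take the place of id in the loop, depending on which comparison is intended.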
Assuming that the logic in your code does make sense (as you didn't supply any sample data, this is impossible to know), the optimized version runs:
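One possible vectorized equivalent is sketched below. It assumes the comparison is made on the part before the first dot and that the elements of fe are unique. The loop is the question's loop with that corrected comparison, and the names prefix, times, idx, loop_result and vec_result are illustrative:

```r
Exp <- data.frame(A.x = 1:10, B.y = 10:1, C.z = 11:20, A.z = 20:11)
fe <- LETTERS[1:2]
prefix <- sapply(strsplit(names(Exp), ".", fixed = TRUE), `[`, 1)

# Loop version: a column is appended once per element of fe it does not match
loop_result <- NULL
for (i in seq_along(Exp)) {
  for (j in seq_along(fe)) {
    if (prefix[i] != fe[j]) {
      loop_result <- cbind(loop_result, c(names(Exp)[i], Exp[, i]))
    }
  }
}

# Vectorized version: work out how often each column is repeated, then index once
times <- length(fe) - (prefix %in% fe)   # relies on fe having unique elements
idx <- rep(seq_along(Exp), times = times)
vec_result <- rbind(names(Exp)[idx], sapply(Exp[idx], as.character))
dimnames(vec_result) <- NULL

# Both are character matrices with the column name in the first row
print(dim(vec_result))
print(all(loop_result == vec_result))
```

The vectorized version builds the whole matrix in one step, so nothing grows inside a loop.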
Note that, in case you are interested in whether fe[j] appears anywhere in the name, you can change the code to:
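A hedged sketch of the "anywhere in the name" variant, using grepl() with fixed = TRUE for a plain substring test. The names col_names, matches and keep are illustrative:

```r
col_names <- c("A.x", "B.y", "C.z", "A.z")
fe <- LETTERS[1:2]

# matches[i, j] is TRUE when fe[j] occurs anywhere in col_names[i]
# (sapply returns a matrix here because there is more than one column name)
matches <- sapply(fe, function(f) grepl(f, col_names, fixed = TRUE))

# apply() over rows: keep a name only if no element of fe occurs in it
keep <- !apply(matches, 1, any)
print(col_names[keep])
```

The substring test is case-sensitive, so the lowercase "x", "y" and "z" suffixes are not matched by the uppercase letters in fe.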
If this doesn't return what you want, then your code doesn't do that either. I checked with sample data, and all versions give the same result.
The apply() family of functions are convenience functions. They will not necessarily be faster than a well-written for loop or vectorized functions.

The reason your code is so slow is that, as Gavin pointed out, you're growing your array on every loop iteration. Preallocate the entire array before the loop and you will see a significant speedup.
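A minimal sketch of the difference between growing and preallocating, using made-up values (i * 1:3) rather than the question's data. The names n, grown and prealloc are illustrative:

```r
n <- 1000

# Growing: cbind() copies the whole matrix on every iteration
grown <- NULL
for (i in seq_len(n)) {
  grown <- cbind(grown, i * 1:3)
}

# Preallocating: create the full matrix once, then fill it column by column
prealloc <- matrix(NA_integer_, nrow = 3, ncol = n)
for (i in seq_len(n)) {
  prealloc[, i] <- i * 1:3
}

# Both loops produce the same values
print(all(grown == prealloc))
```

Wrapping each loop in system.time() shows the gap on a given machine; the difference becomes more visible as n grows.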