Splitting ranges
Say I have some ranges represented by start coordinates start <- c(1,2,3)
and end coordinates end <- c(4,5,4); ranges <- data.frame(start, end)
How can I split these up into intervals of length one?
I.e., I want this
starts ends
1 1 4
2 2 5
3 3 4
to be transformed into this:
starts ends
1 1 2 |
2 3 4 <-end of original first interval
3 2 3 |
4 4 5 <-end of original second interval
5 3 4 <-end of original third interval
Right now I have a for loop that iterates through the list and creates a sequence from start to end for each range, but this loop takes a very long time to execute for long lists of ranges.
Here's one way. It's a "glorified for-loop" in the disguise of lapply on a sequence. It gives the correct result.
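The answer's original code is not preserved here, so the following is only a minimal sketch of what an lapply-over-a-sequence approach could look like. The helper name split_range and the result column names are illustrative assumptions, and the sketch assumes every range covers an even number of integers, as in the question's example.

```r
start <- c(1, 2, 3)
end <- c(4, 5, 4)

# Hypothetical helper: split range i into consecutive (start, end) pairs.
# Assumes the range covers an even number of integers.
split_range <- function(i) {
  x <- seq(start[i], end[i])
  matrix(x, ncol = 2, byrow = TRUE)
}

# One small matrix per range, stacked into a single matrix
pieces <- lapply(seq_along(start), split_range)
m <- do.call(rbind, pieces)

# Build the data.frame once, at the end
result <- data.frame(starts = m[, 1], ends = m[, 2])
print(result)
```

For the example input this yields five rows, one per length-one interval, in the same order as the table in the question.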
And then time it on a bigger problem:
This takes about 1.6 seconds on my machine. Good enough?
...The trick is to work on the vectors directly instead of on the data.frame. And then build the data.frame at the end.
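The size of the "bigger problem" used for the 1.6-second figure is not given, so the sketch below uses a hypothetical input (10,000 ranges of 10 integers each) purely to illustrate the idea of working on plain vectors and building the data.frame once. Timings on other machines and inputs will differ.

```r
# Hypothetical benchmark input: n ranges, each covering 10 integers
set.seed(1)
n <- 1e4
start <- sample.int(1000, n, replace = TRUE)
end <- start + 9L

elapsed <- system.time({
  # Work on the plain vectors inside the loop ...
  pieces <- lapply(seq_along(start), function(i) seq(start[i], end[i]))
  x <- unlist(pieces)
  m <- matrix(x, ncol = 2, byrow = TRUE)
  # ... and build the data.frame only once, at the end
  result <- data.frame(starts = m[, 1], ends = m[, 2])
})
print(elapsed)
```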
Update: @Ellipsis... commented that lapply is no better than a for-loop. Let's see:
So, not only is the for-loop about 12% slower in this case, it is also much more verbose...
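The code behind that comparison is not preserved, so here is a rough sketch of the two forms being compared, written under the assumption that both simply build one sequence per range. It shows the difference in verbosity only; it does not reproduce the 12% timing figure.

```r
start <- c(1, 2, 3)
end <- c(4, 5, 4)

# lapply version: one line
res_lapply <- lapply(seq_along(start), function(i) seq(start[i], end[i]))

# Equivalent for-loop: preallocate the list, then fill it element by element
res_loop <- vector("list", length(start))
for (i in seq_along(start)) {
  res_loop[[i]] <- seq(start[i], end[i])
}

# Both forms produce the same list of sequences
print(identical(res_lapply, res_loop))
```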
UPDATE AGAIN!
@Martin Morgan suggested using Map, and it is indeed the fastest solution yet - faster than do.call in my other answer. Also, by using seq.int my first solution is also much faster.
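A Map-based variant along the lines of that suggestion might look like the sketch below. The exact code from the answer is not preserved, so the way the pieces are combined (unlist followed by matrix) is an assumption, as is the requirement that each range covers an even number of integers.

```r
start <- c(1, 2, 3)
end <- c(4, 5, 4)

# Map calls seq.int(start[i], end[i]) for each pair of elements
pieces <- Map(seq.int, start, end)

# Flatten to one long vector and fold it into two columns, row by row
m <- matrix(unlist(pieces), ncol = 2, byrow = TRUE)
result <- data.frame(starts = m[, 1], ends = m[, 2])
print(result)
```

Map avoids the explicit index variable, and seq.int skips the method dispatch that seq performs, which is a plausible reason for the speedups the answer reports.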
You could try creating text for the vectors, parse-ing and eval-uating it, and then using a matrix to create the data.frame.
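As a sketch of that idea (the answer's own code is not preserved, so the variable names and the exact text being built are assumptions): paste the ranges into a single expression such as c(1:4,2:5,3:4), evaluate it, and fold the resulting vector into two columns.

```r
start <- c(1, 2, 3)
end <- c(4, 5, 4)

# Build the text "c(1:4,2:5,3:4)" ...
txt <- paste0("c(", paste(start, end, sep = ":", collapse = ","), ")")

# ... then parse and evaluate it to get one long vector
x <- eval(parse(text = txt))

# Fold into (start, end) pairs; assumes each range has an even length
m <- matrix(x, ncol = 2, byrow = TRUE)
result <- data.frame(starts = m[, 1], ends = m[, 2])
print(result)
```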
Here's another answer based on @James's great solution. It avoids paste and parse and is a little bit faster:
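The code for this answer is missing from the page. Since the first answer refers to it as the do.call one, a plausible reconstruction (an assumption, not the author's verified code) is to concatenate the sequences with do.call(c, ...) instead of building and parsing text:

```r
start <- c(1, 2, 3)
end <- c(4, 5, 4)

# Concatenate all sequences into one vector without paste/parse
x <- do.call(c, lapply(seq_along(start), function(i) seq.int(start[i], end[i])))

# Same final step as the text-based approach
m <- matrix(x, ncol = 2, byrow = TRUE)
result <- data.frame(starts = m[, 1], ends = m[, 2])
print(result)
```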
Timing it:
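The original timings are not preserved. One way to run such a comparison yourself is sketched below; the input size and the function names via_parse and via_docall are hypothetical, and no particular result is implied.

```r
# Hypothetical benchmark input: n ranges, each covering 10 integers
set.seed(42)
n <- 5000
start <- sample.int(1000, n, replace = TRUE)
end <- start + 9L

# Text-based approach: paste, parse, eval
via_parse <- function(start, end) {
  txt <- paste0("c(", paste(start, end, sep = ":", collapse = ","), ")")
  matrix(eval(parse(text = txt)), ncol = 2, byrow = TRUE)
}

# do.call-based approach: no text involved
via_docall <- function(start, end) {
  x <- do.call(c, lapply(seq_along(start), function(i) seq.int(start[i], end[i])))
  matrix(x, ncol = 2, byrow = TRUE)
}

t_parse <- system.time(m1 <- via_parse(start, end))
t_docall <- system.time(m2 <- via_docall(start, end))
print(t_parse)
print(t_docall)
```

For more stable numbers, repeat each call several times or use a dedicated benchmarking package, since a single system.time call on a small input is noisy.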