Optimizing an R function that adds a new column to a data.frame

Posted on 2024-08-15 20:00:19 · 859 words · 2 views · 0 comments

I have a function that is currently written in a loop-based, procedural style, and I want to speed it up and, ideally, solve the problem more in the spirit of R.
I have a data.frame and want to add a column in which every entry depends on two consecutive rows.
At the moment it looks like this:

faultFinging <- function(heartData){
    if(heartData$Pulse[[1]] == 0){
        Group <- 0
    }
    else{
        Group <- 1
    }
    for(i in seq(2, length(heartData$Pulse), 1)){
        if(heartData$Pulse[[i-1]] != 0 
            && heartData$Pulse[[i]] != 0
            && abs(heartData$Pulse[[i-1]] - heartData$Pulse[[i]])<20){
            Group[[i]] <- 1
        }
        else{
            if(heartData$Pulse[[i-1]] == 0 && heartData$Pulse[[i]] != 0){
                Group[[i]] <- 1
            }
            else{
                Group[[i]] <- 0
            }
        }
    }
    Pulse<-heartData$Pulse
    Time<-heartData$Time
    return(data.frame(Time,Pulse,Group))
}

Comments (2)

江湖彼岸 2024-08-22 20:00:19

I can't test this without sample data, but this is the general idea. You can avoid the for() loop entirely by using & and |, which are the vectorized versions of && and ||. Also, there's no need for an if-else statement when the result is simply the TRUE or FALSE value of the condition itself.

faultFinging <- function(heartData){
    Group <- as.numeric(c(heartData$Pulse[1] != 0,
      (heartData$Pulse[-nrow(heartData)] != 0 
        & heartData$Pulse[-1] != 0
        & abs(heartData$Pulse[-nrow(heartData)] - heartData$Pulse[-1])<20) |
      (heartData$Pulse[-nrow(heartData)] == 0 & heartData$Pulse[-1] != 0)))
    return(cbind(heartData, Group))
}

Wrapping the resulting logical vector in as.numeric() converts TRUE to 1 and FALSE to 0.
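One way to gain some confidence in the vectorized version is to compare it with the loop on made-up data. The sketch below assumes a Pulse column as in the question; the sample values and the helper names groupLoop and groupVectorized are hypothetical and only serve the comparison.

```r
# Made-up pulse readings (hypothetical); the column name Pulse follows the question.
heartData <- data.frame(Time = 1:10,
                        Pulse = c(1, 0, 1, 1, 4, 25, 2, 0, 25, 0))

# Loop-based rule from the question, condensed (assumes at least two rows).
groupLoop <- function(pulse) {
  group <- numeric(length(pulse))
  group[1] <- as.numeric(pulse[1] != 0)
  for (i in 2:length(pulse)) {
    prev <- pulse[i - 1]
    cur <- pulse[i]
    near <- prev != 0 && cur != 0 && abs(prev - cur) < 20
    restart <- prev == 0 && cur != 0
    group[i] <- as.numeric(near || restart)
  }
  group
}

# Vectorized rule: pair the vector without its last element ("previous")
# with the vector without its first element ("current").
groupVectorized <- function(pulse) {
  prev <- pulse[-length(pulse)]
  cur <- pulse[-1]
  as.numeric(c(pulse[1] != 0,
               (prev != 0 & cur != 0 & abs(prev - cur) < 20) |
                 (prev == 0 & cur != 0)))
}

loopResult <- groupLoop(heartData$Pulse)
vecResult <- groupVectorized(heartData$Pulse)
identical(loopResult, vecResult)
```

If the two results ever differ on real data, missing values (NA) in Pulse would be a first thing to check, since neither version handles them explicitly.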

缪败 2024-08-22 20:00:19

This can be done in a more vectorized way by separating your program into two parts. First, a function that takes two time samples and determines whether they meet your pulse specification:

isPulse <- function(previous, current)
{ 
  (previous != 0 & current !=0 & (abs(previous-current) < 20)) |
  (previous == 0 & current !=0)
}

Note the use of the vectorized | and & instead of the scalar || and &&.
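To illustrate the difference: & and | compare element by element and return a vector, whereas && and || are meant for a single TRUE/FALSE value, as in an if() condition. A minimal sketch with made-up vectors:

```r
# Made-up values, only for illustration.
previous <- c(0, 5, 0)
current  <- c(3, 0, 0)

# Element-wise form: one result per pair of elements.
elementwise <- (previous == 0) & (current != 0)
elementwise

# Scalar form: both operands are single values.
single <- (previous[1] == 0) && (current[1] != 0)
single
```

Older R versions silently used only the first element when && or || was given a longer vector, and R 4.3.0 and later signal an error instead, so the element-wise operators are the ones to use inside a vectorized function like isPulse.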

Then invoke it with the two vectors 'previous' and 'current', offset from each other by a suitable delay (1 in your case):

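delay <- 1
samples = length(heartData$pulse)

isPulse(heartData$pulse[-(samples - (1:delay) + 1)], heartData$pulse[-(1:delay)])

Let's try this on some made-up data (note that it uses a lowercase pulse column, whereas the question's data uses Pulse):

sampleData = c(1,0,1,1,4,25,2,0,25,0)
heartData = data.frame(pulse=sampleData)
samples = length(heartData$pulse)
result = isPulse(heartData$pulse[-(samples - (1:delay) + 1)], heartData$pulse[-(1:delay)])

Note that heartData$pulse[-(samples - (1:delay) + 1)] trims delay samples from the end, for the previous stream, and heartData$pulse[-(1:delay)] trims delay samples from the start, for the current stream. The + 1 is needed because samples - (1:delay) on its own starts one position before the last sample, so with a delay of 1 it would drop the second-to-last element rather than the last.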
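The same trimming can also be spelled with head() and tail(), where a negative n drops elements from the end or from the start. This is a sketch of an alternative notation, not the answer's own code, and the vector x is made up.

```r
# Made-up samples and a delay of one step.
x <- c(1, 0, 1, 1, 4, 25, 2, 0, 25, 0)
delay <- 1

previous <- head(x, -delay)  # everything except the last `delay` elements
current  <- tail(x, -delay)  # everything except the first `delay` elements

# Both streams have the same length, so they can be compared element-wise.
length(previous) == length(current)
```

This form avoids index arithmetic altogether, which makes an off-by-one slip less likely when the delay changes.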

Doing it manually, the results should be (using F for false and T for true)

F,T,T,T,F,F,F,T,F

and running the code confirms it:

> print(result)
FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

Success!

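Since you want to bind these back as a column into your original dataset, note that the new vector is delay elements shorter than your original data, so you need to pad it at the start with delay FALSE elements. You may also want to convert it to 0/1 to match your data. Be aware that padding with FALSE always gives 0 for the first row, whereas the function in the question gives 1 there when the first pulse is non-zero; if you need that behaviour, pad with heartData$pulse[1:delay] != 0 instead.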

resultPadded <- c(rep(FALSE,delay), result)
heartData$result = ifelse(resultPadded, 1, 0)

which gives

> heartData
   pulse result
1      1      0
2      0      0
3      1      1
4      1      1
5      4      1
6     25      0
7      2      0
8      0      0
9     25      1
10     0      0
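
The steps of this answer could be wrapped in a single function. This is only a sketch under stated assumptions: the name addPulseFlag and its arguments are hypothetical, the data frame is assumed to have a numeric pulse column, and the first delay rows are padded with 0 as in the answer.

```r
# isPulse as defined in the answer.
isPulse <- function(previous, current) {
  (previous != 0 & current != 0 & (abs(previous - current) < 20)) |
    (previous == 0 & current != 0)
}

# Hypothetical wrapper: adds a 0/1 column named `result` to a data frame
# with a numeric `pulse` column that has more than `delay` rows.
addPulseFlag <- function(df, delay = 1) {
  n <- nrow(df)
  stopifnot(delay >= 1, n > delay)
  previous <- df$pulse[seq_len(n - delay)]  # drop the last `delay` samples
  current  <- df$pulse[-seq_len(delay)]     # drop the first `delay` samples
  flag <- c(rep(FALSE, delay), isPulse(previous, current))
  df$result <- as.numeric(flag)
  df
}

heartData <- data.frame(pulse = c(1, 0, 1, 1, 4, 25, 2, 0, 25, 0))
flagged <- addPulseFlag(heartData)
flagged$result
```

Keeping isPulse separate from the wrapper means the pulse rule can be changed or tested on its own without touching the trimming and padding logic.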