Data frame and replacement problem in a loop in R

Posted on 2024-12-02 02:44:15

I'm using R on a dataset containing trips. Each line is a trip (from A to B). On each line, I know the identity of the individual (a number), the purpose of the trip (1,2,3 or 4), the time category (1,2 or 3) and a number identifying the tour in which the trip was done (a tour is a group of trips; all these trips go from A to A).

I would like to create a new column: for the same individual, the purpose of the previous trip in the same time category but in a different tour. This variable is called "prevDistanceSameTimeCategoryDifferentTour".

I have this error:

Error in $<-.data.frame(*tmp*, "prevDistanceSameTimeCategoryDifferentTour", :
  replacement has 2 rows, data has 1167

Here is my code:

prevPersonTimeCategory <- array(-999, dim=c(3,3))
prevPersonTimeCategory[1,1] <- TgData$PersonID[1]
prevPersonTimeCategory[2,1] <- TgData$PersonID[1]
prevPersonTimeCategory[3,1] <- TgData$PersonID[1]
for(i in 2:nrow(TgData)) {
    if (TgData$timeCategory[i] == 1) {
        if (TgData$tour[i] == prevPersonTimeCategory[1,3]) {
            if (prevPersonTimeCategory[1,1] == TgData$PersonID[i]) {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[1,2]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[1,1] <- TgData$PersonID[i]
                }   
            }
        else {
            if (prevPersonTimeCategory[1,1] == TgData$PersonID[i]) {
                prevPersonTimeCategory[1,3] <- TgData$tour[i]
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[1,2]
                prevPersonTimeCategory[1,2] <- TgData$purpose[i]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[1,1] <- TgData$PersonID[i]
                prevPersonTimeCategory[1,2] <- -999
                }
            }
        }
    else if (TgData$timeCategory[i] == 2) {
        if (TgData$tour[i] == prevPersonTimeCategory[2,3]) {
            if (prevPersonTimeCategory[2,1] == TgData$PersonID[i]) {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[2,2]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[2,1] <- TgData$PersonID[i]
                }   
            }
        else {
            if (prevPersonTimeCategory[2,1] == TgData$PersonID[i]) {
                print(i)
                prevPersonTimeCategory[2,3] <- TgData$tour[i]
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[2,2]
                prevPersonTimeCategory[2,2] <- TgData$purpose[i]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[2,1] <- TgData$PersonID[i]
                prevPersonTimeCategory[2,2] <- -999
                }
            }
        }
    else if (TgData$timeCategory[i] == 3) {
        if (TgData$tour[i] == prevPersonTimeCategory[3,3]) {
            if (prevPersonTimeCategory[3,1] == TgData$PersonID[i]) {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[3,2]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[3,1] <- TgData$PersonID[i]
                }   
            }
        else {
            if (prevPersonTimeCategory[3,1] == TgData$PersonID[i]) {
                prevPersonTimeCategory[3,3] <- TgData$tour[i]
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[3,2]
                prevPersonTimeCategory[3,2] <- TgData$purpose[i]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[3,1] <- TgData$PersonID[i]
                prevPersonTimeCategory[3,2] <- -999
                }
            }
        }
    else {
        TgData$prevPurposeSameTimeCategoryDifferentTour[i] = -999
        }
    }

I'm creating an array to store information for each time category. In this array, the first value is the identity of the individual (prevPersonTimeCategory[1,1], prevPersonTimeCategory[2,1], prevPersonTimeCategory[3,1], one for each time category), the second is the purpose (prevPersonTimeCategory[1,2], etc.), and the third is the tour number (prevPersonTimeCategory[1,3], etc.).
Then I'm just reading each line (for) and writing a few conditions (if).
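
The bookkeeping described above could be written once rather than three times by using the time category as the row index of the state array. The following is only a sketch under assumptions: the toy data frame `trips`, the helper name `prev_purpose()` and the shorter output column `prevPurpose` are made up for illustration, and the output column is created before the loop so that element-wise assignment has a full-length column to write into. It mirrors the branching of the loop above and does not try to judge whether that branching gives the intended result.

```r
# Hypothetical helper: same branching as the loop above, with the time
# category used as the row index of the state array.
prev_purpose <- function(d) {
  # Columns of state: 1 = person, 2 = stored purpose, 3 = stored tour.
  state <- array(-999, dim = c(3, 3))
  state[, 1] <- d$PersonID[1]
  d$prevPurpose <- NA  # create the full-length column before the loop

  for (i in 2:nrow(d)) {
    tc <- d$timeCategory[i]
    if (!(tc %in% 1:3)) {
      d$prevPurpose[i] <- -999  # unknown time category
      next
    }

    same_person <- state[tc, 1] == d$PersonID[i]
    same_tour <- state[tc, 3] == d$tour[i]

    if (same_person) {
      d$prevPurpose[i] <- state[tc, 2]
      if (!same_tour) {
        # New tour: remember it together with this trip's purpose.
        state[tc, 3] <- d$tour[i]
        state[tc, 2] <- d$purpose[i]
      }
    } else {
      # New person: nothing to report, store the new person.
      d$prevPurpose[i] <- -999
      state[tc, 1] <- d$PersonID[i]
      if (!same_tour) state[tc, 2] <- -999
    }
  }
  d
}

# Toy data: the first seven rows of the sample shown in this question.
trips <- data.frame(
  PersonID     = c(1, 1, 1, 1, 1, 1, 1),
  purpose      = c(1, 4, 4, 4, 3, 4, 4),
  tour         = c(1, 2, 2, 3, 4, 5, 5),
  timeCategory = c(2, 3, 3, 3, 3, 3, 2)
)
out <- prev_purpose(trips)
print(out$prevPurpose)
```

Row 1 stays `NA` because the loop starts at row 2, as in the original code.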

I really don't see where I'm making a mistake.

My dataset contains 36'784 lines, but I'm testing on 1932 lines (-1 line for headers). The data looks like this:

PersonID    purpose tour    timeCategory
1   1   1   2
1   4   2   3
1   4   2   3
1   4   3   3
1   3   4   3
1   4   5   3
1   4   5   2
1   4   5   3
1   3   5   3
1   4   6   2
1   4   6   2
1   4   6   3
1   3   7   3
1   4   8   3
1   4   9   3
1   4   10  3
1   4   10  3
1   4   11  1
1   4   12  1
1   4   13  1
1   4   14  1
1   4   16  1
1   1   17  2
1   4   18  3
1   4   19  2
1   3   20  3
1   4   20  3
1   4   21  3
1   1   22  2
1   3   22  3
1   3   23  3
1   4   24  3
1   4   25  3
1   4   25  3
1   4   26  3
1   1   27  2
1   3   27  3
1   4   28  3
1   3   28  3
1   4   29  3
1   4   29  3
1   1   30  2
1   4   31  3
1   1   31  2
1   4   32  3
1   3   32  3
1   4   33  3
1   3   34  3
1   4   35  3
1   1   36  2
1   3   36  3
1   4   37  3
1   3   38  3
1   4   39  3
1   3   39  3
1   4   39  3
1   4   40  3
1   4   40  2
1   4   40  3
1   3   41  3
1   4   42  3
1   4   43  3
1   1   44  2
1   3   45  3
1   4   46  3
1   3   47  3
1   3   47  3
1   4   48  2
1   1   49  2
1   4   50  3
1   1   51  2
1   1   51  2
1   2   51  3
1   3   52  3
1   3   53  1
1   4   54  1
1   4   55  1
1   4   55  1
1   4   55  1
1   1   56  3
1   4   57  3
1   4   58  3
1   1   59  2
1   3   59  3
1   4   60  3
1   4   61  3
1   1   62  3
1   3   63  3
1   4   64  3
1   3   65  3
1   4   66  3
1   3   67  3
1   2   68  1
2   3   69  3
2   1   70  3
2   4   71  2
2   1   72  3
2   3   72  3
2   1   72  2

If I run this short version of my code, I have no problems:

prevPersonTimeCategory <- array(-999, dim=c(3,3))
prevPersonTimeCategory[1,1] <- TgData$PersonID[1]
prevPersonTimeCategory[2,1] <- TgData$PersonID[1]
prevPersonTimeCategory[3,1] <- TgData$PersonID[1]
for(i in 2:nrow(TgData)) {
    if (TgData$timeCategory[i] == 1) {
        if (TgData$tour[i] == prevPersonTimeCategory[1,3]) {
            if (prevPersonTimeCategory[1,1] == TgData$PersonID[i]) {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[1,2]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[1,1] <- TgData$PersonID[i]
                }   
            }
        }
    }

But if I add a few more lines, as here:

prevPersonTimeCategory <- array(-999, dim=c(3,3))
prevPersonTimeCategory[1,1] <- TgData$PersonID[1]
prevPersonTimeCategory[2,1] <- TgData$PersonID[1]
prevPersonTimeCategory[3,1] <- TgData$PersonID[1]
for(i in 2:nrow(TgData)) {
    if (TgData$timeCategory[i] == 1) {
        if (TgData$tour[i] == prevPersonTimeCategory[1,3]) {
            if (prevPersonTimeCategory[1,1] == TgData$PersonID[i]) {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[1,2]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[1,1] <- TgData$PersonID[i]
                }   
            }
        else {
            if (prevPersonTimeCategory[1,1] == TgData$PersonID[i]) {
                prevPersonTimeCategory[1,3] <- TgData$tour[i]
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- prevPersonTimeCategory[1,2]
                prevPersonTimeCategory[1,2] <- TgData$purpose[i]
                }
            else {
                TgData$prevPurposeSameTimeCategoryDifferentTour[i] <- -999
                prevPersonTimeCategory[1,1] <- TgData$PersonID[i]
                prevPersonTimeCategory[1,2] <- -999
                }
            }
        }
    }

The error comes back:

Error in $<-.data.frame(*tmp*, "prevPurposeSameTimeCategoryDifferentTour", :
  replacement has 18 rows, data has 1150
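
A likely explanation, sketched under the assumption that the column does not exist before the loop starts: `TgData$prevPurposeSameTimeCategoryDifferentTour` is then `NULL`, and assigning to element `i` of `NULL` creates a vector of length `i`, which cannot be stored in a data frame that has more rows. That would account for the counts in both messages: 2 is the first loop index, and row 18 is the first row with `timeCategory == 1` in the sample data. It would also explain why the short version runs cleanly, since its only assignments sit behind a tour comparison against -999 that the sample data never satisfies. The data frame `df` and the column `newCol` below are invented for illustration.

```r
# Assigning to element i of NULL yields a vector of length i, padded with NA.
x <- NULL
x[18] <- -999
length(x)  # 18

# Hypothetical 5-row data frame without the target column.
df <- data.frame(id = 1:5)

# df$newCol is NULL here, so this tries to store a length-2 vector in 5 rows.
msg <- tryCatch({
  df$newCol[2] <- -999
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```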

Comments (1)

真心难拥有 2024-12-09 02:44:15

Creating a new empty column, as joran suggested, works.

Run this before you start the loop:

TgData$prevPurposeSameTimeCategoryDifferentTour <- NA
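
As a minimal sketch of why this helps (the data frame `df` and the column `newCol` are invented for illustration): once the column exists with one value per row, assigning to a single element changes only that element and the column keeps its full length.

```r
# Hypothetical 5-row data frame standing in for TgData.
df <- data.frame(id = 1:5)

# Create the column up front; NA is recycled to one value per row.
df$newCol <- NA

# Element-wise assignment inside a loop now works.
for (i in 2:nrow(df)) {
  df$newCol[i] <- -999
}

print(df$newCol)
```

Row 1 stays `NA` here because the loop, like the one in the question, starts at 2. If -999 is meant to mark "no previous trip" everywhere, initialising the column with -999 instead of `NA` would be one option.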
