R: Adding columns from one data frame to another data frame with a mismatched number of rows

Posted on 2025-01-30 16:05:49

I have a .txt file with millions of rows of data - DateTime (1-min intervals) and Precipitation.

I have a .csv file with thousands of rows of data - DateTime (daily intervals), MaxTemp, MinTemp, WindSpd, WindDir.

I import the .txt file as a data frame and do a few transformations. I then move this into a new data frame.

I import the .csv file as a data frame and do a few transformations. I then want to add the columns from this data frame to the new data frame (for a total of 7 columns). However, R throws an error: "Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 10382384, 32868, 1"

I know the number of rows is different; however, this is the format I need for the next step in processing. This could easily be done in Excel were it not for the crazy number of rows.

Simulated code is below, which produces the same error:

a <- as.character(c(1,2,3,4,5,6,7,8,9,10))
b <- c(paste("Date", a))
c <- c(rnorm(10, mean = 5, sd = 2.1))
Frame1 <- data.frame(b,c)

d <- as.character(c(1,2,3))
e <- c(paste("Date", d))
f <- c(rnorm(3, mean = 1, sd = 0.7))
g <- c(rnorm(3, mean = 3, sd = 2))
h <- c(rnorm(3, mean = 8, sd = 1))
Frame2 <- data.frame(e,f,g,h)

NewFrame <- cbind(Frame1)

NewFrame <- cbind(NewFrame, Frame2)

I have tried a *_join, but it throws the error: "Error: `by` must be supplied when `x` and `y` have no common variables. ℹ use `by = character()` to perform a cross-join." To me, this reads like it wants to match things up, which I don't need. I really just need to plop these two datasets side by side for the next processing step. Help?


Comments (1)

无法回应 2025-02-06 16:05:49

The data frames MUST have an equal number of rows. To compensate, I added rows to the smaller dataset so that it contains the same number of rows as the larger dataset (in my case, the smaller one will always be the .csv file) and filled them with NA values. The application I use for downstream processing knows how to handle NA values, so this works well for me.

I've run the solution with a representative dataset and I am able to cbind the two data frames together.

Sample code with the simulated dataset:

#create data frame 1
a <- as.character(c(1:10))
b <- c(paste("Date", a))
c <- c(rnorm(10, mean = 5, sd = 2.1))
Frame1 <- data.frame(b,c)

#create data frame 2
d <- as.character(c(1,2,3))
e <- c(paste("Date", d))
f <- c(rnorm(3, mean = 1, sd = 0.7))
g <- c(rnorm(3, mean = 3, sd = 2))
h <- c(rnorm(3, mean = 8, sd = 1))
Frame2 <- data.frame(e,f,g,h)

#find the maximum number of rows
maxlen <- max(nrow(Frame1), nrow(Frame2))

#add enough rows to the smaller dataset (Frame2 here) to equal the number
#of rows in the larger dataset. Assigning to row index maxlen extends the
#data frame, and R fills every added row with NA values
if (nrow(Frame2) < maxlen) Frame2[maxlen, ] <- NA

#create the new data frame from the two frames
#(the original snippet used NewFrame here before it was defined)
NewFrame <- cbind(Frame1, Frame2)
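
The snippet above assumes Frame2 is always the shorter frame. If that is not guaranteed, the same padding idea can be wrapped in a small helper that pads whichever frame is shorter. This is only a sketch: `pad_rows` and `cbind_padded` are hypothetical names made up for illustration, not functions from base R or any package. It relies on the base R behaviour that indexing a data frame with row numbers past its last row returns rows of NA.

```r
# Hypothetical helper: pad a data frame with NA rows until it has n rows.
pad_rows <- function(df, n) {
  if (nrow(df) >= n) return(df)          # already long enough, nothing to do
  out <- df[seq_len(n), , drop = FALSE]  # out-of-range row indices yield NA rows
  rownames(out) <- NULL                  # drop the "NA", "NA.1", ... row names
  out
}

# Hypothetical helper: cbind two data frames after padding the shorter one.
cbind_padded <- function(x, y) {
  n <- max(nrow(x), nrow(y))
  cbind(pad_rows(x, n), pad_rows(y, n))
}

# Simulated data in the same shape as the question
Frame1 <- data.frame(b = paste("Date", 1:10),
                     c = rnorm(10, mean = 5, sd = 2.1))
Frame2 <- data.frame(e = paste("Date", 1:3),
                     f = rnorm(3, mean = 1, sd = 0.7))

NewFrame <- cbind_padded(Frame1, Frame2)
dim(NewFrame)  # 10 rows, 4 columns
```

Rows 4 to 10 of the columns that came from Frame2 hold NA, so this only suits downstream tools that tolerate NA values, as in the case described above.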