Paste two data frames together element by element

Posted on 2024-11-16 08:35:51

I need to paste, element by element, the contents of two data frames for input to another program. I have a data frame of means and a data frame of standard errors of the mean.

I tried the R paste() function, but it doesn't seem to cope with data frames. With vectors as input, it seems to concatenate all the elements of the first vector into one string and all the elements of the second into a separate string. Instead, I need each corresponding element of the two data frames to be concatenated together.

Any suggestions for how to approach this? I've included dummy input data (datMean and datSE) and my desired output (datNew). My real data frames are about 10 rows by 150 columns in size.

# means and SEM
datMean <- data.frame(a=rnorm(10, 3), b=rnorm(10, 3), d=rnorm(10, 3))
datSE <- data.frame(a=rnorm(10, 3)/100, b=rnorm(10, 3)/100, d=rnorm(10, 3)/100)

# what the output should look like
# i've chosen some arbitrary values here, and show only the first row. 
datNew <- data.frame(a="2.889-2.926", b="1.342-1.389", d="2.569-2.576")

The idea is for each element in datNew to be a range consisting of 'mean - se' and 'mean + se', separated by a dash '-'. The paste() function can do this for one element, but how can I do it over the whole data frame?

paste(datMean[1,1] - datSE[1,1], datMean[1,1] + datSE[1,1], sep="-")
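
For reference, here is a minimal sketch of how paste() treats the two kinds of input, using small made-up values (x, y, df1 and df2 are illustrative names, not my real data). On plain numeric vectors paste() works element by element; on data frames, each whole column is turned into a single string, which may be the collapsing described above.

```r
x <- c(3, 2)        # made-up means
y <- c(0.5, 0.25)   # made-up standard errors

# Plain vectors: paste() is vectorised, one result per element
vec_result <- paste(x - y, x + y, sep = "-")
print(vec_result)

# Data frames: paste() converts each column to one string,
# so the result has one entry per column, not one per cell
df1 <- data.frame(a = x, b = x)
df2 <- data.frame(a = y, b = y)
df_result <- paste(df1, df2, sep = "-")
print(length(df_result))
```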

EDIT 1:
Looking at some of the answers I realize I left out an important bit of information in the question. Each row of the original data frames is named, and I need to reconstitute the final data frame with these names. For example:

rownames(datMean) <- LETTERS[1:10]
rownames(datSE) <- LETTERS[1:10]

I need datNew to eventually have these 10 rownames again. This could be problematic with some of the solutions using melt().

风尘浪孓 2024-11-23 08:35:51

If you convert to matrices first, you can do it with no applies or loops at all.

MdatMean <- as.matrix(datMean)
MdatSE <- as.matrix(datSE)
matrix( paste(MdatMean - MdatSE, MdatMean + MdatSE, sep="-"), 
        nrow=nrow(MdatMean), dimnames=dimnames(MdatMean) )

You also might consider formatC for better formatting.

lo <- formatC(MdatMean - MdatSE, format="f", digits=3)
hi <- formatC(MdatMean + MdatSE, format="f", digits=3)
matrix( paste(lo, hi, sep="-"), 
        nrow=nrow(MdatMean), dimnames=dimnames(MdatMean) )

If you want a data.frame in the end, just wrap the last line in as.data.frame().
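
Putting those steps together, a small helper might look like the sketch below. The function name paste_ranges and the tiny example data are assumptions made for illustration; the body only combines the as.matrix, formatC, paste and as.data.frame calls shown above.

```r
# Hypothetical helper: builds "lo-hi" strings for every cell and
# returns a data.frame that keeps the row and column names.
paste_ranges <- function(means, ses, digits = 3) {
  m <- as.matrix(means)
  s <- as.matrix(ses)
  stopifnot(identical(dim(m), dim(s)))  # both inputs must have the same shape
  lo <- formatC(m - s, format = "f", digits = digits)
  hi <- formatC(m + s, format = "f", digits = digits)
  out <- matrix(paste(lo, hi, sep = "-"),
                nrow = nrow(m), dimnames = dimnames(m))
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Made-up example data with named rows
datMean <- data.frame(a = c(3, 2), b = c(1, 4), row.names = c("A", "B"))
datSE   <- data.frame(a = c(0.5, 0.25), b = c(0.5, 0.25), row.names = c("A", "B"))
datNew  <- paste_ranges(datMean, datSE)
print(datNew)
```

Because the dimnames are carried over from the matrix, the row names asked for in EDIT 1 survive the conversion back to a data.frame.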

愁以何悠 2024-11-23 08:35:51

Here is a way to do this without manually specifying each column. First we make the data and put them in an array using the abind package, rounding to 3 because that looks better:

datMean <- data.frame(a=rnorm(10, 3), b=rnorm(10, 3), d=rnorm(10, 3))
datSE <- data.frame(a=rnorm(10, 3)/100, b=rnorm(10, 3)/100, d=rnorm(10, 3)/100)

library(abind)

datArray <- round(abind(datMean,datSE,along=3),3)

Then we can apply the paste function to each cell (every row and column position) of this array:

apply(datArray,1:2,function(x)paste(x[1]-x[2],"-",x[1]+x[2]))

      a               b               d              
 [1,] "3.537 - 3.581" "3.358 - 3.436" "3.282 - 3.312"
 [2,] "2.452 - 2.516" "1.372 - 1.44"  "3.041 - 3.127"
 [3,] "3.017 - 3.101" "3.14 - 3.228"  "5.238 - 5.258"
 [4,] "3.397 - 3.451" "2.783 - 2.839" "3.381 - 3.405"
 [5,] "1.918 - 1.988" "2.978 - 3.02"  "3.44 - 3.504" 
 [6,] "4.01 - 4.078"  "3.014 - 3.068" "1.914 - 1.954"
 [7,] "3.475 - 3.517" "2.117 - 2.159" "1.871 - 1.929"
 [8,] "2.551 - 2.619" "3.907 - 3.975" "1.588 - 1.614"
 [9,] "1.707 - 1.765" "2.63 - 2.678"  "1.316 - 1.348"
[10,] "4.051 - 4.103" "3.532 - 3.628" "3.235 - 3.287"
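
The same idea can be sketched in base R if abind is not installed. Here array() stands in for abind(), and sprintf() replaces round() so that trailing zeros are kept (note "1.44" in the output above). The example data and the "mean"/"se" labels are made up for illustration.

```r
# Made-up example data with named rows
datMean <- data.frame(a = c(3.0, 2.5), b = c(1.4, 3.2), row.names = c("A", "B"))
datSE   <- data.frame(a = c(0.02, 0.03), b = c(0.04, 0.05), row.names = c("A", "B"))

# Stack the two tables along a third dimension: [row, column, mean/se]
datArray <- array(
  c(as.matrix(datMean), as.matrix(datSE)),
  dim = c(dim(datMean), 2),
  dimnames = c(dimnames(as.matrix(datMean)), list(c("mean", "se")))
)

# For each [row, column] cell, x holds c(mean, se)
res <- apply(datArray, 1:2, function(x) {
  sprintf("%.3f - %.3f", x[1] - x[2], x[1] + x[2])
})
print(res)
```

The result is a character matrix that keeps the row and column names of datMean.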

Bonjour°[大白 2024-11-23 08:35:51

Here's how I understand your problem. I melted the data for means and SE from multiple columns to one column using reshape2::melt.

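library(reshape2)
# note: these two lines overwrite datMean and datSE with plain numeric vectors
datMean <- melt(datMean)$value
datSE <- melt(datSE)$value
dat <- cbind(datMean, datSE)  # two-column matrix: mean, SE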

apply(X = dat, MARGIN = 1, FUN = function(x) {
            paste(x[1] - x[2], x[1] + x[2], sep = " - ")
        })

And the result

 [1] "3.03886802467251 - 3.08551547263516" 
 [2] "3.01803172559258 - 3.05247871975711" 
 [3] "3.4609230722069 - 3.56097173966387"  
 [4] "1.35368243309618 - 1.45548512578821" 
 [5] "2.39936853846605 - 2.47570756724791" 
 [6] "3.21849170272184 - 3.29653660329785"
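
The flat vector loses the table shape and the row names. One possible way to get them back is sketched below, under the assumption that the values were stacked column by column; unlist() is used here as a base R stand-in for melt(...)$value, and the example data are made up.

```r
# Made-up example data with named rows
datMean <- data.frame(a = c(3, 2), b = c(1, 4), row.names = c("A", "B"))
datSE   <- data.frame(a = c(0.5, 0.25), b = c(0.5, 0.25), row.names = c("A", "B"))

# Flatten column by column (same order as stacking the columns)
meanVec <- unlist(datMean, use.names = FALSE)
seVec   <- unlist(datSE, use.names = FALSE)

ranges <- paste(meanVec - seVec, meanVec + seVec, sep = " - ")

# matrix() also fills column by column, so the cells land back in place
datNew <- as.data.frame(
  matrix(ranges, nrow = nrow(datMean),
         dimnames = list(rownames(datMean), names(datMean))),
  stringsAsFactors = FALSE
)
print(datNew)
```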

EDIT

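This solution respects your original data dimensions. I build a 3D array and process one cell at a time, reading both values along the third dimension ([x, y, 1:2]) for that cell.

# unlist() lets this work whether datMean and datSE are still data frames
# or were already flattened to vectors by melt() above
dat <- array(unlist(c(datMean, datSE), use.names = FALSE), dim = c(10, 3, 2))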

datNEW <- matrix(rep(NA, nrow(dat)*ncol(dat)), ncol = ncol(dat))

for (column in seq(ncol(dat))) {
    cls <- rep(NA, nrow(dat))
    for (rows in seq(nrow(dat))) {
        tmp <- dat[rows, column, 1:2]
        cls[rows] <- paste(tmp[1] - tmp[2], tmp[1] + tmp[2], sep = " - ")
    }
    datNEW[, column] <- cls
}

昇り龍 2024-11-23 08:35:51

paste() can handle every row at once, but it has to be applied to paired columns from the two data.frames. Since you have a specific paste job to do each time, define the function:

pfun <- function(x, y) paste(x - y, x + y, sep = "-")

and then construct the new data.frame with the function:

datNew <- data.frame(a = pfun(datMean$a, datSE$a),
                     b = pfun(datMean$b, datSE$b),
                     d = pfun(datMean$d, datSE$d))

There would be terser ways to apply this, but perhaps that helps you understand better. You can pass whole columns to paste, but not whole data.frames.

Use a loop to match all columns in the result without specifying them individually.

First create a list to store all the columns; we will convert it to a data.frame with the right column names afterwards.

datNew <- vector("list", ncol(datMean))

The naming does assume that column number, names and order are an exact match between the two input data.frames.

names(datNew) <- names(datMean)

for (i in seq_len(ncol(datMean))) {
    datNew[[i]] <- pfun(datMean[[i]], datSE[[i]])
}

Convert to data.frame:

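# keep the original row names (see EDIT 1 in the question)
datNew <- as.data.frame(datNew, row.names = rownames(datMean), stringsAsFactors = FALSE)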
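
One of the terser ways hinted at above could use Map(), which walks over the paired columns of the two data.frames. This is a sketch on made-up example data, and it carries the same assumption that column names and order match in both inputs.

```r
pfun <- function(x, y) paste(x - y, x + y, sep = "-")

# Made-up example data with named rows
datMean <- data.frame(a = c(3, 2), b = c(1, 4), row.names = c("A", "B"))
datSE   <- data.frame(a = c(0.5, 0.25), b = c(0.5, 0.25), row.names = c("A", "B"))

# Map() calls pfun on column 1 of both inputs, then column 2, and so on;
# the resulting list takes its names from datMean
datNew <- as.data.frame(Map(pfun, datMean, datSE),
                        row.names = rownames(datMean),
                        stringsAsFactors = FALSE)
print(datNew)
```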

泼猴你往哪里跑 2024-11-23 08:35:51

Using mapply to paste and cbind to keep rownames:

x <- cbind(
  datMean[, 0],
  mapply(paste, round(datMean - datSE, 3), round(datMean + datSE, 3), sep = " - "))

x
#               a             b             d
# A 3.268 - 3.321 5.226 - 5.308   2.3 - 2.358
# B 3.795 - 3.874 1.772 - 1.833 2.265 - 2.335
# C 1.305 - 1.346 1.238 - 1.291 2.812 - 2.874
# D 1.957 - 2.041 3.016 - 3.057 2.402 - 2.473
# E  4.73 - 4.786 2.909 - 2.963 2.245 - 2.297
# F 3.511 - 3.554 3.547 - 3.603 2.316 - 2.374
# G 3.601 - 3.689 3.073 - 3.144 3.145 - 3.215
# H 2.056 - 2.118  2.597 - 2.69  2.58 - 2.627
# I 1.802 - 1.835 2.794 - 2.895   2.452 - 2.5
# J 2.399 - 2.461 1.807 - 1.844 3.199 - 3.254

class(x)
# [1] "data.frame"
identical(rownames(datMean), rownames(x))
# [1] TRUE
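
To see why the row names survive, it may help to look at the datMean[, 0] piece on its own. The sketch below uses made-up data; skeleton and ranges are illustrative names for the two intermediate pieces.

```r
# Made-up example data with named rows
datMean <- data.frame(a = c(3, 2), b = c(1, 4), row.names = c("A", "B"))
datSE   <- data.frame(a = c(0.5, 0.25), b = c(0.5, 0.25), row.names = c("A", "B"))

# Selecting zero columns leaves an empty data.frame that still has its row names
skeleton <- datMean[, 0]
print(dim(skeleton))
print(rownames(skeleton))

# mapply() pastes column by column and returns a character matrix
ranges <- mapply(paste, round(datMean - datSE, 3), round(datMean + datSE, 3),
                 sep = " - ")

# cbind() onto the skeleton gives a data.frame with the original row names
x <- cbind(skeleton, ranges)
print(x)
```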
