R: How to save this special file as a CSV file?
My input file starts out as an ordinary CSV table.
x <- read.table(textConnection(
+ ' models cores time
+ aa c1 xxx|yyy
+ aa c2 xxx|zzz
+ aa c3 www
+ aa c4 xxx|vvv
+ bb c1 vvv|www
+ bb c2 www|qqq
+ bb c3 xxx|uuu
+ bb c4 uuu' ), header=TRUE)
Every column in it is a factor, as shown below:
> str(x)
'data.frame': 8 obs. of 3 variables:
$ models: Factor w/ 2 levels "aa","bb": 1 1 1 1 2 2 2 2
$ cores : Factor w/ 4 levels "c1","c2","c3",..: 1 2 3 4 1 2 3 4
$ time : Factor w/ 8 levels "uuu","vvv|www",..: 7 8 3 6 2 4 5 1
To split the last column with "strsplit", I followed these steps, based on previously posted questions.
> write.csv(x, file="x.csv")
> y <- read.csv(file="x.csv",header=TRUE,stringsAsFactors=FALSE)
> str(y)
'data.frame': 8 obs. of 4 variables:
$ X : int 1 2 3 4 5 6 7 8
$ models: chr "aa" "aa" "aa" "aa" ...
$ cores : chr "c1" "c2" "c3" "c4" ...
$ time : chr "xxx|yyy" "xxx|zzz" "www" "xxx|vvv" ...
Warning messages:
1: closing unused connection 4 (" models cores time \naa c1 xxx|yyy \naa c2 xxx|zzz \naa c3 www \naa c4 xxx|vvv \nbb c1 vvv|www \nbb c2 www|qqq \nbb c3 xxx|uuu \nbb c4 uuu")
2: closing unused connection 3 (" models cores time \n4 1 0.000365 \n4 2 0.000259 \n4 3 0.000239 \n4 4 0.000220 \n8 1 0.000259 \n8 2 0.000249 \n8 3 0.000251 \n8 4 0.000258")
> df2 <- as.data.frame(
+ t(
+ do.call(cbind,
+ lapply(1:nrow(y),function(x){
+ sapply(unlist(strsplit(y[x,4],"\\|")),c,y[x,2:3],USE.NAMES=FALSE)
+ }) ) ) )
> str(df2)
The result is what I needed.
> df2
V1 models cores
1 xxx aa c1
2 yyy aa c1
3 xxx aa c2
4 zzz aa c2
5 www aa c3
6 xxx aa c4
7 vvv aa c4
8 vvv bb c1
9 www bb c1
10 www bb c2
11 qqq bb c2
12 xxx bb c3
13 uuu bb c3
14 uuu bb c4
When I run str(df2), I find that every column is a list of chr:
'data.frame': 14 obs. of 3 variables:
$ V1 :List of 14
..$ : chr "xxx"...
$ models:List of 14
..$ : chr "aa"
..$ : chr "aa"
$ models:List of 14
..$ : chr "aa"
..$ : chr "aa"
However, I am having difficulty saving this final result as a CSV table again.
> write.csv(df2, file="df2.csv")
Error in write.table(x, file, nrow(x), p, rnames, sep, eol, na, dec, as.integer(quote), :
unimplemented type 'list' in 'EncodeElement'
How can I save df2 in CSV format again? Please help.
Comments (2)
What you are doing seems epically silly (why write something out to CSV just to read it back in again?), but given that df2 is roughly how you want it, you need to unlist() the three components in df2 and cast the result back to a data frame. That data frame can then be written out to CSV and read back in again.
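The code that originally accompanied this step is not reproduced here, so the following is only a minimal sketch of the unlist-and-rebuild idea. It assumes df2 is built as in the question (here from x read with stringsAsFactors = FALSE, which stands in for the CSV round trip); the names out, back and tmp are illustrative and not taken from the answer.

```r
# Rebuild the question's data; stringsAsFactors = FALSE replaces the
# write.csv()/read.csv() round trip used in the question (assumption).
x <- read.table(text = "models cores time
aa c1 xxx|yyy
aa c2 xxx|zzz
aa c3 www
aa c4 xxx|vvv
bb c1 vvv|www
bb c2 www|qqq
bb c3 xxx|uuu
bb c4 uuu", header = TRUE, stringsAsFactors = FALSE)

# Same construction as in the question, with column indices shifted because
# x has no leading row-number column: every column of df2 ends up as a list.
df2 <- as.data.frame(t(do.call(cbind, lapply(seq_len(nrow(x)), function(i) {
  sapply(unlist(strsplit(x[i, 3], "\\|")), c, x[i, 1:2], USE.NAMES = FALSE)
}))))

# Flatten each list column to an atomic vector and rebuild the data frame.
out <- as.data.frame(lapply(df2, unlist), stringsAsFactors = FALSE)

# write.csv() now works; a temporary file keeps the sketch side-effect free.
tmp <- tempfile(fileext = ".csv")
write.csv(out, file = tmp, row.names = FALSE)
back <- read.csv(tmp, stringsAsFactors = FALSE)
```

Passing row.names = FALSE avoids the extra X column that appeared in y in the question.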
Update:
It would be simpler to go straight from x to the desired output instead of writing it out to CSV, reading it back in again, and then processing y. For example, x can be turned directly into the same result as out, the unlisted data frame from above.
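The code for this update did not survive either. One way to go directly from x to the long format, offered as a sketch rather than as the answer's original code, is to convert the factor to character, split it, and repeat the other columns once per piece. The names parts, n and out_direct are assumptions made for illustration.

```r
# x as in the question, with every column stored as a factor.
x <- read.table(text = "models cores time
aa c1 xxx|yyy
aa c2 xxx|zzz
aa c3 www
aa c4 xxx|vvv
bb c1 vvv|www
bb c2 www|qqq
bb c3 xxx|uuu
bb c4 uuu", header = TRUE, stringsAsFactors = TRUE)

# strsplit() needs character input, so convert the factor first.
parts <- strsplit(as.character(x$time), "|", fixed = TRUE)
n <- lengths(parts)  # number of pieces in each original row

# Repeat models/cores once per piece; all columns are plain character vectors.
out_direct <- data.frame(V1     = unlist(parts),
                         models = rep(as.character(x$models), n),
                         cores  = rep(as.character(x$cores), n),
                         stringsAsFactors = FALSE)

# No list columns, so write.csv() succeeds without further conversion.
write.csv(out_direct, file = file.path(tempdir(), "df2.csv"), row.names = FALSE)
```

Because no list columns are ever created, the 'EncodeElement' error from the question does not arise.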
Aside:
As an aside, you don't make it easy for us to paste in your code, as you just copied it from the R console, so it includes the prompts (> and +). Instead you could have run dput(x) and pasted its output into your question; then we could all have simply assigned that output to x. The same goes for the call that creates df2: posting it without the prompts would have been preferable. That way it is a simple matter for us to reconstruct what objects you have and what you tried.
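To illustrate the dput() suggestion, here is a small sketch using a two-row subset of the question's data (the subset and the names small and rebuilt are assumptions made for brevity). dput() prints R code that recreates the object, so anyone can paste it after an assignment arrow.

```r
# A two-row subset of the question's data, used only for illustration.
small <- data.frame(models = c("aa", "aa"),
                    cores  = c("c1", "c2"),
                    time   = c("xxx|yyy", "xxx|zzz"),
                    stringsAsFactors = FALSE)

# Prints a structure(...) call that can be pasted into a question.
dput(small)

# Evaluating the printed text recreates the object, which is what a reader
# does when pasting it after `x <-`.
rebuilt <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
```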
EDIT - example data