Writing multiple custom files with d_ply
This question is almost the same as a previous question, but differs enough that the answers to that question don't work here. Like @chase in the last question, I want to write out one file for each split of a dataframe, in the following format (custom FASTA).
#same df as last question
df <- data.frame(
var1 = sample(1:10, 6, replace = TRUE)
, var2 = sample(LETTERS[1:2], 6, replace = TRUE)
, theday = c(1,1,2,2,3,3)
)
#how I want the data to look
write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file="test.txt")
#whole df output looks like this:
#test.txt
>1_A
1
>8_A
1
>4_A
2
>9_A
2
>2_A
3
>1_A
3
However, instead of getting the output from the entire dataframe, I want to generate individual files for each subset of data. Using d_ply as follows:
d_ply(df, .(theday), function(x) write(paste(">", df$var1,"_", df$var2, "\n", df$theday, sep=""), file=paste(x$theday,".fasta",sep="")))
I get the following output error:
Error in file(file, ifelse(append, "a", "w")) :
invalid 'description' argument
In addition: Warning messages:
1: In if (file == "") file <- stdout() else if (substring(file, 1L, :
the condition has length > 1 and only the first element will be used
2: In if (substring(file, 1L, 1L) == "|") { :
the condition has length > 1 and only the first element will be used
Any suggestions on how to get around this?
Thanks,
zachcp
Comments (1)
There were two problems with your code.
First, in constructing the file name, you passed the vector x$theday to paste(). Since x$theday is taken from a column of a data.frame, it often has more than one element, so paste() returns more than one file name. The error you saw was write() complaining when you passed several file names to its file= argument. Using unique(x$theday) instead ensures that you only ever paste together a single file name.
Second, you didn't get far enough to see it, but you probably want to write the contents of x (the current subset of the data.frame), rather than the entire contents of df, to each file.
Here is the corrected code, which appears to work just fine.
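A minimal sketch of the two fixes applied to the call from the question, assuming the plyr package is installed. It is reconstructed from the description above and is not necessarily the answer's exact code. The set.seed(1) call is an addition here, only to make the sample data reproducible.

```r
library(plyr)  # provides d_ply() and .()

set.seed(1)  # assumption: added only so the sample data is reproducible
df <- data.frame(
  var1 = sample(1:10, 6, replace = TRUE),
  var2 = sample(LETTERS[1:2], 6, replace = TRUE),
  theday = c(1, 1, 2, 2, 3, 3)
)

# Why the original call failed: within one group x$theday has two elements,
# so paste(x$theday, ".fasta", sep = "") yields two file names, and write()
# accepts only one.

# Write one FASTA-style file per value of theday.
d_ply(df, .(theday), function(x) {
  write(
    # Fix 2: use the current subset x, not the whole df
    paste(">", x$var1, "_", x$var2, "\n", x$theday, sep = ""),
    # Fix 1: unique() collapses the group's theday values to a single name
    file = paste(unique(x$theday), ".fasta", sep = "")
  )
})
```

Run in an empty working directory, this should leave three files, 1.fasta, 2.fasta and 3.fasta, each holding only the records for its own day.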