Loop over files, apply a function, and combine the results into a data frame in R

Posted on 2025-01-14 23:41:39

I have a directory of sorted BAM files that I want to run the pileup function on. The output of pileup is a data frame. I would then like to combine the results from all files into a single data frame.

For each file, I use the following code:

r16<-pileup(filename, index=filename, scanBamParam = ScanBamParam(), pileupParam = PileupParam())
r16$sample_id <- "sample id"

For the sample_id column, I would like the value to be derived from the file name. For example:

if the file name is file1.sorted.bam, I would like sample_id to be file1.

After all files are processed, I would use rbind to get one big data frame and save it to an RData file.

So far I have tried looping over the files, but it is not giving me any output.

library(pasillaBamSubset)
library(Rsamtools)
filenames<-Sys.glob("*.sorted.bam")
for (file in filenames) {
  output <- pileup(pileup(filenames, index=filenames, scanBamParam = ScanBamParam(), pileupParam = PileupParam()))
  save(output, file = "res.RData")
}

贪恋 2025-01-21 23:41:39

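I am assuming that you want to stack all the data frames on top of each other (row bind). map (from purrr) or lapply applies a function to each item of a given list or vector (each file name in this case). map_dfr does the same and then row binds all the outputs into one data frame.

library(Rsamtools)  # provides pileup(), ScanBamParam(), PileupParam()
library(purrr)

# list.files() takes a regular expression, not a shell glob,
# so escape the dots and anchor the end of the name
filenames <- list.files(pattern = "\\.sorted\\.bam$")

# Run pileup() on each file and row bind the resulting data frames
purrr::map_dfr(filenames, ~pileup(.x,
                                  index = .x,
                                  scanBamParam = ScanBamParam(),
                                  pileupParam = PileupParam()))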
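
To also cover the sample_id column and the save step from the question, here is a minimal sketch using base R's lapply. It assumes each BAM file has its index next to it (for example file1.sorted.bam.bai). The helper names sample_id_from and pileup_one are my own, not part of Rsamtools. Note that the loop in the question passes the whole filenames vector instead of file, wraps pileup in a second pileup call, and overwrites res.RData on every iteration.

```r
library(Rsamtools)

# Hypothetical helper: "path/to/file1.sorted.bam" -> "file1"
sample_id_from <- function(path) {
  sub("\\.sorted\\.bam$", "", basename(path))
}

# Hypothetical helper: run pileup() on one file and tag every row with its sample ID
pileup_one <- function(path) {
  res <- pileup(path,
                index = path,
                scanBamParam = ScanBamParam(),
                pileupParam = PileupParam())
  # rep() keeps the assignment valid even if pileup() returns zero rows
  res$sample_id <- rep(sample_id_from(path), nrow(res))
  res
}

# list.files() expects a regular expression, so escape the dots
filenames <- list.files(pattern = "\\.sorted\\.bam$")

# One data frame per file, then row bind them all at once
all_pileups <- do.call(rbind, lapply(filenames, pileup_one))

# Save once, after the loop, so earlier results are not overwritten
save(all_pileups, file = "res.RData")
```

Calling load("res.RData") in a later session should restore the combined data frame under the name all_pileups.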
