How do I select specific files based on a spreadsheet and copy them from one directory to another in R?

Posted 2025-02-12 07:56:39 · 720 characters · 1 view · 0 comments


I have a task that requires me to use a specific column in a CSV spreadsheet that stores the file names, for example:

File Name
CA-001
WV-001
ma-001

My task is to move some files from folder 'source' to folder 'target'.

I'm using this CSV spreadsheet as a crosswalk to select the files whose names match the values in the column 'File Name'. The source folder contains not only these files but also others that are not in the list (e.g. CO-001, SC-001, ...). All of the files are PDFs, so file type is not a concern. I want R to copy only the files whose names match what's in the CSV spreadsheet. How can I do this?

I have some sample code below, but it does not run successfully.

source <- "C:/Users/53038/MovePDF/Test_From"
target <- "C:/Users/53038/MovePDF/Test_To"

all.files <- list.files(path = source)

csvfile <- read.csv('C:/Users/53038/MovePDF/Master.csv')

toCopy <- all.files[all.files %in% csvfile$Move]

file.copy(toCopy, target)

Thank you!


走野 2025-02-19 07:56:39


With the code below, the names you want to match are in csvfile$File.Name (read.csv turns the header "File Name" into File.Name by default).
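As a quick check of where the column ends up, here is a minimal sketch that reads an in-memory stand-in for the CSV. The `csv_text` string is an assumption that mirrors the example in the question, not the real Master.csv.

```r
# Hypothetical CSV content mirroring the example in the question.
csv_text <- "File Name\nCA-001\nWV-001\nma-001"

# Default behaviour: check.names = TRUE rewrites "File Name" as "File.Name".
default_names <- read.csv(text = csv_text)
names(default_names)

# With check.names = FALSE the header is kept verbatim,
# so the column has to be accessed with [[ ]] or backticks.
raw_names <- read.csv(text = csv_text, check.names = FALSE)
raw_names[["File Name"]]
```

This also suggests why `csvfile$Move` in the question selects nothing: if the CSV has no column called Move, `$` returns NULL and `%in%` matches no files.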

I'm assuming the source directory may be very large. Rather than running slow regular expressions to match substrings (when we already know the exact file names) or fetching a complete file listing (which is also slow), I only check whether the exact file names exist before copying them:

source <- "C:/Users/53038/MovePDF/Test_From"
target <- "C:/Users/53038/MovePDF/Test_To"

csvfile <- read.csv('C:/Users/53038/MovePDF/Master.csv')
    
# add .pdf suffix
toCopy <- paste0(csvfile$File.Name,'.pdf')

# add source directory path
toCopy <- file.path(source,  toCopy)

# optional: keep only the files in toCopy that actually exist. You can skip
# this step if you're sure they exist; file.copy() returns FALSE for any
# file it could not copy
toCopy <- toCopy[file.exists(toCopy)]

# copy into the target directory, replacing files that are already there
file.copy(toCopy, target, overwrite = TRUE)
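To try the approach without touching real folders, the following sketch runs the same steps against throwaway directories. The directory names, the dummy PDF files and the in-memory CSV are all assumptions made for illustration. It also keeps track of which names from the CSV were not found, which may be useful when reconciling the list.

```r
# Throwaway directories standing in for Test_From and Test_To (hypothetical).
src_dir <- tempfile("Test_From_")
dst_dir <- tempfile("Test_To_")
dir.create(src_dir)
dir.create(dst_dir)

# The source folder holds two listed files plus one that is not in the CSV.
file.create(file.path(src_dir, c("CA-001.pdf", "WV-001.pdf", "CO-001.pdf")))

# Stand-in for Master.csv; "ma-001" is listed but has no file on disk.
csvfile <- read.csv(text = "File Name\nCA-001\nWV-001\nma-001")

# Build full source paths, then check which of them exist.
wanted  <- file.path(src_dir, paste0(csvfile$File.Name, ".pdf"))
present <- file.exists(wanted)

# file.copy() returns one logical per file: TRUE where the copy succeeded.
copied <- file.copy(wanted[present], dst_dir, overwrite = TRUE)

# Names listed in the CSV that were not found in the source folder.
missing_names <- csvfile$File.Name[!present]

list.files(dst_dir)
```

Note that full paths are passed to file.copy(). Passing bare file names, as in the question, only works if the working directory happens to be the source folder.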

I would prefer to keep the .pdf extension in the file names at all times, including in the source CSV. Otherwise there will be a problem on case-sensitive file systems (almost all Linux installations, rarely macOS or Windows) if the extension is .PDF, .Pdf, etc.
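If the CSV cannot be changed, one possible workaround is to compare names case-insensitively. The sketch below is an assumption rather than part of the approach above: `match_files` is a hypothetical helper, and it does take a full directory listing, so it trades the speed advantage described earlier for tolerance of mixed-case names and extensions.

```r
# Hypothetical helper: find files in `dir` whose base name (without extension)
# matches one of `wanted_names`, ignoring case in both name and extension.
match_files <- function(wanted_names, dir) {
  on_disk <- list.files(dir, pattern = "\\.pdf$", ignore.case = TRUE)
  stems   <- tolower(tools::file_path_sans_ext(on_disk))
  hits    <- on_disk[stems %in% tolower(wanted_names)]
  file.path(dir, hits)
}

# Demo on a throwaway directory with mixed-case names and extensions.
demo_dir <- tempfile("Case_Demo_")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("CA-001.PDF", "MA-001.Pdf", "CO-001.pdf")))

matched <- match_files(c("CA-001", "WV-001", "ma-001"), demo_dir)
basename(matched)
```

The resulting paths can be passed to file.copy() in the same way as toCopy.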
