How do I select specific files based on a spreadsheet column and copy them from one directory to another in R?
I have a task that requires me to use a specific column in a CSV spreadsheet that stores the file names, for example:
File Name |
---|
CA-001 |
WV-001 |
ma-001 |
My task is to move some files from folder 'source' to folder 'target'.
I'm using this CSV spreadsheet as a crosswalk to select the files whose names match the values in the column 'File Name'. The source folder contains not only these files but also others that are not in the list (e.g., CO-001, SC-001...), and I want R to copy only the listed ones. If it's helpful, all of the files are PDFs, so file type is not a concern. How can I do this?
I have some sample code below, but it does not run successfully.
source <- "C:/Users/53038/MovePDF/Test_From"
target <- "C:/Users/53038/MovePDF/Test_To"
all.files <- list.files(path = source)
csvfile <- read.csv('C:/Users/53038/MovePDF/Master.csv')
toCopy <- all.files[all.files %in% csvfile$Move]
file.copy(toCopy, target)
Thank you!
Comments (1)
With the provided code, the names you want to match will be in csvfile$File.Name (by default, read.csv converts the header "File Name" to File.Name). I'm assuming the source directory is potentially very large. Instead of running slow regular expressions to match substrings (when we already know the exact file names), or getting a complete file listing (which is also slow), I would only check whether the exact wanted files exist before copying them.
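The approach described above could be sketched roughly as follows. This is not the answerer's original code, only a minimal illustration: the helper name copy_listed_files is hypothetical, and it assumes the CSV column may hold names without the .pdf extension, in which case the extension is appended.

```r
# Sketch only: copy the files named in a CSV column from one folder to another.
# copy_listed_files is a hypothetical helper, not part of base R.
copy_listed_files <- function(source_dir, target_dir, file_names) {
  # Coerce to character (older R may read the column as a factor) and tidy up.
  file_names <- trimws(as.character(file_names))
  file_names <- file_names[!is.na(file_names) & nzchar(file_names)]

  # Assumption: names stored without an extension refer to ".pdf" files.
  has_ext <- grepl("\\.pdf$", file_names, ignore.case = TRUE)
  file_names[!has_ext] <- paste0(file_names[!has_ext], ".pdf")

  # Check only the exact paths we want instead of listing the whole folder.
  from <- file.path(source_dir, file_names)
  found <- file.exists(from)
  if (any(!found)) {
    warning("Not found in source: ",
            paste(file_names[!found], collapse = ", "))
  }

  # Create the target folder if needed, then copy without overwriting.
  dir.create(target_dir, showWarnings = FALSE, recursive = TRUE)
  copied <- file.copy(from[found], target_dir, overwrite = FALSE)

  # Return one TRUE/FALSE per attempted copy, named by file.
  invisible(stats::setNames(copied, file_names[found]))
}

# Possible usage with the paths from the question (not run here):
# csvfile <- read.csv("C:/Users/53038/MovePDF/Master.csv")
# copy_listed_files("C:/Users/53038/MovePDF/Test_From",
#                   "C:/Users/53038/MovePDF/Test_To",
#                   csvfile$File.Name)
```

Note that the code in the question passes bare file names to file.copy, which are resolved against the working directory rather than the source folder; building the full path with file.path avoids that.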
I would prefer to keep the .pdf extension in the file name at all times, and therefore also in the source CSV. On case-sensitive file systems (almost all Linux installations, rarely macOS or Windows), there would be an issue if the extension is .PDF, .Pdf, etc.
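If mixed-case extensions cannot be ruled out, one possible workaround is to match on the name without its extension. The sketch below is an assumption, not part of the original answer: the helper name is hypothetical, and it does list the directory, so it trades the speed advantage discussed above for tolerance of names such as CA-001.PDF.

```r
# Hypothetical helper: find the files for the wanted base names regardless of
# how the ".pdf" extension is capitalised on disk.
match_pdf_ignoring_ext_case <- function(source_dir, base_names) {
  # List only PDF files, accepting .pdf, .PDF, .Pdf, and so on.
  on_disk <- list.files(source_dir, pattern = "\\.pdf$", ignore.case = TRUE)
  # Compare the part before the extension with the names from the CSV.
  stems <- tools::file_path_sans_ext(on_disk)
  file.path(source_dir, on_disk[stems %in% base_names])
}
```

The returned full paths can then be passed to file.copy together with the target directory.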