
Using Python to copy files into the directories specified in a file list

Posted on 2024-08-01 18:10:00

I have a bunch of files in a single directory that I would like to organize in sub-directories.

This directory structure (which file would go in which directory) is specified in a file list that looks like this:

Directory: Music\

-> 01-some_song1.mp3

-> 02-some_song2.mp3

-> 03-some_song3.mp3

Directory: Images\

-> 01-some_image1.jpg

-> 02-some_image2.jpg

......................

I was thinking of extracting the data (directory name and file names) and storing it in a dictionary that would look like this:

dictionary = {'Music': ('01-some_song1.mp3', '02-some_song2.mp3',
                        '03-some_song3.mp3'),
              'Images': ('01-some_image1.jpg', '02-some_image2.jpg'),
              # ......................................................
}

After that I would copy/move the files in their respective directories.

I already extracted the directory names and created the empty dirs.

For the dictionary values I tried to get a list of lists by doing the following:

def get_values(file):
    values = []
    tmp = []
    pattern = re.compile(r'^-> (.+?)$')
    for line in file:
        if line.strip().startswith('->'):
            match = re.search(pattern, line.strip())
            if match:
                tmp.append(match.group(1))
        elif line.strip().startswith('Directory'):
            values.append(tmp)
            del tmp[:]
    return values

This doesn't seem to work. Each list in the values list contains the same 4 file names over and over again.

What am I doing wrong?

I would also like to know what other ways there are of doing this whole thing. I'm sure there's a better/simpler/cleaner way.

神回复 2024-08-08 18:10:00

I think the cause is that you are always reusing the same list.

del tmp[:] clears the list in place and doesn't create a new instance, so every entry appended to values is the same list object. In your case, you need to create a new list by assigning tmp = [].
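The aliasing described above can be reproduced in isolation. The following is a minimal sketch, not the asker's code; the names `shared`, `fresh`, and `tmp` are made up for the illustration.

```python
# Case 1: clearing in place. Every entry in `shared` is the same list object.
shared = []
tmp = []
for name in ["a.mp3", "b.jpg"]:
    tmp.append(name)
    shared.append(tmp)
    del tmp[:]          # empties the one list that `shared` already points to

# Case 2: rebinding. Each entry in `fresh` is its own list.
fresh = []
tmp = []
for name in ["a.mp3", "b.jpg"]:
    tmp.append(name)
    fresh.append(tmp)
    tmp = []            # new list object; the appended one is left untouched

print(shared)                   # [[], []]
print(fresh)                    # [['a.mp3'], ['b.jpg']]
print(shared[0] is shared[1])   # True
```

In the question's function, the names found after the final clear end up in that one shared list, which would explain why every entry shows the same file names.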

The following fix should work (I didn't test it):

def get_values(file):
    values = []
    tmp = []
    pattern = re.compile(r'^-> (.+?)$')
    for line in file:
        if line.strip().startswith('->'):
            match = re.search(pattern, line.strip())
            if match:
                tmp.append(match.group(1))
        elif line.strip().startswith('Directory'):
            values.append(tmp)
            tmp = []
    return values
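Reading the fix above closely, two details may still be worth checking: a list is appended every time a Directory line is seen, so the first entry is empty, and the names that follow the last Directory line are never appended. Below is a minimal sketch that starts a new list at each Directory line instead. It assumes the file format shown in the question; `group_files` is a hypothetical name, and it accepts any iterable of lines so it can be tried without a file.

```python
import re

def group_files(lines):
    """Return one list of file names per 'Directory' line, in file order."""
    pattern = re.compile(r'^-> (.+?)$')
    values = []
    for line in lines:
        line = line.strip()
        if line.startswith('Directory'):
            # Start a fresh list for this directory.
            values.append([])
        elif line.startswith('->') and values:
            match = pattern.search(line)
            if match:
                # Append to the list of the most recent directory.
                values[-1].append(match.group(1))
    return values

sample = [
    "Directory: Music\\",
    "-> 01-some_song1.mp3",
    "-> 02-some_song2.mp3",
    "Directory: Images\\",
    "-> 01-some_image1.jpg",
]
print(group_files(sample))
# [['01-some_song1.mp3', '02-some_song2.mp3'], ['01-some_image1.jpg']]
```

Because each group is appended when its Directory line is read, nothing depends on a trailing Directory line to flush the last group.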

揽清风入怀 2024-08-08 18:10:00

There is no need to use a regular expression:

d = {}
with open("file") as f:
    for line in f:
        line = line.strip()
        if line.endswith("\\"):
            # "Directory: Music\" -> "Music"
            directory = line.split(":")[-1].strip().replace("\\", "")
            d.setdefault(directory, [])
        if line.startswith("->"):
            # "-> 01-some_song1.mp3" -> "01-some_song1.mp3"
            song = line.split(" ")[-1]
            d[directory].append(song)
print(d)

output

# python python.py
{'Images': ['01-some_image1.jpg', '02-some_image2.jpg'], 'Music': ['01-some_song1.mp3', '02-some_song2.mp3', '03-some_song3.mp3']}
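Given a dictionary like the one printed above, the move step mentioned in the question could be sketched as follows. This is only an illustration under assumptions: `move_files`, `mapping`, and `base_dir` are hypothetical names, the files are assumed to sit directly in `base_dir`, and names that are missing on disk are silently skipped.

```python
import os
import shutil

def move_files(mapping, base_dir):
    """Move each file listed in `mapping` from base_dir into base_dir/<directory>."""
    moved = []
    for directory, names in mapping.items():
        target_dir = os.path.join(base_dir, directory)
        os.makedirs(target_dir, exist_ok=True)   # fine if the directory already exists
        for name in names:
            source = os.path.join(base_dir, name)
            if not os.path.isfile(source):
                continue   # skip names in the list that are not on disk
            shutil.move(source, os.path.join(target_dir, name))
            moved.append(os.path.join(directory, name))
    return moved
```

To copy instead of move, `shutil.copy2` can be used in place of `shutil.move` with the same two arguments.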

孤者何惧 2024-08-08 18:10:00

If you use collections.defaultdict(list), you get a dictionary whose values are lists. If a key is not found, it is added with an empty list as its value, so you can start appending to the list immediately. That's what this line does:

d[dir].append(match.group(1))

It creates the directory name as a key if it does not exist and appends the file name found to the list.

BTW, if you are having problems getting your regexes to work, try creating them with the debug flag (re.DEBUG, numeric value 128). So if you do this:

file_regex = re.compile(r'^-> (.+?)$', 128)

You get this additional output:

at at_beginning
literal 45
literal 62
literal 32
subpattern 1
  min_repeat 1 65535
    any None
at at_end

And you can see that there is a start-of-line match plus '-> ' (for 45 62 32), then a repeated any pattern and an end-of-line match. Very useful for debugging.

Code:

import re
import collections

def get_values(file):
    d = collections.defaultdict(list)
    dir = ""
    dir_regex = re.compile(r'^Directory: (.+?)\\$')
    file_regex = re.compile(r'\-\> (.+?)$')
    with open(file) as f:
        for line in f:
            line = line.strip()
            match = dir_regex.search(line)
            if match:
                # Remember the current directory for the file lines that follow.
                dir = match.group(1)
            else:
                match = file_regex.search(line)
                if match:
                    d[dir].append(match.group(1))
    return d

if __name__ == '__main__':
    d = get_values('test_file')
    # sorted() keeps the output order the same across Python versions.
    for k, v in sorted(d.items()):
        print(k, v)

Result:

Images ['01-some_image1.jpg', '02-some_image2.jpg']
Music ['01-some_song1.mp3', '02-some_song2.mp3', '03-some_song3.mp3']
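The defaultdict behaviour described in this answer can be compared side by side with the setdefault call used in the previous answer. The following is a small sketch on in-memory lines rather than a file; `lines`, `by_default`, `by_setdefault`, and `current` are names invented for the example.

```python
import collections

lines = [
    "Directory: Music\\",
    "-> 01-some_song1.mp3",
    "Directory: Images\\",
    "-> 01-some_image1.jpg",
    "-> 02-some_image2.jpg",
]

by_default = collections.defaultdict(list)
by_setdefault = {}
current = ""
for line in lines:
    if line.startswith("Directory: "):
        # Strip the prefix and the trailing backslash to get the bare name.
        current = line[len("Directory: "):].rstrip("\\")
    elif line.startswith("-> "):
        name = line[len("-> "):]
        by_default[current].append(name)                    # missing key -> new []
        by_setdefault.setdefault(current, []).append(name)  # same effect on a plain dict

print(dict(by_default) == by_setdefault)   # True
```

Both approaches build the same mapping; defaultdict just moves the "create an empty list if missing" step into the container.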
