Python beginner - how to read the contents of several files into unique lists?
I'd like to read the contents from several files into unique lists that I can call later - ultimately, I want to convert these lists to sets and perform intersections and subtraction on them. This must be an incredibly naive question, but after poring over the iterators and loops sections of Lutz's "Learning Python," I can't seem to wrap my head around how to approach this. Here's what I've written:
    #!/usr/bin/env python

    import sys

    OutFileName = 'test.txt'
    OutFile = open(OutFileName, 'w')

    FileList = sys.argv[1:]
    Len = len(FileList)
    print Len

    for i in range(Len):
        sys.stderr.write("Processing file %s\n" % (i))
        FileNum = i

    for InFileName in FileList:
        InFile = open(InFileName, 'r')
        PathwayList = InFile.readlines()
        print PathwayList
        InFile.close()
With a couple of simple test files, I get output like this:
    Processing file 0
    Processing file 1
    ['alg1\n', 'alg2\n', 'alg3\n', 'alg4\n', 'alg5\n', 'alg6']
    ['csr1\n', 'csr2\n', 'csr3\n', 'csr4\n', 'csr5\n', 'csr6\n', 'csr7\n', 'alg2\n', 'alg6']
These lists are correct, but how do I assign each one to a unique variable so that I can call them later (for example, by including the index # from range in the variable name)?
Thanks so much for pointing a complete programming beginner in the right direction!
Comments (6)
Assuming you read in two files, the following will do a line-by-line comparison (it won't pick up any extra lines in the longer file, but then the files wouldn't be the same anyway if one had more lines than the other ;)
For what you're wanting to do, you might want to take a look at the difflib module in Python. For sorting, look at Mutable Sequence Types:
    someListVar.sort()
will sort the contents of someListVar in place.
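A line-by-line comparison like the one described above could be sketched as follows. This assumes Python 3; `compare_line_by_line` is a hypothetical helper name, not code from the answer.

```python
def compare_line_by_line(path_a, path_b):
    """Return (line_number, line_a, line_b) for each pair of lines that differ."""
    differences = []
    with open(path_a) as file_a, open(path_b) as file_b:
        # zip() stops at the end of the shorter file, so extra lines
        # in the longer file are not reported.
        pairs = zip(file_a, file_b)
        for number, (line_a, line_b) in enumerate(pairs, start=1):
            text_a = line_a.rstrip("\n")
            text_b = line_b.rstrip("\n")
            if text_a != text_b:
                differences.append((number, text_a, text_b))
    return differences
```

An empty result means the two files agree on every line they have in common.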
You could do it like this if you don't need to remember where the contents come from:
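One way this might look, as a minimal sketch in Python 3: each file's lines are appended to an outer list, so the lists are addressed by position instead of by separate variable names. The function name `read_files_into_lists` is an assumption made for illustration.

```python
import sys


def read_files_into_lists(filenames):
    """Return one list of lines per file, in the same order as filenames."""
    contents = []
    for filename in filenames:
        with open(filename) as in_file:
            # splitlines() drops the trailing newline from each line
            contents.append(in_file.read().splitlines())
    return contents


if __name__ == "__main__":
    # File names come from the command line, as in the question.
    for index, lines in enumerate(read_files_into_lists(sys.argv[1:])):
        print(index, lines)
```

The first file's lines are then `contents[0]`, the second file's `contents[1]`, and so on.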
Or, if you want to keep track of the file names, you could use a dictionary:
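A hedged sketch of the dictionary variant, again assuming Python 3 and a hypothetical function name `read_files_into_dict`: the file name becomes the key, so each list can be looked up by the file it came from.

```python
def read_files_into_dict(filenames):
    """Map each file name to the list of lines read from that file."""
    contents = {}
    for filename in filenames:
        with open(filename) as in_file:
            contents[filename] = in_file.read().splitlines()
    return contents
```

With this, `contents['alg.txt']` (a made-up file name) would hold the lines of that file.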
You might want to check out Python's fileinput module, which is a part of the standard library and allows you to process multiple files at once.
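As an illustration of how `fileinput` could be applied here, the sketch below reads all files in a single loop and groups the lines by source file. It assumes Python 3; `read_with_fileinput` is a name chosen for this example.

```python
import fileinput
from collections import defaultdict


def read_with_fileinput(filenames):
    """Group lines by the file they came from, reading all files in one loop."""
    if not filenames:
        # For an empty list, fileinput falls back to sys.argv[1:] or stdin;
        # return early so this sketch never waits on stdin.
        return {}
    lines_by_file = defaultdict(list)
    with fileinput.input(files=filenames) as stream:
        for line in stream:
            # stream.filename() names the file currently being read
            lines_by_file[stream.filename()].append(line.rstrip("\n"))
    return dict(lines_by_file)
```

Note that a file with no lines never shows up in the result, because the loop body does not run for it.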
Essentially, you have a list of files and you want to turn it into a list of the lines of those files...
Several ways:
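One of these ways could be a list comprehension. The sketch below assumes Python 3 and uses `pathlib` so each file is opened and closed in a single call; `lines_per_file` is a hypothetical name.

```python
from pathlib import Path


def lines_per_file(filenames):
    """Build a list of lists: one inner list of lines for every file."""
    return [Path(name).read_text().splitlines() for name in filenames]
```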
This would give you a result like [['alg1', 'alg2', 'alg3'], ['csr1', 'csr2'...]]. Accessing result[0] would then give ['alg1', 'alg2', 'alg3'].
A dictionary might be somewhat better:
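In comprehension form, the dictionary version might look like the following sketch (Python 3 assumed, `lines_by_filename` is an illustrative name):

```python
from pathlib import Path


def lines_by_filename(filenames):
    """Map every file name to the list of lines in that file."""
    return {name: Path(name).read_text().splitlines() for name in filenames}
```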
If you want to just concatenate, you would just need to chain it:
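Chaining could be done with `itertools.chain.from_iterable`, which flattens the per-file lists into one sequence. This is a sketch under the assumption of Python 3; `all_lines` is a made-up name.

```python
from itertools import chain
from pathlib import Path


def all_lines(filenames):
    """Concatenate the lines of all files into one flat list."""
    per_file = (Path(name).read_text().splitlines() for name in filenames)
    return list(chain.from_iterable(per_file))
```

The result keeps file order and line order, but no longer records which file a line came from.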
Not one-liners for a beginner... however, it would now be a good exercise to try to understand what's going on :)
You need to dynamically create the variable name for each file 'number' that you're reading. (I'm being deliberately vague; knowing how to build variables like this is quite valuable and more readily remembered if you discover it yourself.)
Something like this will give you a start:
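One possible starting point, offered as an assumption about what is meant: variable names can be built as strings and stored in a namespace dictionary such as the one returned by `globals()`. The helper name and the `pathway_list_` prefix below are hypothetical, and the sketch targets Python 3.

```python
import sys


def load_into_named_variables(filenames, namespace, prefix="pathway_list_"):
    """Store each file's lines in namespace under a name built from its index."""
    for index, filename in enumerate(filenames):
        with open(filename) as in_file:
            # The name is built at run time, e.g. "pathway_list_0"
            namespace[prefix + str(index)] = in_file.read().splitlines()


if __name__ == "__main__":
    # Passing globals() creates module-level variables pathway_list_0,
    # pathway_list_1, ... for the files named on the command line.
    load_into_named_variables(sys.argv[1:], globals())
```

Passing an ordinary dictionary instead of `globals()` keeps the generated names out of the module namespace, which is usually easier to maintain.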
You need a list which holds your PathwayList lists, that is, a list of lists.
One remark: it is quite uncommon to use capitalized variable names. There is no strict rule for that, but by convention most people only use capitalized names for classes.
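To connect the list of lists with the set operations mentioned in the question, a rough sketch follows. It assumes Python 3 and at least two input files, and uses lowercase names in line with the convention remark; both function names are invented for this example.

```python
def read_pathway_lists(filenames):
    """Return a list of lists: the stripped lines of each file."""
    pathway_lists = []
    for filename in filenames:
        with open(filename) as in_file:
            pathway_lists.append([line.strip() for line in in_file])
    return pathway_lists


def shared_and_unique(pathway_lists):
    """Compare the first two lists as sets.

    Returns (items in both lists, items only in the first list).
    """
    first = set(pathway_lists[0])
    second = set(pathway_lists[1])
    return first & second, first - second
```

With the two test files from the question, the shared items would be 'alg2' and 'alg6'.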