Python loop to read and parse all files in a directory
import os
import re

class __init__:
    path = "articles/"
    files = os.listdir(path)
    files.reverse()

    def iterate(Files, Path):
        def handleXml(content):
            months = ['', 'January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
            parse = re.compile('<(.*?)>(.*?)<(.*?)>').findall(content)
            day = parse[1][1]
            month = months[int(parse[2][1])]
            dayN = parse[3][1]
            year = parse[4][1]
            hour = parse[5][1]
            min = parse[6][1]
            amPM = parse[7][1]
            title = parse[9][1]
            author = parse[10][1]
            article = parse[11][1]
            category = parse[12][1]

        if len(Files) > 5:
            del Files[5:]
        for file in Files:
            file = "%s%s" % (Path, file)
            f = open(file, 'r')
            handleXml(f.read())
            f.close()

    iterate(files, path)
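It runs on startup, and if I inspect the files array it contains all the file names. But when I loop over them it does not work: only the first one is displayed. If I return file I only get the first two, and if I return parse the output is not identical even for duplicate files. None of this makes any sense.
I am trying to make a simple blog in Python, and because my server has a very old version of Python I cannot use modules like glob; everything needs to be as basic as possible.
The files array contains all the files in the directory, which is good enough for me. I do not need to descend into other directories inside the articles directory.
But when I try to output parse, I get different results even for duplicate files.
Thanks,
- Tom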
Comments (2)
Could it be because of del Files[5:]?
It removes every entry after the first five, and it does so in the original list as well, because Files and files refer to the same list object. Instead of using del, you can try looping over a slice such as Files[:5], which is a copy and leaves the original list untouched.
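The difference between truncating with del and taking a slice can be seen in a small sketch. The helper names newest_files and truncate_in_place, and the sample file names, are made up for illustration and do not come from the question. The code sticks to plain list operations, so it should also suit an old interpreter, although it has only been reasoned through for current Python.

```python
def newest_files(file_names, limit=5):
    """Return the first `limit` names without modifying `file_names`."""
    # Slicing builds a new list, so the caller's list keeps every entry.
    return file_names[:limit]


def truncate_in_place(file_names, limit=5):
    """Mimic `del Files[5:]`: the caller's list is shortened as well."""
    del file_names[limit:]
    return file_names


# Hypothetical directory listing, standing in for os.listdir("articles/").
all_files = ["a.xml", "b.xml", "c.xml", "d.xml", "e.xml", "f.xml", "g.xml"]

selected = newest_files(all_files)
print(len(selected))    # 5
print(len(all_files))   # 7, the original list is untouched

truncate_in_place(all_files)
print(len(all_files))   # 5, the original list has lost its tail
```

In the question's code this would mean replacing the del statement and the loop header with a single loop over Files[:5].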
As stated in the comments, the actual recursion is missing.
Even if it exists somewhere else in the code, the recursive call is a typical place for things to go wrong, so I would suggest double-checking it.
However, why not use os.walk? It walks the whole directory tree for you, with no need to reinvent the (recursive) wheel. It was only introduced in Python 2.3, though, and I do not know how old your Python is.
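As a rough sketch of the os.walk suggestion: since the question only needs the files directly inside articles/, the walk can stop after the first directory it reports. The function name list_top_level_files is an assumption made for this example. Sorting is added because the order in which the operating system lists a directory is arbitrary, which may matter if the newest articles are expected first.

```python
import os


def list_top_level_files(directory):
    """Return sorted paths of the files directly inside `directory`."""
    for root, dirs, files in os.walk(directory):
        # The first triple yielded by os.walk describes `directory` itself;
        # returning here means no subdirectory is ever visited.
        return [os.path.join(root, name) for name in sorted(files)]
    # os.walk yields nothing when the directory is missing or unreadable.
    return []


if __name__ == "__main__":
    # "articles/" is the path used in the question; take at most five files.
    for path in list_top_level_files("articles/")[:5]:
        print(path)
```

For a single flat directory, os.listdir as used in the question does the same job with less machinery, so this is mainly useful if subdirectories need to be handled later.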