Yield in a recursive function

Posted on 2024-11-25 04:16:03

I am trying to do something to all the files under a given path. I don't want to collect all the file names beforehand then do something with them, so I tried this:

import os
import stat

def explore(p):
  s = ''
  list = os.listdir(p)
  for a in list:
    path = p + '/' + a
    stat_info = os.lstat(path )
    if stat.S_ISDIR(stat_info.st_mode):
     explore(path)
    else:
      yield path

if __name__ == "__main__":
  for x in explore('.'):
    print '-->', x

But this code skips over directories when it hits them, instead of yielding their contents. What am I doing wrong?

Comments (9)

殊姿 2024-12-02 04:16:03

Generators do not recurse automatically like that. You have to re-yield each result, by replacing

explore(path)

with something like

for value in explore(path):
    yield value

Python 3.3 added the syntax yield from X, as proposed in PEP 380, to serve this purpose. With it you can do this instead:

yield from explore(path)

If you're using generators as coroutines, this syntax also supports the use of generator.send() to pass values back into the recursively-invoked generators. The simple for loop above would not.
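To make the difference concrete, here is a minimal sketch in Python 3.3+ syntax. The names averager, delegating and looping are hypothetical, invented only for this illustration. It contrasts delegation through yield from with a plain for loop when values are pushed in with send().

```python
def averager():
    """Coroutine: receives numbers via send() and yields the running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

def delegating():
    # yield from forwards values passed to send() on to the inner generator.
    yield from averager()

def looping():
    # A for loop only pulls values with next(); whatever is sent to this
    # outer generator is dropped, so the inner one receives None instead.
    for item in averager():
        yield item

gen = delegating()
next(gen)              # prime the coroutine
print(gen.send(10))    # 10.0
print(gen.send(20))    # 15.0

broken = looping()
next(broken)
try:
    broken.send(10)
except TypeError as exc:
    # The inner generator tried to add None to its total.
    print("for-loop version failed:", exc)
```

For a directory walker that only produces values, both forms behave the same; the distinction matters only once the caller sends values back in.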

偏爱你一生 2024-12-02 04:16:03

The problem is this line of code:

explore(path)

What does it do?

  • calls explore with the new path
  • explore runs, creating a generator
  • the generator is returned to the spot where explore(path) was executed . . .
  • and is discarded

Why is it discarded? It wasn't assigned to anything, it wasn't iterated over -- it was completely ignored.
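A small sketch of that behaviour, written for Python 3 and using a hypothetical generator function numbers and a hypothetical calls list that exist only for this demonstration: calling a generator function does not run its body, so a bare call has no visible effect.

```python
calls = []

def numbers():
    calls.append("started")   # runs only once iteration begins
    yield 1
    yield 2

numbers()                     # generator object is created, then discarded
print(calls)                  # [] because the body never ran

gen = numbers()               # still nothing has run
print(type(gen).__name__)     # generator
print(list(gen))              # [1, 2]
print(calls)                  # ['started']
```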

If you want to do something with the results, well, you have to do something with them! ;)

The easiest way to fix your code is:

for name in explore(path):
    yield name

When you are confident you understand what's going on, you'll probably want to use os.walk() instead.

Once you have migrated to Python 3.3 (assuming all works out as planned) you will be able to use the new yield from syntax and the easiest way to fix your code at that point will be:

yield from explore(path)

时光瘦了 2024-12-02 04:16:03

Use os.walk instead of reinventing the wheel.

In particular, following the examples in the library documentation, here is an untested attempt:

import os
from os.path import join

def hellothere(somepath):
    # os.walk yields (root, dirs, files) for every directory under somepath
    for root, dirs, files in os.walk(somepath):
        for curfile in files:
            yield join(root, curfile)


# call and get full list of results:
allfiles = list(hellothere("..."))

# iterate over results lazily:
for x in hellothere("..."):
    print(x)

不再见 2024-12-02 04:16:03

Change this:

explore(path)

To this:

for subpath in explore(path):
    yield subpath

Or use os.walk, as phooji suggested (which is the better option).
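For reference, the question's function with this change applied might look like the sketch below. It assumes Python 3 (print as a function), uses os.path.join instead of string concatenation, and drops the unused variable; the temporary directory and its file names are made up purely so the demo is self-contained.

```python
import os
import stat
import tempfile

def explore(p):
    """Yield the path of every non-directory entry under p, recursively."""
    for name in os.listdir(p):
        path = os.path.join(p, name)
        stat_info = os.lstat(path)
        if stat.S_ISDIR(stat_info.st_mode):
            # Re-yield everything the recursive call produces.
            for subpath in explore(path):
                yield subpath
        else:
            yield path

# Demo on a throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub", "deeper"))
    for rel in ("a.txt",
                os.path.join("sub", "b.txt"),
                os.path.join("sub", "deeper", "c.txt")):
        open(os.path.join(root, rel), "w").close()
    for x in explore(root):
        print('-->', os.path.relpath(x, root))
```

The order of the printed paths depends on os.listdir, which makes no ordering guarantee.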

情未る 2024-12-02 04:16:03

That calls explore like a function. What you should do is iterate it like a generator:

if stat.S_ISDIR(stat_info.st_mode):
  for p in explore(path):
    yield p
else:
  yield path

EDIT: Instead of the stat module, you could use os.path.isdir(path).
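One caveat if you make that swap: os.path.isdir() follows symbolic links, while the os.lstat() call in the question does not, so a symlink to a directory would be descended into. The sketch below (Python 3; the helper names is_dir_lstat and is_dir_no_follow are hypothetical) shows one way to keep the original behaviour while still using os.path.

```python
import os
import stat
import tempfile

def is_dir_lstat(path):
    # What the question's code does: lstat() does not follow symlinks.
    return stat.S_ISDIR(os.lstat(path).st_mode)

def is_dir_no_follow(path):
    # os.path.isdir() follows symlinks, so links are excluded explicitly.
    return os.path.isdir(path) and not os.path.islink(path)

with tempfile.TemporaryDirectory() as root:
    file_path = os.path.join(root, "plain.txt")
    open(file_path, "w").close()
    print(is_dir_lstat(root), is_dir_no_follow(root))            # True True
    print(is_dir_lstat(file_path), is_dir_no_follow(file_path))  # False False
```

If following symlinked directories is what you want, plain os.path.isdir(path) is enough, at the risk of looping when a link points back up the tree.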

世界等同你 2024-12-02 04:16:03

Try this:

if stat.S_ISDIR(stat_info.st_mode):
    for p in explore(path):
        yield p

半衾梦 2024-12-02 04:16:03

os.walk is great if you need to traverse all the folders and subfolders. If you don't need that, it's like using an elephant gun to kill a fly.

However, for this specific case, os.walk could be a better approach.

ぽ尐不点ル 2024-12-02 04:16:03

You can also implement the recursion using a stack.

There is not really any advantage in doing this though, other than the fact that it is possible. If you are using Python in the first place, the performance gains are probably not worthwhile.

import os
import stat

def explore(p):
    '''
    Perform a depth-first search and yield the file paths in DFS order.
    The recursion is replaced by an explicit stack, so there is no nested
    generator call whose results would have to be re-yielded.
    '''
    # Each stack entry is [item, index]: item is either a single path or a
    # list of child paths, and index is the next child of that list to visit.
    st = [[p, 0]]
    while st:
        x, i = st[-1]
        if isinstance(x, list):
            if i >= len(x):
                st.pop()            # every child has been visited
            else:
                st[-1][1] += 1      # advance to the next child
                st.append([x[i], 0])
        else:
            st.pop()
            stat_info = os.lstat(x)
            if stat.S_ISDIR(stat_info.st_mode):
                # Push the directory's children as one list entry.
                st.append([['%s/%s' % (x, a) for a in os.listdir(x)], 0])
            else:
                yield x

print(list(explore('.')))

如梦亦如幻 2024-12-02 04:16:03

To answer the original question as asked, the key is that the yield statement needs to be propagated back out of the recursion (just like, say, return). Here is a working reimplementation of os.walk(). I'm using this in a pseudo-VFS implementation, where I additionally replace os.listdir() and similar calls.

import os, os.path
def walk (top, topdown=False):
    items = ([], [])
    for name in os.listdir(top):
        isdir = os.path.isdir(os.path.join(top, name))
        items[isdir].append(name)
    result = (top, items[True], items[False])
    if topdown:
        yield result
    for folder in items[True]:
        for item in walk(os.path.join(top, folder), topdown=topdown):
            yield item
    if not topdown:
        yield result