Avoiding code duplication in Python code

Posted on 2024-11-04 19:55:22

Consider the following Python snippet:

af=open("a",'r')
bf=open("b", 'w')

for i, line in enumerate(af):
    if i < K:
        bf.write(line)

Now, suppose I want to handle the case where K is None,
so the writing continues to the end of the file.
I'm currently doing

if K is None:
    for i, line in enumerate(af):
        bf.write(line)
else:
    for i, line in enumerate(af):            
        bf.write(line)
        if i==K:
            break

This clearly isn't the best way to handle this, as I'm duplicating the code.
Is there some more integrated way I can handle this? The natural thing would be
to have the if/break code only be present if K is not None,
but this involves writing syntax on the fly a la Lisp macros,
which Python can't really do. Just to be clear, I'm not concerned about the particular
case (which I chose partly for its simplicity), so much as learning about general
techniques I may not be familiar with.

UPDATE: After reading answers people have posted, and doing more experimentation, here are some more comments.

As I said above, I was looking for general techniques, and I think @Paul's answer, namely using takewhile from itertools, fits that best. As a bonus, it is also much faster than the naive method I listed above; I'm not sure why. I'm not really familiar with itertools, though I've looked at it a few times. From my perspective this is a case of functional programming For The Win! (Amusingly, the author of itertools once asked for feedback about dropping takewhile. See the thread beginning http://mail.python.org/pipermail/python-list/2007-December/522529.html.) I simplified my situation above; the actual situation is a bit messier, since I'm writing to two different files in the loop. So the code looks more like:

for i, line in enumerate(af):
    if i < K:
        bf.write(line)
        cf.write(line.split(',')[0].strip('"')+'\n')

Given my posted example, @Jeff reasonably suggested that in the case when K was None, I just copy the file. Since in practice I am looping anyway, doing so is not such a clear choice. However, takewhile generalizes painlessly to this case. I also had another use case I did not mention here, and was able to use takewhile there too, which was nice. The second example looks like (verbatim)

i=0
for line in takewhile(illuminacond, af):
    line_split=line.split(',')
    pid=line_split[1][0:3]
    out = line_split[1] + ',' + line_split[2] + ',' + line_split[3][1] + line_split[3][3] + ',' \
                        + line_split[15] + ',' + line_split[9] + ',' + line_split[10]
    if pid!='cnv' and pid!='hCV' and pid!='cnv':
        i = i+1
        of.write(out.strip('"')+'\n')
        tf.write(line)

Here I was able to use the condition

if K is None:
    illuminacond = lambda x: x.split(',')[0] != '[Controls]'
else:
    illuminacond = lambda x: x.split(',')[0] != '[Controls]' and i < K

per @Paul's original example. However, I'm not completely happy about the fact that I'm getting i from the outer scope, though the code works. Is there a better way of doing this? Or maybe it should be a separate question. Anyway, thanks to everyone who answered my question. Honorable mention to @Jeff, who made some nice suggestions.
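
One possible way to avoid reading i from the enclosing scope is to split the loop's three concerns into separate stages: takewhile for the '[Controls]' sentinel, a generator expression for the pid filter, and islice for the limit. The following is only a sketch under assumptions: `keep` and `wanted_lines` are hypothetical names, the sample rows are invented, and the column layout is simplified compared with the loop above.

```python
from itertools import islice, takewhile


def keep(line):
    """Return True for rows whose second field does not start with an excluded prefix."""
    pid = line.split(',')[1][0:3]
    return pid not in ('cnv', 'hCV')


def wanted_lines(lines, k=None):
    """Yield at most k kept rows (all of them if k is None), stopping at '[Controls]'."""
    before_controls = takewhile(lambda x: x.split(',')[0] != '[Controls]', lines)
    kept = (line for line in before_controls if keep(line))
    # islice treats a stop value of None as "no limit"
    return islice(kept, k)


# Invented sample data, for illustration only
rows = [
    "a,rs001,x\n",
    "b,cnv002,x\n",
    "c,rs003,x\n",
    "d,hCV004,x\n",
    "e,rs005,x\n",
    "[Controls],,\n",
    "f,rs006,x\n",
]

first_two = list(wanted_lines(rows, 2))
everything = list(wanted_lines(rows))
```

With this arrangement the counting happens inside islice, so no lambda needs to see a loop variable, and the K is None case needs no separate branch.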

Comments (5)

青柠芒果 2024-11-11 19:55:28

If you must loop, how about this?

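from sys import maxsize  # sys.maxint exists only in Python 2

# "K or maxsize" would also treat K == 0 as "no limit", so test for None explicitly
limit = maxsize if K is None else K
for i, line in enumerate(af):
    if i >= limit:
        break
    bf.write(line)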

Or even this?

from itertools import islice

# islice accepts None as the stop value, meaning "no limit"
bf.writelines(islice(af, K))
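
As a self-contained illustration of the islice approach, here is a small sketch that uses in-memory files from io.StringIO in place of real ones. The helper name `copy_head` is hypothetical. It relies on the fact that islice treats a stop value of None as "run to the end", so no sentinel value is needed.

```python
import io
from itertools import islice


def copy_head(src, dst, k=None):
    """Copy the first k lines of src to dst, or every line if k is None."""
    dst.writelines(islice(src, k))


src = io.StringIO("one\ntwo\nthree\n")

# Limited copy: only the first two lines
dst_limited = io.StringIO()
copy_head(src, dst_limited, 2)

# Unlimited copy: rewind the source and pass no limit
src.seek(0)
dst_all = io.StringIO()
copy_head(src, dst_all)
```

Passing k=0 copies nothing here, whereas an expression such as `K or some_large_number` would treat 0 the same as None.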

Why loop at all in the case that K is None?

from shutil import copyfile

aname = 'a'
bname = 'b'
if K is None:
    copyfile(aname, bname)
else:
    af = open(aname, 'r')
    bf = open(bname, 'w')
    for i, line in enumerate(af):
        if i < K:
            bf.write(line)

淡笑忘祈一世凡恋 2024-11-11 19:55:28

I think you're in a situation where you are going to have to accept a trade-off between DRY principles and optimizations.

I would start by staying true to DRY principles and remove the duplicate code with a function like write_until...

def write_until(file_in, file_out, break_on):
    for i, line in enumerate(file_in):
        if break_on(i, line):
            break
        else:
            file_out.write(line)

af=open("a",'r')
bf=open("b", 'w')

if K is None:
    write_until(af,bf,lambda i,line: False)
else:
    write_until(af,bf,lambda i,line: i>K)

Then actually use the code and see if you really need to do optimizations. How much performance improvement will you honestly see from removing an if False check? If you really need that extra speed boost (which I doubt) then you'll just have to live with some code duplication.

巴黎盛开的樱花 2024-11-11 19:55:27

itertools.takewhile will apply your condition, and then break out of the loop the first time the condition fails.

from itertools import takewhile

if K is None:
    condition = lambda x: True
else:
    condition = lambda x: x[0] < K

for i,line in takewhile(condition, enumerate(af)):
    bf.write(line)

If K is None, you don't want takewhile to ever stop, so the condition function always returns True. If K is a number, takewhile stops as soon as the 0th element of the tuple passed to the condition (the index from enumerate) is >= K.
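
To try this out without real files, the snippet can be wrapped as below. This is just a sketch: `make_condition` and `copy_while` are hypothetical names, and io.StringIO stands in for the opened files.

```python
import io
from itertools import takewhile


def make_condition(k):
    """Build a predicate over the (index, line) pairs produced by enumerate()."""
    if k is None:
        return lambda pair: True
    return lambda pair: pair[0] < k


def copy_while(src, dst, k=None):
    """Write lines from src to dst until the condition first fails."""
    for _, line in takewhile(make_condition(k), enumerate(src)):
        dst.write(line)


limited = io.StringIO()
copy_while(io.StringIO("a\nb\nc\nd\n"), limited, 3)

unlimited = io.StringIO()
copy_while(io.StringIO("a\nb\nc\nd\n"), unlimited)
```

One thing to keep in mind: takewhile has to read the first rejected element in order to test it, so that line is consumed from the source even though it is never written.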

鱼忆七猫命九 2024-11-11 19:55:27

Whatever K is, it's always going to be less than infinity.

if K is None:
    K = float('inf') # infinity

for i, line in enumerate(af):            
    bf.write(line)
    if i==K:
        break

Or, setting K = -1 works just as well, though it's less semantically correct. Ideally you would set K = max lines in af, but I presume that data is not cheaply available.
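
A runnable sketch of the infinity sentinel follows, using io.StringIO in place of real files. `copy_limited` is a hypothetical name, and this variant checks the index before writing (so exactly k lines are copied), which differs slightly from the loop above.

```python
import io


def copy_limited(src, dst, k=None):
    """Copy at most k lines; with k=None the limit is infinity and never reached."""
    limit = float('inf') if k is None else k
    for i, line in enumerate(src):
        if i >= limit:  # any integer index compares as less than float('inf')
            break
        dst.write(line)


two_lines = io.StringIO()
copy_limited(io.StringIO("x\ny\nz\n"), two_lines, 2)

all_lines = io.StringIO()
copy_limited(io.StringIO("x\ny\nz\n"), all_lines)
```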

近箐 2024-11-11 19:55:24
for i, line in enumerate(af):  
    if K is None or i < K:
        bf.write(line)
    else:
        break