Generator function performance
I'm trying to understand the performance of a generator function. I've used cProfile and the pstats module to collect and inspect profiling data. The function in question is this:
def __iter__(self):
    delimiter = None
    inData = self.inData
    lenData = len(inData)
    cursor = 0
    while cursor < lenData:
        if delimiter:
            mo = self.stringEnd[delimiter].search(inData[cursor:])
        else:
            mo = self.patt.match(inData[cursor:])
        if mo:
            mo_lastgroup = mo.lastgroup
            mstart = cursor
            mend = mo.end()
            cursor += mend
            delimiter = (yield (mo_lastgroup, mo.group(mo_lastgroup), mstart, mend))
        else:
            raise SyntaxError("Unable to tokenize text starting with: \"%s\"" % inData[cursor:cursor+200])
self.inData is a unicode text string, self.stringEnd is a dict with 4 simple regexes, and self.patt is one big regex. The whole thing splits the big string into smaller strings, one by one.
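For illustration, a class along the following lines would fit that description; the group names and patterns here are assumptions made for the sketch, not the original code:

import re

class Scanner(object):
    # Hypothetical setup matching the description above: self.patt is one big
    # alternation of named groups, and self.stringEnd maps a delimiter name
    # (sent back by the consumer) to a small regex that finds the string end.
    # The question mentions four such regexes; only two are sketched here.
    def __init__(self, inData):
        self.inData = inData
        self.patt = re.compile(
            r"(?P<number>\d+)"
            r"|(?P<name>[A-Za-z_]\w*)"
            r"|(?P<dquote>\")"
            r"|(?P<squote>')"
            r"|(?P<space>\s+)"
        )
        self.stringEnd = {
            "dquote": re.compile(r'(?P<string>[^"]*)"'),
            "squote": re.compile(r"(?P<string>[^']*)'"),
        }

    def __iter__(self):
        # the generator shown in the question goes here
        ...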
Profiling a program that uses it, I found that the biggest part of the program's run time is spent in this function:
In [800]: st.print_stats("Scanner.py:124")
463263 function calls (448688 primitive calls) in 13.091 CPU seconds
Ordered by: cumulative time
List reduced from 231 to 1 due to restriction <'Scanner.py:124'>
ncalls tottime percall cumtime percall filename:lineno(function)
10835 11.465 0.001 11.534 0.001 Scanner.py:124(__iter__)
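For reference, output like the above would come from a cProfile/pstats session roughly like the following; the profile file name and the main() entry point are placeholders, and 124 is simply the source line of __iter__ in Scanner.py:

import cProfile
import pstats

# Run the program under the profiler and dump the raw stats to a file,
# then load them with pstats and restrict the report to the generator's line.
cProfile.run("main()", "scanner.prof")   # "main()" is a placeholder entry point
st = pstats.Stats("scanner.prof")
st.sort_stats("cumulative")
st.print_stats("Scanner.py:124")
st.print_callees("Scanner.py:124")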
But looking at the profile of the function itself, not much time is spent in its sub-calls:
In [799]: st.print_callees("Scanner.py:124")
Ordered by: cumulative time
List reduced from 231 to 1 due to restriction <'Scanner.py:124'>
Function called...
ncalls tottime cumtime
Scanner.py:124(__iter__) -> 10834 0.006 0.006 {built-in method end}
10834 0.009 0.009 {built-in method group}
8028 0.030 0.030 {built-in method match}
2806 0.025 0.025 {built-in method search}
1 0.000 0.000 {len}
The rest of the function is not much besides the while loop, assignments and if-else. Even the send method on the generator, which I use, is fast:
ncalls tottime percall cumtime percall filename:lineno(function)
13643/10835 0.007 0.000 11.552 0.001 {method 'send' of 'generator' objects}
Is it possible that the yield, passing a value back to the consumer, is taking the majority of the time?! Anything else that I'm not aware of?
EDIT:
I probably should have mentioned that the generator function __iter__ is a method of a small class, so self refers to an instance of this class.
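The driving code is not shown in the question; a minimal sketch of a consumer that would match the send profile above might look like this ("dquote" and "squote" are the hypothetical group names from the earlier sketch):

def consume(scanner):
    # Hypothetical consumer, for illustration only: prime the generator with
    # next(), then answer each token with send(), passing back either None or
    # a delimiter key that switches the scanner into string-end mode.
    tokens = []
    gen = iter(scanner)        # calls Scanner.__iter__ and returns the generator
    try:
        tok = next(gen)        # run up to the first yield
        while True:
            group, text, start, end = tok
            tokens.append((group, text))
            delimiter = group if group in ("dquote", "squote") else None
            tok = gen.send(delimiter)
    except StopIteration:
        pass
    return tokens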
Answers (2)
This is actually the answer of Dunes, who unfortunately only gave it as a comment and doesn't seem inclined to put it in a proper answer.
The main performance culprit was the string slicing. Some timing measurements showed that slicing performance degrades noticeably with big slices (meaning taking a big slice from an already big string). To work around that, I now use the pos parameter of the regex object methods instead of slicing the input.
Thanks to all who helped.
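The answer's own code snippet is not preserved in this copy; a minimal sketch of the pos-based rewrite, assuming the same generator as in the question, would look roughly like this:

def __iter__(self):
    # Reconstruction, not the original code from the answer: match() and
    # search() are anchored at `cursor` via their pos argument, so no new
    # string object is created on every iteration.
    delimiter = None
    inData = self.inData
    lenData = len(inData)
    cursor = 0
    while cursor < lenData:
        if delimiter:
            mo = self.stringEnd[delimiter].search(inData, cursor)
        else:
            mo = self.patt.match(inData, cursor)
        if mo:
            mo_lastgroup = mo.lastgroup
            mstart = cursor
            cursor = mo.end()   # end() is now an absolute index into inData
            delimiter = (yield (mo_lastgroup, mo.group(mo_lastgroup), mstart, cursor))
        else:
            raise SyntaxError("Unable to tokenize text starting with: \"%s\""
                              % inData[cursor:cursor + 200])

Because the regex engine starts matching at cursor inside the original string, the per-iteration cost no longer grows with the amount of text left to scan, whereas inData[cursor:] copies the whole remainder on every token.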
If reading your sample correctly, you are taking a generator object, putting it into delimiter, and using it for an array lookup. That may not be your speed issue, but I'm pretty sure that's a bug.