Generator function performance

Posted on 2024-11-14 16:12:18

I'm trying to understand the performance of a generator function. I've used cProfile and the pstats module to collect and inspect profiling data. The function in question is this:

def __iter__(self):
    delimiter  = None
    inData     = self.inData
    lenData    = len(inData)
    cursor     = 0
    while cursor < lenData:
        if delimiter:
            mo = self.stringEnd[delimiter].search(inData[cursor:])
        else:
            mo = self.patt.match(inData[cursor:])
        if mo:
            mo_lastgroup = mo.lastgroup
            mstart       = cursor
            mend         = mo.end()
            cursor       += mend
            delimiter = (yield (mo_lastgroup, mo.group(mo_lastgroup), mstart, mend))
        else:
            raise SyntaxError("Unable to tokenize text starting with: \"%s\"" % inData[cursor:cursor+200])

self.inData is a unicode text string, self.stringEnd is a dict with 4 simple regexes, and self.patt is one big regex. The purpose is to split the big string into smaller strings, one by one.

Profiling a program that uses it, I found that the biggest part of the program's run time is spent in this function:

In [800]: st.print_stats("Scanner.py:124")

         463263 function calls (448688 primitive calls) in 13.091 CPU seconds

   Ordered by: cumulative time
   List reduced from 231 to 1 due to restriction <'Scanner.py:124'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10835   11.465    0.001   11.534    0.001 Scanner.py:124(__iter__)

But looking at the profile of the function itself, very little time is spent in the functions it calls:

In [799]: st.print_callees("Scanner.py:124")
   Ordered by: cumulative time
   List reduced from 231 to 1 due to restriction <'Scanner.py:124'>

Function                  called...
                              ncalls  tottime  cumtime
Scanner.py:124(__iter__)  ->   10834    0.006    0.006  {built-in method end}
                               10834    0.009    0.009  {built-in method group}
                                8028    0.030    0.030  {built-in method match}
                                2806    0.025    0.025  {built-in method search}
                                   1    0.000    0.000  {len}

Apart from those calls, the function is little more than a while loop, assignments and if-else. Even the send method on the generator, which I use, is fast:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
13643/10835    0.007    0.000   11.552    0.001 {method 'send' of 'generator' objects}

Is it possible that the yield, passing a value back to the consumer, is taking the majority of the time? Is there anything else that I'm not aware of?

EDIT:

I probably should have mentioned that the generator function __iter__ is a method of a small class, so self refers to an instance of this class.
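
One thing worth keeping in mind when reading the tables above: cProfile records function calls only, so work done by inline expressions, such as the slice inData[cursor:], is charged to the tottime of the enclosing function and never shows up in print_callees. The following is a minimal sketch of that effect, not the original Scanner class; scan_with_slices, the input string and the pattern are all made up for illustration.

```python
import cProfile
import pstats
import re


def scan_with_slices(data, patt):
    """Hypothetical stand-in for the scanner loop: slices before every match."""
    cursor = 0
    count = 0
    while cursor < len(data):
        # The slice copies the rest of the string. It is not a function call,
        # so the profiler attributes its cost to scan_with_slices itself.
        mo = patt.match(data[cursor:])
        cursor += mo.end()
        count += 1
    return count


data = "word " * 20000              # made-up input, 100,000 characters
patt = re.compile(r"\w+\s*")        # made-up pattern, matches one word at a time

profiler = cProfile.Profile()
profiler.enable()
n_tokens = scan_with_slices(data, patt)
profiler.disable()

stats = pstats.Stats(profiler)
stats.sort_stats("cumulative")
stats.print_stats("scan_with_slices")    # tottime of the loop function
stats.print_callees("scan_with_slices")  # only match, end and len are listed
```

If the tottime reported for scan_with_slices is much larger than the sum of its callees, the difference is being spent in the function's own statements, and the slice is the only one whose cost grows with the size of the input.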

Comments (2)

—━☆沉默づ 2024-11-21 16:12:18

This is actually the answer of Dunes, who unfortunately only gave it as a comment and doesn't seem to be inclined to put it in a proper answer.

The main performance culprit was the string slicing. Some timing measurements showed that slicing performance degrades noticeably with big slices (that is, taking a big slice from an already big string), because each slice copies the remainder of the string. To work around that, I now use the pos parameter of the regex object methods:

    if delimiter:
        mo = self.stringEnd[delimiter].search(inData, pos=cursor)
    else:
        mo = self.patt.match(inData, pos=cursor)

Thanks to all who helped.
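
One detail to watch when switching from slicing to pos: the match object's start() and end() are then offsets into the whole string rather than into the slice, so the cursor has to be assigned instead of incremented. The sketch below illustrates this under stated assumptions; it is not the original Scanner code. The tokenize function and the small pattern are hypothetical, and the yielded end offset is absolute here, whereas in the question's code mend was relative to the slice.

```python
import re


def tokenize(data, patt):
    """Yield (group name, text, start, end) without slicing the input."""
    cursor = 0
    while cursor < len(data):
        mo = patt.match(data, pos=cursor)   # no copy of the remaining text
        if mo is None:
            raise SyntaxError(
                "Unable to tokenize text starting with: %r" % data[cursor:cursor + 200]
            )
        # With pos, start() and end() are absolute offsets into data,
        # so the cursor is assigned rather than incremented.
        cursor = mo.end()
        yield (mo.lastgroup, mo.group(mo.lastgroup), mo.start(), mo.end())


# Made-up pattern with one named group per token kind.
patt = re.compile(r"(?P<word>[A-Za-z]+)|(?P<num>\d+)|(?P<ws>\s+)")
tokens = list(tokenize("abc 42", patt))
print(tokens)
```

Note that pos is not completely equivalent to slicing: according to the re documentation, '^' still anchors at the real beginning of the string, not at pos, so patterns that rely on '^' may behave differently after the change.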

如梦亦如幻 2024-11-21 16:12:18

If I'm reading your sample correctly, you are taking a generator object, putting it into delimiter, and using it for an array lookup. That may not be your speed issue, but I'm pretty sure that's a bug.
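
For reference, a yield expression evaluates to whatever the consumer passes to send() (or to None when the generator is advanced with next()), not to a generator object. Here is a minimal sketch with a hypothetical generator, show_delimiter, that mirrors the delimiter = (yield ...) line from the question:

```python
def show_delimiter():
    """Hypothetical generator that reports the value it last received."""
    delimiter = None
    while True:
        # The yield expression hands a value to the consumer and then
        # evaluates to the argument of the next send() call.
        delimiter = (yield "current delimiter: %r" % (delimiter,))


gen = show_delimiter()
first = next(gen)        # runs to the first yield; nothing has been sent yet
second = gen.send('"')   # the yield expression now evaluates to '"'
third = next(gen)        # a plain next() makes it evaluate to None again
print(first, second, third, sep="\n")
```

So in the question's code, delimiter holds either None or the value the consumer sent, which is what makes the self.stringEnd[delimiter] lookup work.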
