Which of these is the most efficient?

Posted 2024-11-24 03:30:49

Question

Which of these is the quickest?

I'm using lighttpd's mod_fastcgi, Python 2.7 and flup.server.fcgi.WSGIServer.

Should I yield strings directly in some_output_function, then return from app?

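from flup.server.fcgi import WSGIServer

def app(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    # Hand the generator straight to the server, which iterates over it.
    return some_output_function()

def some_output_function():
    # Each yield becomes a separate chunk of the response body.
    yield function_that_returns_a_string()
    yield 'yada yada'
    yield another_function_that_returns_a_string()

WSGIServer(app).run()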

Should I return an array from some_output_function, then return from app?

from flup.server.fcgi import WSGIServer

def app(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    # The server iterates over the returned list, one element per chunk.
    return some_output_function()

def some_output_function():
    out = []
    out.append(function_that_returns_a_string())
    out.append('yada yada')
    out.append(another_function_that_returns_a_string())
    return out

WSGIServer(app).run()

Should I yield a last-minute joined array from some_output_function, then return from app?

from flup.server.fcgi import WSGIServer

def app(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    return some_output_function()

def some_output_function():
    out = []
    out.append(function_that_returns_a_string())
    out.append('yada yada')
    out.append(another_function_that_returns_a_string())
    # Join once, so the generator produces a single chunk.
    yield ''.join(out)

WSGIServer(app).run()

Should I return a last-minute joined array from some_output_function, then yield from app?

from flup.server.fcgi import WSGIServer

def app(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    # app is itself a generator here; it yields the one joined string.
    yield some_output_function()

def some_output_function():
    out = []
    out.append(function_that_returns_a_string())
    out.append('yada yada')
    out.append(another_function_that_returns_a_string())
    return ''.join(out)

WSGIServer(app).run()

Test results

By creating a simple test application, with the output function having one function call, then sixteen 'yada yada' strings, then another function call as the output, I gathered some surprising average request times, using ApacheBench.

sudo ab -n10000 -c128 localhost/testapp/
  • 44 ms to yield strings directly in some_output_function, then return from app
  • 44 ms to return an array from some_output_function, then return from app
  • 30 ms to yield a last-minute joined array from some_output_function, then return from app
  • 30 ms to return a last-minute joined array from some_output_function, then yield from app

Even more interesting: when I increased the number of 'yada yada' output strings eight-fold, to 128, these were the results:

  • 146 ms to yield strings directly in some_output_function, then return from app
  • 146 ms to return an array from some_output_function, then return from app
  • 30 ms to yield a last-minute joined array from some_output_function, then return from app
  • 30 ms to return a last-minute joined array from some_output_function, then yield from app

It appears that the common factor in saving time is building a list of strings and joining it just before exiting the inner output function, instead of yielding everywhere. Whether you yield inside and return outside, or return inside and yield outside, doesn't appear to change anything.
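To make that last point concrete, here is a minimal sketch of the two joined variants side by side. The names `build_body`, `collect`, `app_yield_inside` and `app_yield_outside` are hypothetical, the byte strings stand in for the real output functions, and `collect` is only a toy stand-in for the server, not flup. Byte literals are used so the sketch also runs on Python 3; on Python 2.7 they are ordinary `str` objects.

```python
def build_body():
    # Collect the pieces in a list and join once, as in the two 30 ms variants.
    out = []
    out.append(b'<p>header</p>')   # stands in for function_that_returns_a_string()
    out.extend([b'yada yada'] * 16)
    out.append(b'<p>footer</p>')   # stands in for another_function_that_returns_a_string()
    return b''.join(out)

def app_yield_inside(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    def body():
        yield build_body()
    return body()

def app_yield_outside(env, start):
    start('200 OK', [('Content-Type', 'text/html')])
    yield build_body()

def collect(app):
    # Toy stand-in for a server: call the app and list the chunks it produces.
    def start(status, headers):
        pass
    return list(app({}, start))

inside = collect(app_yield_inside)
outside = collect(app_yield_outside)
print('%d %d %s' % (len(inside), len(outside), inside == outside))  # prints "1 1 True"
```

Both variants hand the server exactly one chunk with identical content, which fits the identical timings. One behavioural difference remains: when `app` is itself a generator, nothing in its body, including the call to `start`, runs until the server begins iterating.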

So the only question now, really, is, should I yield inside or outside?

Comments (2)

七分※倦醒 2024-12-01 03:30:49

As a general rule, generators are more efficient than lists when dealing with a lot of data. A list has less overhead if the number of elements is small (e.g. in your example, only three elements).

Whichever method you choose, its cost will most likely be dwarfed by the time spent fetching data from the cache or the datastore (dozens to hundreds of milliseconds). Shaving 10 ms off the response time is probably not worth worrying about.

The reason why generators should be used is not for speed, but because large responses will be streamed to the client, which will use less memory and free up the server to process more requests. This is especially beneficial when done with an async server (e.g. gunicorn with eventlet workers, Tornado, etc.).
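As a rough illustration of the streaming case, here is a sketch of a generator that reads a large body in fixed-size chunks, so only one chunk is held in memory at a time. The helper name `stream_chunks`, the 8192-byte chunk size and the `io.BytesIO` data source are assumptions made for the example; a real app would read from a file or another source.

```python
import io

def stream_chunks(fileobj, chunk_size=8192):
    # Yield the body piece by piece instead of building it all in memory.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

def app(env, start):
    start('200 OK', [('Content-Type', 'application/octet-stream')])
    # io.BytesIO stands in for a large file or other data source.
    source = io.BytesIO(b'x' * 20000)
    return stream_chunks(source)

# Toy driver in place of a real server: iterate the body and record chunk sizes.
chunks = list(app({}, lambda status, headers: None))
sizes = [len(c) for c in chunks]
print(sizes)  # prints [8192, 8192, 3616]
```

The chunks here are large, so the per-chunk cost stays small relative to the data sent, unlike yielding many tiny strings.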

To answer this question:

So the only question now, really, is, should I yield inside or outside?

Practically it should not make any difference.

恋竹姑娘 2024-12-01 03:30:49

Producing lots of small strings is slower than producing a single big string, because the WSGI specification requires the underlying WSGI server/adapter to flush each string before writing the next. Depending on the server used, a flush of a string can pass through many layers before eventually reaching a flush on a socket. Doing this many times is more expensive than writing a single larger string.
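The effect can be modelled with a toy server loop that writes and flushes once per chunk. This is a simplified sketch, not flup's actual implementation: `CountingStream` and `serve` are hypothetical names, and the stream only counts calls instead of touching a socket.

```python
class CountingStream(object):
    """Hypothetical stand-in for a server's output stream; it only counts calls."""
    def __init__(self):
        self.writes = 0
        self.flushes = 0
        self.data = []

    def write(self, chunk):
        self.writes += 1
        self.data.append(chunk)

    def flush(self):
        self.flushes += 1

def serve(body_iterable, stream):
    # Simplified model: write and flush each chunk before asking for the next.
    for chunk in body_iterable:
        if chunk:
            stream.write(chunk)
            stream.flush()

# Two function outputs plus 128 'yada yada' strings, as in the second benchmark.
pieces = [b'head'] + [b'yada yada'] * 128 + [b'tail']

many = CountingStream()
serve(iter(pieces), many)         # one chunk per string

one = CountingStream()
serve([b''.join(pieces)], one)    # a single joined chunk

print('%d %d' % (many.flushes, one.flushes))  # prints "130 1"
```

The bytes sent are the same in both cases; only the number of write and flush round trips differs, which is where the extra time in the unjoined variants would go under this model.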
