Slowdown with paste.httpserver and HTTP/1.1 Keep-alive; tested with httperf and ab
I have a web server based on paste.httpserver as an adapter between HTTP and WSGI. When I do performance measurements with httperf, I can do over 1,000 requests per second if I open a new connection for each request using --num-conn. If I instead reuse the connection using --num-call, I get about 11 requests per second, 1/100th of the speed.
If I try ab I get a timeout.
My tests are
% ./httperf --server localhost --port 8080 --num-conn 100
...
Request rate: 1320.4 req/s (0.8 ms/req)
...
and
% ./httperf --server localhost --port 8080 --num-call 100
...
Request rate: 11.2 req/s (89.4 ms/req)
...
Here's a simple reproducible server
from paste import httpserver

def echo_app(environ, start_response):
    n = 10000
    start_response("200 Ok", [("Content-Type", "text/plain"),
                              ("Content-Length", str(n))])
    return ["*" * n]

httpserver.serve(echo_app, protocol_version="HTTP/1.1")
It's a multi-threaded server, which is hard to profile. Here's a variation which is single threaded:
from paste import httpserver

class MyHandler(httpserver.WSGIHandler):
    sys_version = None
    server_version = "MyServer/0.0"
    protocol_version = "HTTP/1.1"

    def log_request(self, *args, **kwargs):
        pass

def echo_app(environ, start_response):
    n = 10000
    start_response("200 Ok", [("Content-Type", "text/plain"),
                              ("Content-Length", str(n))])
    return ["*" * n]

# WSGIServerBase is single-threaded
server = httpserver.WSGIServerBase(echo_app, ("localhost", 8080), MyHandler)
server.handle_request()
Profiling that with
% python2.6 -m cProfile -o paste.prof paste_slowdown.py
and hitting it with
% httperf --client=0/1 --server=localhost --port=8080 --uri=/ \
    --send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=500
I get a profile like
>>> import pstats
>>> p = pstats.Stats("paste.prof")
>>> p.strip_dirs().sort_stats("cumulative").print_stats()
Sun Nov 22 21:31:57 2009    paste.prof

         109749 function calls in 46.570 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   46.571   46.571 {execfile}
        1    0.001    0.001   46.570   46.570 paste_slowdown.py:2(<module>)
        1    0.000    0.000   46.115   46.115 SocketServer.py:250(handle_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:268(_handle_request_noblock)
        1    0.000    0.000   44.675   44.675 SocketServer.py:301(process_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:318(finish_request)
        1    0.000    0.000   44.675   44.675 SocketServer.py:609(__init__)
        1    0.000    0.000   44.675   44.675 httpserver.py:456(handle)
        1    0.001    0.001   44.675   44.675 BaseHTTPServer.py:325(handle)
      501    0.006    0.000   44.674    0.089 httpserver.py:440(handle_one_request)
     2001    0.020    0.000   44.383    0.022 socket.py:373(readline)
      501   44.354    0.089   44.354    0.089 {method 'recv' of '_socket.socket' objects}
        1    1.440    1.440    1.440    1.440 {select.select}
....
You can see that nearly all the time is in a recv.
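As a quick check on that reading, the arithmetic below uses only the numbers from the profile listing. It is a small sketch rather than part of the original measurements, and the variable names are made up for illustration.

```python
# Figures copied from the profile listing above.
recv_total = 44.354   # tottime spent inside socket recv, in seconds
recv_calls = 501      # number of recv calls
run_total = 46.570    # total profiled time, in seconds

per_recv_ms = recv_total / recv_calls * 1000   # average wait per recv call
share = recv_total / run_total                 # fraction of the run spent in recv

print("%.1f ms per recv, %.1f%% of the run" % (per_recv_ms, share * 100))
```

That works out to roughly 88.5 ms per recv and about 95% of the run, which is close to the 89.4 ms/req that httperf reported for the keep-alive test.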
I decided to bail on httperf and write my own HTTP/1.1-with-keep-alive request and send it using netcat:
GET / HTTP/1.1
Location: localhost
Connection: Keep-Alive
Content-Length: 0

GET / HTTP/1.1
Location: localhost
Connection: Keep-Alive
Content-Length: 0

... repeat 97 more times, to have 99 keep-alives in total ...

GET / HTTP/1.1
Location: localhost
Connection: Close
Content-Length: 0

which I sent with
nc localhost 8080 < ~/src/send_to_paste.txt
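A file like that does not have to be written by hand. The following is a minimal sketch of one way to generate it, assuming CRLF line endings (the original file may have used plain newlines) and reusing the headers exactly as listed; build_requests is a hypothetical helper name, not something from paste or httperf.

```python
def build_requests(total=100):
    """Return `total` GET requests: all keep-alive except the last, which closes."""
    template = (
        "GET / HTTP/1.1\r\n"
        "Location: localhost\r\n"    # header copied as-is from the hand-written request
        "Connection: {conn}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"                       # blank line terminates each request
    )
    parts = [template.format(conn="Keep-Alive") for _ in range(total - 1)]
    parts.append(template.format(conn="Close"))
    return "".join(parts).encode("ascii")


payload = build_requests(100)
# To save it for netcat:
#     with open("send_to_paste.txt", "wb") as f:
#         f.write(payload)
```

Because the whole payload is written to the socket in one go, the requests are effectively pipelined, which is different from a client that waits for each response before sending the next request.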
Total time for 100 requests was 0.03 seconds, so it's very good performance.
This suggests that httperf is doing something wrong (but it's a widely used and respected piece of code), so I tried 'ab'
% ab -n 100 -k localhost:8080/
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)...
Server timed out
: Operation now in progress
When I instrument the server, I can see that it handles one request and then waits for the second.
Any idea of what's going on?
Comments (1)
After some effort, it seems to be either Nagle's algorithm or delayed ACK, or the interaction between them. It goes away if I do something like enabling TCP_NODELAY on the socket.
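The snippet that belonged here did not survive. As a minimal sketch, assuming the workaround is the same TCP_NODELAY option that the patch mentioned at the end of this answer enables, turning off Nagle's algorithm on a socket looks like this with the standard socket module. The helper name disable_nagle is made up for illustration.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY so small writes are sent without waiting for an ACK."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


# Demonstration on a fresh, unconnected TCP socket.
s = disable_nagle(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)  # nonzero once set
s.close()
```

In a server, the option has to be set on each accepted connection socket (or on the listening socket before accepting, on platforms where accepted sockets inherit it).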
How did I track it down? First, I instrumented every 'recv' in socket.py so I could figure out which recv was waiting. I saw that about 5 recv's out of 11 had a delay of almost 200 ms, and I couldn't figure out why there was any delay. I then used Wireshark to watch the messages and noticed that it was actually the send from the server to the client which had the delay. That pointed to something in the TCP layer, involving the outgoing messages from my client.
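The instrumentation itself is not shown, so the following is only a sketch of the general idea: wrap each recv call, record how long it blocked, and then look for the calls that stalled. The function timed_recv and the 150 ms threshold are assumptions made for this example, and it uses a local socket pair instead of a patched socket.py.

```python
import socket
import time


def timed_recv(sock, bufsize, log):
    """Call sock.recv and append the elapsed wall-clock seconds to log."""
    start = time.perf_counter()
    data = sock.recv(bufsize)
    log.append(time.perf_counter() - start)
    return data


# Demonstration on a connected local socket pair (no network needed).
a, b = socket.socketpair()
delays = []
a.sendall(b"GET / HTTP/1.1\r\n")
chunk = timed_recv(b, 4096, delays)
a.close()
b.close()

# recv calls that blocked long enough to look like the ~200 ms stall
slow = [d for d in delays if d > 0.15]
```

Timing recv only shows where the server is waiting. It cannot say which side is holding data back, which is why a packet capture was needed for the next step.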
A friend suggested the obvious, and I searched for "200ms socket delay" and found descriptions of this problem.
The paste trac report is at http://trac.pythonpaste.org/pythonpaste/ticket/392 along with a patch which enables TCP_NODELAY when the handler uses HTTP/1.1.
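For illustration only, here is what a fix of that shape could look like. This is not the patch from the ticket: it is a sketch written against Python 3's standard http.server rather than paste, and NoDelayHandler is a hypothetical name. It enables TCP_NODELAY in the handler's setup step only when the handler speaks HTTP/1.1.

```python
import socket
from http.server import BaseHTTPRequestHandler


class NoDelayHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that disables Nagle's algorithm for HTTP/1.1."""

    protocol_version = "HTTP/1.1"

    def setup(self):
        super().setup()
        # Keep-alive is only used with HTTP/1.1, so only then set TCP_NODELAY.
        if self.protocol_version == "HTTP/1.1":
            self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    def do_GET(self):
        # Same 10,000-byte body as the echo_app in the question.
        body = b"*" * 10000
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_request(self, *args, **kwargs):
        pass
```

It can be served with http.server.HTTPServer(("localhost", 8080), NoDelayHandler).serve_forever(). The standard library's socketserver.StreamRequestHandler also has a disable_nagle_algorithm class attribute that sets the same option unconditionally.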