Handling pycurl hangs on the Twitter streaming API
I am using pycurl to connect to the Twitter streaming API.
This works well, but sometimes after running for a few hours it hangs indefinitely without throwing any exception. How can I detect and handle a hang in this script?
import pycurl, json

STREAM_URL = "http://stream.twitter.com/1/statuses/filter.json"
USER = "presidentskroob"
PASS = "12345"

def on_receive(data):
    # pycurl calls this for every chunk of the response body
    print(data)

conn = pycurl.Curl()
conn.setopt(pycurl.USERPWD, "%s:%s" % (USER, PASS))
conn.setopt(pycurl.URL, STREAM_URL)
conn.setopt(pycurl.WRITEFUNCTION, on_receive)
conn.perform()  # blocks for as long as the stream stays open
Comments (4)
FROM: http://man-wiki.net/index.php/3:curl_easy_setopt
and
Example:
The curl switch --speed-limit makes curl return an error if the transfer speed stays below a given threshold for a given length of time. Unfortunately, the threshold cannot be set to a value less than 1, and the ideal value for the Twitter Streaming API would be 1/30, since it sends a single character every 30 seconds as its keep-alive. The best you can do is use a threshold of 1 Bps, but then curl will give up whenever there is a period of inactivity (no tweets) longer than the duration you select. The command below will give up if there is a 30-second period during which it receives fewer than 30 bytes.
To summarize: there is no satisfactory solution using curl's options alone.
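In pycurl the same switches are available as the LOW_SPEED_LIMIT and LOW_SPEED_TIME options. The sketch below is only a simplified model of the arithmetic described above, not libcurl's actual implementation, and both helper names (would_give_up, set_low_speed_abort) are hypothetical. It shows why a lone keep-alive character per 30 seconds cannot satisfy a 1 Bps threshold.

```python
def would_give_up(bytes_in_window, limit_bps=1, window_s=30):
    """Simplified model of the low-speed check.

    Returns True when the average rate over the window is below the limit,
    which is the condition under which curl is described as giving up.
    """
    return bytes_in_window / float(window_s) < limit_bps


def set_low_speed_abort(conn, limit_bps=1, window_s=30):
    """Apply the equivalent options to a pycurl.Curl handle (hypothetical helper)."""
    import pycurl  # imported lazily so the model above runs without pycurl

    conn.setopt(pycurl.LOW_SPEED_LIMIT, limit_bps)  # threshold in bytes per second
    conn.setopt(pycurl.LOW_SPEED_TIME, window_s)    # window length in seconds


# One keep-alive character in 30 seconds is an average of 1/30 Bps,
# which is below the smallest threshold that can be configured.
print(would_give_up(1))
```

With these settings a quiet stream (keep-alives only, no tweets) is indistinguishable from a dead one, which is the limitation the comment points out.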
You can use the timeout settings:
You'll get a pycurl.error exception if curl times out.
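A minimal sketch of how that exception might be handled, assuming the CONNECTTIMEOUT and TIMEOUT options are the timeout settings meant here. The function names, the timeout values, and the backoff policy are all illustrative assumptions. Note that TIMEOUT limits the whole transfer, so on an endless stream it fires even when the connection is healthy; the loop simply reconnects.

```python
import time


def stream_once(url, on_receive, connect_timeout=15, total_timeout=300):
    """Open the stream once; raises pycurl.error when a timeout fires."""
    import pycurl  # imported lazily so the retry helper below has no hard dependency

    conn = pycurl.Curl()
    try:
        conn.setopt(pycurl.URL, url)
        conn.setopt(pycurl.WRITEFUNCTION, on_receive)
        conn.setopt(pycurl.CONNECTTIMEOUT, connect_timeout)  # seconds to establish the connection
        conn.setopt(pycurl.TIMEOUT, total_timeout)           # seconds allowed for the whole transfer
        conn.perform()
    finally:
        conn.close()


def run_with_reconnect(connect, retry_on, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it returns normally or max_attempts is used up.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    and returns the number of attempts that were made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example wiring (needs pycurl and network access):
#   import pycurl
#   run_with_reconnect(lambda: stream_once(STREAM_URL, on_receive),
#                      retry_on=pycurl.error)
```

A long-running client would probably also reconnect when perform() returns normally, since that means the server closed the stream.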
I have a hunch that this could be related to a "TCP broken pipe" scenario, i.e. the other peer closes the connection at some point, but our peer somehow misses the event. You will need some kind of keep-alive to deal with this.
The "right", elegant solution to the problem may require some action from Twitter itself. This is a rather common issue; a friend of mine used the streaming API and ran into the same problem.
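One way to act on the keep-alive idea from the client side is a small watchdog that aborts the transfer when nothing at all has arrived for too long. This is a sketch under assumptions: it relies on the stream sending some bytes at least every 30 seconds or so, as an earlier comment describes, and on libcurl calling the progress callback periodically even while the connection is idle. The StallWatchdog class, the attach helper, and the 90-second limit are hypothetical choices.

```python
import time


class StallWatchdog(object):
    """Tracks when data last arrived and asks curl to abort after a long silence."""

    def __init__(self, handler, max_silence=90.0, clock=time.time):
        self.handler = handler          # user callback that receives each chunk
        self.max_silence = max_silence  # seconds of total silence tolerated
        self.clock = clock
        self.last_data = clock()

    def on_receive(self, data):
        # Any bytes, including a bare keep-alive character, count as a sign of life.
        self.last_data = self.clock()
        self.handler(data)
        # Returning None tells pycurl the whole chunk was consumed.

    def on_progress(self, download_total, downloaded, upload_total, uploaded):
        # A non-zero return value makes libcurl abort the transfer,
        # which surfaces in Python as a pycurl.error from perform().
        stalled = self.clock() - self.last_data > self.max_silence
        return 1 if stalled else 0


def attach(conn, watchdog):
    """Hook the watchdog into a pycurl.Curl handle (hypothetical helper)."""
    import pycurl  # imported lazily so the watchdog itself has no hard dependency

    conn.setopt(pycurl.WRITEFUNCTION, watchdog.on_receive)
    conn.setopt(pycurl.NOPROGRESS, 0)  # progress callbacks are disabled by default
    conn.setopt(pycurl.PROGRESSFUNCTION, watchdog.on_progress)
```

Because the abort surfaces as an exception from perform(), the calling code can catch it and reconnect instead of hanging silently.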