Reading a socket buffer using asyncore
I'm new to Python (though I have been programming in Java for several years), and I am working on a simple socket-based networking application (just for fun). The idea is that my code connects to a remote TCP endpoint, listens for any data pushed from the server to the client, and performs some parsing on it.
The data being pushed from server -> client is UTF-8 encoded text, and each line is delimited by CRLF (\x0D\x0A). You probably guessed: the idea is that the client connects to the server (until cancelled by the user), and then reads and parses the lines as they come in.
I've managed to get this to work; however, I'm not sure I'm doing it quite the right way. Hence my actual questions (code to follow):
- Is this the right way to do it in Python (i.e., is it really this simple)?
- Any tips/tricks/useful resources (apart from the reference documentation) regarding buffers/asyncore?
Currently, the data is being read and buffered as follows:
    def handle_read(self):
        self.ibuffer = b""
        while True:
            self.ibuffer += self.recv(self.buffer_size)
            if ByteUtils.ends_with_crlf(self.ibuffer):
                self.logger.debug("Got full line including CRLF")
                break
            else:
                self.logger.debug("Buffer not full yet (%s)", self.ibuffer)
        self.logger.debug("Filled up the buffer with line")
        print(str(self.ibuffer, encoding="UTF-8"))
The ByteUtils.ends_with_crlf function simply checks the last two bytes of the buffer for \x0D\x0A. The first question is the main one (answer is based on this), but any other ideas/tips are appreciated. Thanks.
Comments (2)
TCP is a stream, and you are not guaranteed that your buffer will not contain the end of one message and the beginning of the next.
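So, checking for \r\n only at the end of the buffer will not work as expected in all situations. A single read can end in the middle of a line or contain several lines at once, so you have to scan the stream for the delimiter and keep any leftover bytes for the next read.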
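To illustrate the point about stream boundaries, here is a minimal standard-library sketch of that kind of buffering. The `LineSplitter` class and its `feed` method are hypothetical names made up for this example, not part of asyncore or of the code in the question. It keeps the unfinished tail between reads and decodes only complete lines, so a multi-byte UTF-8 character split across two reads is not a problem either.

```python
class LineSplitter:
    """Hypothetical helper: accumulates raw bytes and returns complete lines."""

    def __init__(self, delimiter=b"\r\n"):
        self.delimiter = delimiter
        self._pending = b""  # bytes received so far that do not yet form a full line

    def feed(self, chunk):
        """Add one chunk (e.g. the result of recv()) and return the lines it completes."""
        self._pending += chunk
        # Everything before the last delimiter is complete;
        # whatever follows it stays buffered until more data arrives.
        *lines, self._pending = self._pending.split(self.delimiter)
        return [line.decode("utf-8") for line in lines]


splitter = LineSplitter()
first = splitter.feed(b"hello\r\nwor")   # one full line plus the start of the next
second = splitter.feed(b"ld\r\nfoo\r")   # the CR has arrived, the LF has not
third = splitter.feed(b"\nbar")          # the LF completes "foo"; "bar" stays buffered
print(first, second, third)
```

In an asyncore handler, something along these lines would presumably be called once per `handle_read` with a single `recv()` result, rather than looping on `recv()` until a CRLF happens to land at the end of the buffer.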
And, I would strongly recommend that you use Twisted instead of asyncore.
Something like this (from memory, might not work out of the box):
It's even simpler -- look at asynchat and its set_terminator method (and other helpful tidbits in that module). Twisted is orders of magnitude richer and more powerful, but, for sufficiently simple tasks, asyncore and asynchat (which are designed to interoperate smoothly) are indeed very simple to use, as you've started observing.