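NodeJS: What is the proper way to handle TCP socket streams? Which delimiter should I use?
From what I understood here, "V8 has a generational garbage collector. Moves objects around randomly. Node can't get a pointer to raw string data to write to socket." So I shouldn't store the data that comes from a TCP stream in a string, especially if that string becomes bigger than Math.pow(2,16) bytes. (I hope I'm right so far.)
What, then, is the best way to handle all the data coming from a TCP socket? So far I've been trying to use _:_:_ as a delimiter, because I think it is reasonably unique and won't interfere with anything else.
A sample of the incoming data would be something_:_:_maybe a large text_:_:_maybe tons of lines_:_:_more and more data
This is what I have tried to do: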
net = require('net');
var server = net.createServer(function (socket) {
  socket.on('connect',function() {
    console.log('someone connected');
    buf = new Buffer(Math.pow(2,16)); //new buffer with size 2^16
    socket.on('data',function(data) {
      if (data.toString().search('_:_:_') === -1) { // If there's no separator in the data that just arrived...
        buf.write(data.toString()); // ... write it on the buffer. it's part of another message that will come.
      } else { // if there is a separator in the data that arrived
        parts = data.toString().split('_:_:_'); // the first part is the end of a previous message, the last part is the start of a message to be completed in the future. Parts between separators are independent messages
        if (parts.length == 2) {
          msg = buf.toString('utf-8',0,4) + parts[0];
          console.log('MSG: '+ msg);
          buf = (new Buffer(Math.pow(2,16))).write(parts[1]);
        } else {
          msg = buf.toString() + parts[0];
          for (var i = 1; i <= parts.length -1; i++) {
            if (i !== parts.length-1) {
              msg = parts[i];
              console.log('MSG: '+msg);
            } else {
              buf.write(parts[i]);
            }
          }
        }
      }
    });
  });
});
server.listen(9999);
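Whenever I try console.log('MSG' + msg), it prints the whole buffer, so it is useless for checking whether anything worked.
How can I handle this data the proper way? Would the lazy module work even if the data is not line oriented? Is there some other module for handling streams that are not line oriented?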
Comments (2)
It has indeed been said that there's extra work going on, because Node has to take that buffer and then push it into V8 by casting it to a string. However, calling toString() on the buffer yourself isn't any better. As far as I know there's no good solution to this right now, especially if your end goal is to get a string and work with it. It's one of the things Ryan mentioned at NodeConf as an area where work needs to be done.
As for the delimiter, you can choose whatever you want. A lot of binary protocols instead include a fixed-size header, so that the data has a regular structure, and that header often contains the length of what follows. That way you slice off a header of known size and learn about the rest of the data without having to iterate over the entire buffer. With a scheme like that, one can use a tool like:
As an aside, buffers can be accessed via array syntax, and they can also be sliced apart with .slice().
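To make the fixed-header idea concrete, here is a minimal sketch of length-prefixed framing that stays in Buffers the whole way. The 4-byte big-endian length header and the names encodeFrame and createFrameParser are assumptions made up for this illustration; they don't come from any particular module. It uses subarray(), which returns the same kind of view as the .slice() mentioned above (current Node docs mark Buffer's slice() as deprecated in favor of subarray()).

```javascript
// Hypothetical wire format (an assumption for this sketch):
// [4-byte big-endian payload length][payload bytes]

// Build one frame from a string or Buffer payload.
function encodeFrame(payload) {
  const body = Buffer.isBuffer(payload) ? payload : Buffer.from(payload, 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Returns a function you can hand to socket.on('data', ...).
// onMessage is called once per complete payload, as a Buffer.
function createFrameParser(onMessage) {
  let pending = Buffer.alloc(0); // bytes received but not yet consumed

  return function feed(chunk) {
    pending = Buffer.concat([pending, chunk]);

    // Keep extracting frames while a full header is available.
    while (pending.length >= 4) {
      const length = pending.readUInt32BE(0);
      if (pending.length < 4 + length) {
        break; // payload not fully received yet, wait for more data
      }
      onMessage(pending.subarray(4, 4 + length));
      pending = pending.subarray(4 + length);
    }
  };
}
```

On a server this would be wired up roughly as socket.on('data', createFrameParser(handler)), with one parser per connection. Only the header is inspected to find message boundaries, so nothing needs to be converted to a string unless the handler wants one. A real protocol would probably also want an upper bound on the declared length.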
Lastly, check here: https://github.com/joyent/node/wiki/modules. Find a module that parses a simple TCP protocol and seems to do it well, and read some of its code.
You should use the new streams2 API: http://nodejs.org/api/stream.html
Here are some very useful examples: https://github.com/substack/stream-handbook
https://github.com/lvgithub/stick
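As a rough illustration of the streams approach applied to the question's _:_:_ delimiter, here is a minimal sketch of a Transform stream that splits on a delimiter without converting the incoming data to strings. The class name DelimiterSplitter is invented for this example and is not taken from any of the modules linked above. Emitting whatever is left over when the stream ends is also an assumption; a stricter protocol might treat an unterminated tail as an error.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: emits one Buffer per delimited message.
class DelimiterSplitter extends Transform {
  constructor(delimiter) {
    // Object mode on the readable side so each message stays a separate chunk.
    super({ readableObjectMode: true });
    this.delimiter = Buffer.from(delimiter, 'utf8');
    this.pending = Buffer.alloc(0); // bytes seen since the last delimiter
  }

  _transform(chunk, encoding, callback) {
    this.pending = Buffer.concat([this.pending, chunk]);

    // Searching the accumulated bytes also catches a delimiter that was
    // split across two 'data' events.
    let index;
    while ((index = this.pending.indexOf(this.delimiter)) !== -1) {
      this.push(this.pending.subarray(0, index));
      this.pending = this.pending.subarray(index + this.delimiter.length);
    }
    callback();
  }

  _flush(callback) {
    // Assumption: a trailing message without a delimiter is still emitted.
    if (this.pending.length > 0) {
      this.push(this.pending);
    }
    callback();
  }
}
```

With a socket this would look like socket.pipe(new DelimiterSplitter('_:_:_')).on('data', handler), where handler receives each message as a Buffer. Because incoming chunks are appended to what is already pending, a message that arrives in several pieces is reassembled, which is the case the code in the question loses by overwriting the start of its buffer on every write.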