Line-oriented streams in Node.js
I'm developing a multi-process application using Node.js. In this application, a parent process will spawn a child process and communicate with it using a JSON-based messaging protocol over a pipe. I've found that large JSON messages may get "cut off", such that a single "chunk" emitted to the data listener on the pipe does not contain the full JSON message. Furthermore, small JSON messages may be grouped in the same chunk. Each JSON message will be delimited by a newline character, and so I'm wondering if there is already a utility that will buffer the pipe read stream such that it emits one line at a time (and hence, for my application, one JSON document at a time). This seems like it would be a pretty common use case, so I'm wondering if it has already been done.
I'd appreciate any guidance anyone can offer. Thanks.
Comments (4)
Maybe Pedro's carrier can help you?
My solution to this problem is to send JSON messages, each terminated with some special Unicode character: a character that you would never normally get in the JSON string. Call it TERM.
So the sender just does "JSON.stringify(message) + TERM;" and writes it.
The receiver then splits incoming data on TERM and parses the parts with JSON.parse(), which is pretty quick.
The trick is that the last message may not parse, so we simply save that fragment and add it to the beginning of the next message when it comes. Receiving code goes like this:
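The answer's own snippet does not appear here, so what follows is only a minimal sketch of the receiving side it describes, not the original code. The names `encodeMessage` and `createReceiver` are hypothetical. It assumes chunks arrive as strings (for example after calling `setEncoding('utf8')` on the stream), and instead of attempting to parse the last piece it always holds that piece back until its terminator arrives.

```javascript
// TERM as suggested in the answer: the Unicode replacement character.
const TERM = '\uFFFD';

// Sender side: serialize the message and append the terminator.
function encodeMessage(message) {
  return JSON.stringify(message) + TERM;
}

// Receiver side (hypothetical helper): returns a handler for string chunks.
// `fragment` lives in the closure, so it persists between data chunks.
function createReceiver(onMessage) {
  let fragment = '';
  return function onData(chunk) {
    const parts = (fragment + chunk).split(TERM);
    // The last piece has not seen its terminator yet
    // (it is '' when the chunk ended exactly on TERM).
    fragment = parts.pop();
    for (const part of parts) {
      if (part.length > 0) onMessage(JSON.parse(part));
    }
  };
}

// Possible wiring, assuming `child` came from child_process.spawn():
//   child.stdout.setEncoding('utf8');
//   child.stdout.on('data', createReceiver((msg) => console.log(msg)));
```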
Where "fragment" is defined somewhere where it will persist between data chunks.
But what is TERM? I have used the Unicode replacement character '\uFFFD'. One could also use the technique used by Twitter, where messages are separated by '\r\n' and tweets use '\n' for new lines and never contain '\r\n'.
I find this to be a lot simpler than messing with including lengths and the like.
The simplest solution is to send the length of the JSON data before each message as a fixed-length prefix (4 bytes?) and have a simple un-framing parser that buffers small chunks or splits bigger ones.
You can try node-binary to avoid writing the parser manually. Look at the scan(key, buffer) documentation example; it does exactly line-by-line reading.
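A hand-written version of the length-prefix idea could look roughly like the sketch below. It assumes a 4-byte big-endian unsigned length followed by the UTF-8 JSON body; the names `frame` and `createUnframer` are hypothetical, and it uses only Node's built-in `Buffer` rather than node-binary.

```javascript
// Sender side: 4-byte big-endian length prefix, then the UTF-8 JSON body.
function frame(message) {
  const body = Buffer.from(JSON.stringify(message), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Receiver side (hypothetical helper): returns a handler for Buffer chunks.
// Small chunks are accumulated; large chunks are split into several frames.
function createUnframer(onMessage) {
  let pending = Buffer.alloc(0);
  return function onData(chunk) {
    pending = Buffer.concat([pending, chunk]);
    while (pending.length >= 4) {
      const length = pending.readUInt32BE(0);
      if (pending.length < 4 + length) break; // body not fully received yet
      onMessage(JSON.parse(pending.toString('utf8', 4, 4 + length)));
      pending = pending.subarray(4 + length);
    }
  };
}
```

Because the length counts bytes rather than characters, the body may contain any text, including newlines or whatever terminator another scheme would reserve.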
As long as newlines (or whatever delimiter you use) will only delimit the JSON messages and not be embedded in them, you can use the following pattern:
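A minimal sketch of such a pattern, assuming the chunks are `Buffer`s from the child's stdout and the delimiter is '\n'. The helper name `createLineSplitter` is hypothetical; `StringDecoder` is Node's built-in decoder, used so that a multi-byte UTF-8 character split across two chunks is not corrupted.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: returns a handler for Buffer chunks that calls
// onLine once per complete line, without the trailing newline.
function createLineSplitter(onLine) {
  const decoder = new StringDecoder('utf8');
  let buffered = '';
  return function onData(chunk) {
    buffered += decoder.write(chunk);
    let newline;
    while ((newline = buffered.indexOf('\n')) !== -1) {
      const line = buffered.slice(0, newline);
      buffered = buffered.slice(newline + 1);
      if (line.length > 0) onLine(line);
    }
  };
}

// Possible wiring, assuming `child` came from child_process.spawn():
//   child.stdout.on('data', createLineSplitter((line) => {
//     const message = JSON.parse(line);
//   }));
```

The "not embedded" condition holds for output of `JSON.stringify` called without an indent argument, since newlines inside strings are escaped as `\n`. Node's built-in `readline` module, via `readline.createInterface({ input: child.stdout })` and its `'line'` event, is another way to get one line at a time.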