Using PARSE on a PORT! value

Posted 2024-10-01 17:33:41

I tried using PARSE on a PORT! and it does not work:

>> parse open %test-data.r [to end]  
** Script error: parse does not allow port! for its input argument

Of course, it works if you read the data in:

>> parse read open %test-data.r [to end]  
== true

...but it seems it would be useful to be able to use PARSE on large files without first loading them into memory.

Is there a reason why PARSE couldn't work on a PORT! ... or is it merely not implemented yet?


Comments (2)

蝶舞 2024-10-08 17:33:41

The easy answer is: no, we can't...

The way PARSE works, it may need to roll back to a prior part of the input string, which might in fact be the head of the complete input, when it meets the last character of the stream.

Ports copy their data to a string buffer as input is taken from them, so in fact there is never any "prior" string for PARSE to roll back to. It's like quantum physics... just by looking at it, it's not there anymore.

But as you know, in Rebol... "no" isn't an answer. ;-)

That being said, there is a way to parse data from a port as it's being grabbed, but it's a bit more work.

What you do is use a buffer, and

APPEND buffer COPY/part connection amount

Depending on your data, the amount could be 1 byte or 1 KB; use what makes sense.

Once the new input is added to your buffer, parse it and add logic to know whether you matched part of that buffer.

If something positively matched, you REMOVE/part what matched from the buffer, and continue parsing until nothing parses.

You then repeat the above until you reach the end of input.
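The buffering loop above can be sketched in Python (a hypothetical illustration, not the author's Rebol code; the `parse_stream` name and the newline framing are assumptions for the demo, standing in for `APPEND buffer COPY/part connection amount` and `REMOVE/part`):

```python
import io

def parse_stream(read_chunk, delimiter=b"\n"):
    """Yield complete messages from a stream without loading it whole."""
    buffer = b""
    while True:
        chunk = read_chunk(1024)          # the "amount": tune to your data
        if not chunk:                     # end of input (port closed)
            break
        buffer += chunk                   # APPEND buffer COPY/part ...
        while delimiter in buffer:        # keep parsing until nothing matches
            # split off one complete message; the REMOVE/part step
            message, buffer = buffer.split(delimiter, 1)
            yield message
    if buffer:                            # trailing partial message, if any
        yield buffer

# Demo: a BytesIO stands in for the open port
stream = io.BytesIO(b"one\ntwo\nthree")
messages = list(parse_stream(stream.read))
```

The inner loop is the key point: after each chunk arrives, consume every complete match before reading more, so the buffer never grows past one partial message.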

I've used this in a real-time EDI TCP server with an "always on" TCP port, in order to break up a (potentially) continuous stream of input data which actually piggybacks messages end to end.

Details

The best way to set up this system is to use /no-wait and loop until the port closes (you receive none instead of "").

Also make sure you have a way of checking for data-integrity problems (like a skipped byte or an erroneous message) while you are parsing; otherwise, you will never reach the end.

In my system, when the buffer grew beyond a specific size, I tried an alternate rule which skipped bytes until a possible pattern was found further down the stream. If one was found, an error was logged, the partial message stored, and an alert raised for the sysadmin to sort out the message.
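That recovery rule can be sketched as follows (a hypothetical Python illustration; the `resync` name, the size limit, and the delimiter framing are all assumptions, not the author's implementation):

```python
def resync(buffer, delimiter=b"\n", limit=4096):
    """If the buffer has grown past `limit` with no complete parse,
    skip bytes up to the next delimiter so parsing can resume.
    Returns (remaining_buffer, skipped_bytes); skipped_bytes is None
    when no recovery was needed, otherwise it should be logged."""
    if len(buffer) > limit and delimiter in buffer:
        skipped, rest = buffer.split(delimiter, 1)
        return rest, skipped          # log `skipped`, alert the sysadmin
    return buffer, None

rest, skipped = resync(b"garbled-partial-message\nGOOD", limit=8)
```

The point is that the fallback only fires when the normal rules have stalled, so well-formed input never pays the cost.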

HTH!

笨死的猪 2024-10-08 17:33:41

I think that Maxim's answer is good enough. At this moment, PARSE on a port is not implemented. I don't think it's impossible to implement it later, but we must solve other issues first.

Also, as Maxim says, you can do it even now, but it depends very much on what exactly you want to do.

You can certainly parse large files without reading them completely into memory. It's always good to know what you expect to parse. For example, large files, like files for music and video, are divided into chunks, so you can just use copy|seek to get those chunks and parse them.

Or, if you want to get just the titles of multiple web pages, you can read, say, the first 1024 bytes and look for the title tag there; if that fails, read more bytes and try again...

That's exactly what would have to be done to allow PARSE on a port natively anyway.
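The title-tag idea can be sketched in Python (a hypothetical illustration, not Rebol; `find_title`, the step size, and the overall limit are assumptions for the demo):

```python
import io
import re

def find_title(read_chunk, step=1024, limit=16384):
    """Grow a buffer chunk by chunk, stopping as soon as a complete
    <title> element appears, or when the limit / end of input is hit."""
    buf = b""
    while len(buf) < limit:
        chunk = read_chunk(step)      # read a little more of the page
        if not chunk:                 # end of input, no title found
            break
        buf += chunk
        match = re.search(rb"<title>(.*?)</title>", buf, re.DOTALL)
        if match:                     # found a complete tag: stop early
            return match.group(1)
    return None

# Demo: a BytesIO stands in for the open port
page = io.BytesIO(b"<html><head><title>Hello</title></head><body></body></html>")
title = find_title(page.read)
```

For a page whose title sits in the first chunk, only one read ever happens, which is the whole point of parsing incrementally.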

And feel free to add a WISH in the CureCode database: http://curecode.org/rebol3/
