Abusing Ragel, possibly need a new approach/tool
I'm trying to use Ragel to implement a simple yes/no FSM. Unfortunately, the language specification is the union of about a thousand regular expressions, and most of them contain one or more * operators. As a result, the number of possible states explodes, and it looks impossible to get Ragel to generate an FSM for my language. Is there a tool that can do what I need, or should I switch approaches? I need something better than checking each input string against every regular expression in turn. I could chop the thousand regular expressions into chunks of about 50, generate an FSM for each chunk, and run every input string against all the machines, but if there's a tool that can handle this kind of job without such a hack, I'd be pleased to hear of it.
Thanks!
Comments (1)
Well, I ended up breaking the machine into multiple machines to keep Ragel from eating all available memory. In fact, I had to split it across a couple of separate Ragel files, because the generated Java class had too many constants in it from the huge state tables. I'm still interested in hearing of a better solution, if anybody has one!
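For anyone wanting to picture the split, here is a minimal sketch of the chunking idea. It is not the actual Ragel setup: `java.util.regex.Pattern` stands in for the generated machines (it is a backtracking engine, not a table-driven FSM), and the class name `ChunkedMatcher`, the chunk size, and the sample expressions are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration: one compiled "machine" per chunk of expressions.
// Pattern is only a stand-in for a Ragel-generated FSM.
public class ChunkedMatcher {
    private final List<Pattern> machines = new ArrayList<>();

    public ChunkedMatcher(List<String> regexes, int chunkSize) {
        for (int start = 0; start < regexes.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, regexes.size());
            // Build the union (alternation) of the expressions in this chunk.
            StringBuilder union = new StringBuilder();
            for (String regex : regexes.subList(start, end)) {
                if (union.length() > 0) {
                    union.append('|');
                }
                union.append("(?:").append(regex).append(')');
            }
            machines.add(Pattern.compile(union.toString()));
        }
    }

    // Yes/no answer: true if any chunk's machine accepts the whole input.
    public boolean accepts(String input) {
        for (Pattern machine : machines) {
            if (machine.matcher(input).matches()) {
                return true;
            }
        }
        return false;
    }

    public int machineCount() {
        return machines.size();
    }

    public static void main(String[] args) {
        // Sample expressions are invented; the real spec has about a thousand.
        List<String> regexes = Arrays.asList("ab*c", "x*yz", "(foo)*bar");
        ChunkedMatcher matcher = new ChunkedMatcher(regexes, 2);
        System.out.println(matcher.machineCount() + " machines");
        System.out.println(matcher.accepts("abbbc"));
        System.out.println(matcher.accepts("nope"));
    }
}
```

Each input costs one pass per chunk instead of one pass per expression, so the chunk size trades machine size against the number of passes.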