C#/.NET Lexer Generator
I'm looking for a decent lexical scanner generator for C#/.NET -- something that supports Unicode character categories, and generates somewhat readable & efficient code. Anyone know of one?
EDIT: I need support for Unicode categories, not just Unicode characters. There are currently 1421 characters in the Lu (Letter, Uppercase) category alone, and I need to match many different categories very specifically, so I would rather not hand-write the character sets that would require.
Also, actual generated code is a must -- this rules out tools that generate a binary file that is then used with a driver (e.g. GOLD).
EDIT: ANTLR does not support Unicode categories yet. There is an open issue for it, though, so it might fit my needs someday.
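To make the category requirement concrete, here is a minimal sketch of deriving the character ranges for one category from .NET itself rather than writing them by hand. It relies on `CharUnicodeInfo.GetUnicodeCategory` from `System.Globalization`; the `CategoryRanges` class and its `For` method are hypothetical names introduced only for illustration, and the scan covers only the Basic Multilingual Plane (single `char` values). The ranges it reports depend on the Unicode version shipped with the runtime, so the totals may differ from the figure quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not part of any lexer generator): lists the
// contiguous BMP ranges that belong to a given Unicode category.
static class CategoryRanges
{
    public static List<(char First, char Last)> For(UnicodeCategory category)
    {
        var ranges = new List<(char First, char Last)>();
        int start = -1; // -1 means "not currently inside a matching run"

        for (int c = 0; c <= char.MaxValue; c++)
        {
            bool match = CharUnicodeInfo.GetUnicodeCategory((char)c) == category;
            if (match && start < 0)
            {
                start = c; // a new run begins
            }
            else if (!match && start >= 0)
            {
                ranges.Add(((char)start, (char)(c - 1))); // the run ended at c - 1
                start = -1;
            }
        }

        // Close a run that extends to the end of the BMP.
        if (start >= 0) ranges.Add(((char)start, char.MaxValue));
        return ranges;
    }

    static void Main()
    {
        // Print each range in a \uXXXX-\uXXXX form that can be pasted
        // into a character-class definition.
        foreach (var (first, last) in For(UnicodeCategory.UppercaseLetter))
            Console.WriteLine($"\\u{(int)first:X4}-\\u{(int)last:X4}");
    }
}
```

The output could be pasted into a hand-maintained character class, but a generator with built-in category support avoids regenerating the list whenever the Unicode tables change.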
Comments (4)
I agree with @David Robbins, ANTLR is probably your best bet. However, the generated ANTLR code does need a separate runtime library, because it relies on string parsing and other common functionality that lives in that library. ANTLR generates a lexer AND a parser.
On a side note:
ANTLR is great... I wrote a 400+ line grammar that generates over 10k lines of C# code to efficiently parse a language. This included built-in error checking for every possible thing that could go wrong while parsing the language. Try to do that by hand, and you'll never keep up with the bugs.
I just found this
http://www.seclab.tuwien.ac.at/projects/cuplex/lex.htm
It says that it's configurable enough to support Unicode ;-).
Herber
GPLEX seems to support your requirements.
The two solutions that come to mind are ANTLR and GOLD. ANTLR has a GUI-based grammar designer, and an excellent sample project in C# can be found here.