Text parsing library
A colleague of mine is working on a universal text parsing library based on C# lambdas. The core looks cool, but unfortunately for me he has hardcoded a grammar specific to his own task, math expression evaluation. So I will not use it as I had intended before I saw the API, and I'm now looking for another library that meets at least some of my requirements. It has to:
- Be able to load a grammar from an external file, say XML, YAML or JSON.
- Return an AST, built from the grammar and the parse tree of an arbitrary input text.
- Work fast enough to load a C# grammar and then parse a large code file.
I'd prefer a library whose grammar file format is simple enough that writing a grammar for math expressions is easy, and that is open source and written in C# or C++.
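To make the three requirements concrete, here is a minimal sketch of the kind of workflow described above: a grammar stored as JSON is loaded at run time and drives a generic backtracking parser that returns a tree. It is written in Python only for brevity. The JSON grammar format and the names `load_grammar`, `tokenize` and `parse` are invented for this illustration and do not come from any library mentioned here. The grammar uses right recursion, so the resulting tree is right-nested, and the naive backtracking would be far too slow for a full C# grammar.

```python
import json
import re

# Hypothetical grammar format, made up for this sketch: named token
# patterns plus rules given as ordered lists of alternatives.
GRAMMAR_JSON = """
{
  "start": "expr",
  "tokens": {"NUM": "[0-9]+(?:[.][0-9]+)?"},
  "rules": {
    "expr":   [["term", "+", "expr"], ["term", "-", "expr"], ["term"]],
    "term":   [["factor", "*", "term"], ["factor", "/", "term"], ["factor"]],
    "factor": [["NUM"], ["(", "expr", ")"]]
  }
}
"""

def load_grammar(text):
    """Parse the JSON grammar description into a plain dict."""
    return json.loads(text)

def tokenize(grammar, source):
    """Split source into (kind, text) pairs using the grammar's tokens."""
    tokens, rules = grammar["tokens"], grammar["rules"]
    # Any symbol that is neither a rule nor a named token is a literal.
    literals = {sym for alts in rules.values() for alt in alts for sym in alt
                if sym not in rules and sym not in tokens}
    parts = [f"(?P<{name}>{pattern})" for name, pattern in tokens.items()]
    parts += [re.escape(lit) for lit in sorted(literals, key=len, reverse=True)]
    scanner = re.compile("|".join(parts))
    out, pos = [], 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        m = scanner.match(source, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        # Named tokens report their name; literals report their own text.
        out.append((m.lastgroup or m.group(), m.group()))
        pos = m.end()
    return out

def parse(grammar, source):
    """Return a tree of (rule_name, children) tuples for the whole input."""
    rules, tokens = grammar["rules"], tokenize(grammar, source)

    def match(symbol, pos):
        # Returns (node, next_pos) on success, None on failure.
        if symbol in rules:
            for alternative in rules[symbol]:  # ordered choice
                children, cur = [], pos
                for part in alternative:
                    result = match(part, cur)
                    if result is None:
                        break
                    node, cur = result
                    children.append(node)
                else:
                    return (symbol, children), cur
            return None
        if pos < len(tokens) and tokens[pos][0] == symbol:
            return tokens[pos][1], pos + 1
        return None

    result = match(grammar["start"], 0)
    if result is None or result[1] != len(tokens):
        raise SyntaxError("input does not match the grammar")
    return result[0]

if __name__ == "__main__":
    print(parse(load_grammar(GRAMMAR_JSON), "1 + 2 * 3"))
```

Swapping the JSON text for the contents of a file is the only change needed to load the grammar externally; supporting a different language would mean changing only the grammar data, not the parser code.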
Regards,
--
UPDATED: point 2 has been corrected.
Comments (1)
You might check out Text Transformer, which claims to be some kind of universal text processing language. I have no specific experience with it.
Building robust language front ends and usable processing tools is actually a lot of work.
If you want to process computer languages in a generic way, you might consider our DMS Software Reengineering Toolkit, a kind of generalized compiler technology for parsing, analyzing, transforming, and/or generating code (or any other kind of formal document).
DMS accepts arbitrary context-free grammars for languages, automatically builds an AST with no additional specification effort on your part, and is designed to handle not only large files but very large sets of files in a single computation. Normally, people who want to process code need pattern recognition, code analysis and code transformation capabilities; DMS has all of these built in. It also has a variety of predefined, mature grammars for a wide range of computer languages, both well-known (C, C++, C#, COBOL, Java, JavaScript, ...) and otherwise (Natural, EGL, Python, MATLAB, ...), and has been used to carry out massive automated analyses and transformations on programs in these various languages.
DMS does not meet your open-source or C#/C++ implementation requirements. It is implemented as a set of domain-specific languages for describing grammars, analyzers, transformations, prettyprinters, and scripting; these languages allow parallel execution, so complex analyses can run faster than single-threaded programs.