Getting the metadata needed to compile a .NET language
I'm toying with a proof of concept for a new language targeting the .NET platform. I've got the lexing and parsing pretty much sorted. Lazy as I am, I'm simply going to generate C# using the CodeDom rather than emitting IL at this point, and then just compile that.
However, in order to generate the correct C# for my syntax, I need all the metadata of the referenced assemblies to be available to the "compiler" so I can look up all the classes, methods, parameters, interfaces, etc. What's the best way to go about this?
I guess I could load all the assemblies into their own appdomain and query them through reflection, but that seems a bit clunky. Another way, I guess, would be to extract all the metadata into something that can be loaded and queried easily and efficiently.
Another way would be to load the XML metadata files from the system .NET Framework directory, but that seems a bit clunky as well.
It seems this should be a problem that the default compiler itself would address. Am I missing some obvious way to do this?
EDIT
This CCI metadata might be the way to do it, but I'm still curious how it's done by the compiler.
Comments (4)
The C# and VB compilers have an internal library that reads the metadata out of the PE format and interprets it in its raw form. I think many of the managed libraries do the same thing. The metadata format is open and well documented (just search for the ECMA CLI spec), and reading it directly is safer than going through the CLR (since the code is read as bits and not loaded into a runtime) and generally faster.
If/when the teams finish their discussed "compiler as a service" long-term plans, there might be something that's actually available directly from the compiler team, but that's: a) going to be at some vague and undetermined time in the future, and b) purely speculation on my part. So for now, I'd look at some of the libraries that other people are pointing to.
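To illustrate what "reading the metadata out of the PE format" can look like from managed code, here is a minimal sketch. It assumes the System.Reflection.Metadata package, which is not mentioned anywhere in this thread; the class and method names (RawMetadataDump, ListTypeNames) are made up for the example. The file is opened as plain data and nothing is loaded into the runtime.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, not part of any library.
public static class RawMetadataDump
{
    // Returns the full names of all types defined in the assembly file.
    public static List<string> ListTypeNames(string assemblyPath)
    {
        var names = new List<string>();
        using (FileStream stream = File.OpenRead(assemblyPath))
        using (var peReader = new PEReader(stream))
        {
            // Native DLLs have no CLI metadata; return an empty list for them.
            if (!peReader.HasMetadata)
                return names;

            MetadataReader reader = peReader.GetMetadataReader();
            foreach (TypeDefinitionHandle handle in reader.TypeDefinitions)
            {
                TypeDefinition type = reader.GetTypeDefinition(handle);
                // Names are stored in the string heap and fetched by handle.
                string ns = reader.GetString(type.Namespace);
                string name = reader.GetString(type.Name);
                names.Add(ns.Length == 0 ? name : ns + "." + name);
            }
        }
        return names;
    }
}
```

The returned list also contains nested types (by their simple name) and the `<Module>` pseudo-type, since the type definition table is walked as-is. A real symbol table would go on to read methods, parameters and signatures in the same handle-based style.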
You can try Assembly.ReflectionOnlyLoad. This will only load the requested assembly, without loading dependencies. However, unloading is still impossible (without unloading a whole appdomain).
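A rough sketch of the reflection-only approach follows. It assumes the classic .NET Framework, since the reflection-only load context is not supported on .NET Core and later. The names ReflectionOnlyDump and ListPublicMethods are invented for the example. Because dependencies are not loaded automatically, the sketch hooks ReflectionOnlyAssemblyResolve and loads them on demand by name.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper, not part of any library. .NET Framework only.
public static class ReflectionOnlyDump
{
    // Returns one "Type::Method(ParamType name, ...)" line per public method.
    public static List<string> ListPublicMethods(string assemblyPath)
    {
        // Resolve referenced assemblies lazily into the reflection-only context.
        ResolveEventHandler resolver =
            (sender, args) => Assembly.ReflectionOnlyLoad(args.Name);
        AppDomain.CurrentDomain.ReflectionOnlyAssemblyResolve += resolver;
        try
        {
            Assembly assembly = Assembly.ReflectionOnlyLoadFrom(assemblyPath);
            var result = new List<string>();
            const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                       BindingFlags.Static | BindingFlags.DeclaredOnly;
            foreach (Type type in assembly.GetExportedTypes())
            {
                foreach (MethodInfo method in type.GetMethods(flags))
                {
                    var parameters = new List<string>();
                    foreach (ParameterInfo p in method.GetParameters())
                        parameters.Add(p.ParameterType.Name + " " + p.Name);
                    result.Add(type.FullName + "::" + method.Name +
                               "(" + string.Join(", ", parameters) + ")");
                }
            }
            return result;
        }
        finally
        {
            AppDomain.CurrentDomain.ReflectionOnlyAssemblyResolve -= resolver;
        }
    }
}
```

The loaded assembly stays in the appdomain for its lifetime, as noted above, so a long-running compiler host would likely want to do this in a separate appdomain that it can unload.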
The best approach with a fully managed library would be to use Cecil, as described on the project page.
It's maintained and open source, and the license makes it usable even in commercial projects.
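As a minimal sketch of how Cecil could feed a compiler's symbol lookup, the following lists public methods with their parameters. It assumes the Mono.Cecil package in version 0.10 or later (where AssemblyDefinition is disposable); CecilDump and ListMethodSignatures are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;

// Hypothetical helper, not part of Cecil itself.
public static class CecilDump
{
    // Returns one "Type::Method(ParamType name, ...)" line per public method
    // of every public top-level type in the assembly.
    public static List<string> ListMethodSignatures(string assemblyPath)
    {
        var result = new List<string>();
        // ReadAssembly parses the file as data; nothing is loaded into the CLR.
        using (AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly(assemblyPath))
        {
            foreach (TypeDefinition type in assembly.MainModule.Types)
            {
                if (!type.IsPublic)
                    continue;
                foreach (MethodDefinition method in type.Methods.Where(m => m.IsPublic))
                {
                    string parameters = string.Join(", ",
                        method.Parameters.Select(p => p.ParameterType.FullName + " " + p.Name));
                    result.Add(type.FullName + "::" + method.Name + "(" + parameters + ")");
                }
            }
        }
        return result;
    }
}
```

MainModule.Types only covers top-level types; nested types hang off each TypeDefinition's NestedTypes collection and would need a recursive walk.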
Compilers will typically load assemblies like any other data file (using something like the Cecil or CCI metadata libraries mentioned already). It's faster and uses less memory than the runtime's reflection support.