Which language idioms/paradigms/features make it difficult to add support for "type providers"?

Posted 2024-12-04 17:12:29

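F# 3.0 adds type providers.

I would like to know whether this language feature could be added to other languages that run on the CLR, such as C#, or whether it only suits a more functional / less OO programming style?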

萌化 2024-12-11 17:12:29

As Tomas says, it is theoretically straightforward to add this kind of feature to any statically-typed language (though still a lot of grunt-work).

I am not a meta-programming expert, but @SK-logic asks why not a general compile-time meta-programming system instead, and I shall try to answer. I don't think you can easily achieve what you can do with F# type providers using meta-programming, because F# type providers can be lazy and dynamically interactive at design-time. Let's take an example that Don has demoed in one of his earlier videos: a Freebase type provider. Freebase is kind of like a schematized, programmable Wikipedia; it has data on everything. So you can end up writing code along the lines of

for e in Freebase.Science.``Chemical Elements`` do
    printfn "%d: %s - %s" e.``Atomic number`` e.Name e.Discoverer.Name

or whatnot (I don't have the exact code offhand), but just as easily write code that gets information about baseball statistics, or when famous actors have been in drug rehab facilities, or a zillion other types of information available through Freebase.

From an implementation point of view, it is infeasible to generate a schema for all of Freebase and bring it into .NET a priori; you can't just do one compile-time step at the beginning to set all this up. You can do this for small data sources, and in fact many other type providers use this strategy, e.g. a SQL type provider gets pointed at a database and generates .NET types for all the types in that database. But this strategy does not work for large cloud data stores like Freebase, because there are too many interrelated types. If you tried to generate .NET metadata for all of Freebase, you'd find that there are literally millions of types (one of which is ChemicalElement, with AtomicNumber, Discoverer, Name and many other fields), so many that you need more memory than is available to a 32-bit .NET process just to represent the entire type schema.

So the F# type-provider strategy is an API architecture that allows type providers to supply information on demand, running at design-time within the IDE. Until you type e.g. Freebase.Science., the type provider does not need to know about the entities under the science categories, but once you do press . after Science, the type provider can go and query the APIs to learn one more level of the overall schema, to know what categories exist under Science, one of which is ChemicalElements. And then as you try to "dot into" one of those, it will discover that elements have atomic numbers and whatnot. So the type provider lazily fetches just enough of the overall schema to deal with the exact code the user happens to be typing into the editor at that moment in time. As a result, the user still has the freedom to explore any part of the universe of information, but any one source code file or interactive session will only explore a tiny fraction of what is available. When it comes time to compile/codegen, the compiler need only generate enough code to accommodate exactly the bits that the user has actually used in their code, rather than the potentially huge runtime bits to afford the possibility of talking to the whole data store.
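
To make the "fetch one more level when the user presses ." idea concrete, here is a minimal sketch in plain F#. It is not the real type provider API; it only models the laziness. The names fetchChildren, SchemaNode, MemberNames and Dot are hypothetical, and the hard-coded schema stands in for what would really be a remote Freebase query.

```fsharp
// Hypothetical stand-in for a remote schema service such as Freebase.
// Each call models one network round trip and returns only the direct
// children of the requested path. The data here is made up for illustration.
let mutable remoteCalls = 0

let fetchChildren (path: string) : string list =
    remoteCalls <- remoteCalls + 1
    match path with
    | "Freebase" -> [ "Science"; "Sports"; "Film" ]
    | "Freebase.Science" -> [ "Chemical Elements"; "Astronomy" ]
    | "Freebase.Science.Chemical Elements" -> [ "Atomic number"; "Name"; "Discoverer" ]
    | _ -> []

// A schema node whose children are fetched on first access and then cached.
type SchemaNode(path: string) =
    let loadChildren () =
        fetchChildren path
        |> List.map (fun name -> (name, SchemaNode(path + "." + name)))
    let children = lazy (loadChildren ())

    member this.Path = path

    // What an editor would ask for when the user presses "." after this node.
    member this.MemberNames = children.Value |> List.map fst

    // "Dot into" one child; this forces only this node's own children.
    member this.Dot(name: string) =
        children.Value |> List.find (fun (n, _) -> n = name) |> snd

let root = SchemaNode("Freebase")
// Constructing the root has not touched the remote service yet.
printfn "calls after construction: %d" remoteCalls

let elements = root.Dot("Science").Dot("Chemical Elements")
printfn "%A" elements.MemberNames
// Only the levels on the explored path were fetched (root, Science,
// Chemical Elements); Sports, Film and Astronomy were never expanded.
printfn "calls after exploring: %d" remoteCalls
```

A real provider would hand this information to the compiler and IDE through the type provider interfaces rather than through an ordinary class, but the shape is the same: each level of the schema is materialized only when something asks for it.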

(Maybe you can do that with some of today's meta-programming facilities now, I don't know, but the ones I learned about in school a long while back could not have easily handled this.)

无法回应 2024-12-11 17:12:29

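As Brian and Tomas point out, there is nothing particularly "functional" about this feature. It is just a particularly slick way of providing metadata to the compiler.

The C# design team has been kicking around ideas like this for a long time. Several years before I joined the C# team, there was a proposed feature called "type blueprints" (or something like that), whereby a combination of XML documents, XML schemas and custom code that provided type metadata could be used by the C# compiler. I don't recall the details, and obviously it never came to fruition. (Though it did influence the design and implementation of the document format of Visual Studio Tools for Office, which I was working on at the time.)

In any event, we have no plans in the immediate future to add such a feature to C#, but we are watching with great interest to see whether it solves customer problems well in F#.

(As always, Eric's musings about hypothetical future features of unannounced and entirely hypothetical products are for entertainment purposes only.)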
