编译器可以在没有有关纯度的类型信息的情况下自动检测纯函数吗?

发布于 2024-12-25 19:56:12 字数 228 浏览 2 评论 0原文

所以我和我的朋友争论,他声称像 GCC 这样的编译器可以自动检测纯函数,而不需要任何类型信息。我对此表示怀疑。

像 D 或 Haskell 这样的语言在其类型系统中具有纯粹性,程序员明确定义什么函数是纯粹的或不是纯粹的。纯函数没有副作用,因此可以很容易地并行化。

所以问题是:这一切有必要吗?编译器是否可以在没有任何元或类型信息的情况下,仅通过假设任何自动执行 IO 或访问全局变量的内容不是纯粹的来检测纯度?

So I'm arguing with my friend who claims that a compiler like GCC can detect a pure function automatically without any type information. I doubt that.

Languages like D or Haskell have purity in their type systems and a programmer explicitly defines what function is pure or not. A pure function has no side effects and can therefore very easily be parallelized.

So the question is: Is this all necessary or not? Could a compiler detect purity, without any meta or type information, just by assuming that anything that does IO or accesses global variables automatically is not pure?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

绿光 2025-01-01 19:56:12

当然,在某些情况下您可以检测纯函数。例如,

int f(int x)
{
    return x*2;
}

可以通过简单的静态分析来检测为纯的。一般来说,困难在于做到这一点,检测使用“内部”状态但外部纯净的接口基本上是不可能的。

GCC 确实有警告选项 -Wsuggest-attribute=pure-Wsuggest-attribute=const,建议可能是 pureconst 候选的函数 属性。我不确定它是否选择保守(即缺少许多纯函数,但从不建议它用于非纯函数)或让用户决定。

请注意,GCC 对 pure 的定义是“仅依赖于参数和全局变量”:

许多函数除了返回值之外没有任何影响,并且它们的返回值仅取决于参数和/或全局变量。这样的函数可以像算术运算符一样进行公共子表达式消除和循环优化。这些函数应使用属性 pure 进行声明。

GCC 手册

严格的纯度,即相同参数的相同结果所有情况下,都由 const 属性表示,但这样的函数甚至不能取消引用传递给它的指针。因此,pure 函数的并行化机会是有限的,但与用 Haskell 等语言编写的纯函数相比,可以使用 const 的函数要少得多。

顺便说一句,自动并行化纯函数并不像您想象的那么容易;困难的部分是决定要并行化什么。并行计算成本太低,而且开销使其毫无意义。并行化不够,您就无法获得好处。我不知道有任何实用的函数式语言实现可以因此自动并行化,尽管像 repa 这样的库并行化许多幕后操作在用户代码中没有明确的并行性。

Sure, you can detect pure functions in some cases. For instance,

int f(int x)
{
    return x*2;
}

can be detected as pure with simple static analysis. The difficulty is doing this in general, and detecting interfaces which use "internal" state but are externally pure is basically impossible.

GCC does have the warning options -Wsuggest-attribute=pure and -Wsuggest-attribute=const, which suggest functions that might be candidates for the pure and const attributes. I'm not sure whether it opts to be conservative (i.e. missing many pure functions, but never suggesting it for a non-pure function) or lets the user decide.

Note that GCC's definition of pure is "depending only on arguments and global variables":

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure.

GCC manual

Strict purity, i.e. the same results for the same arguments in all circumstances, is represented by the const attribute, but such a function cannot even dereference a pointer passed to it. So the parallelisation opportunities for pure functions are limited, but much fewer functions can be const compared to the pure functions you can write in a language like Haskell.

By the way, automatically parallelising pure functions is not as easy as you might think; the hard part becomes deciding what to parallelise. Parallelise computations that are too cheap, and overhead makes it pointless. Don't parallelise enough, and you don't reap the benefits. I don't know of any practical functional language implementation that does automatic parallelisation for this reason, although libraries like repa parallelise many operations behind the scenes without explicit parallelism in the user code.

紫竹語嫣☆ 2025-01-01 19:56:12

还有一个问题。考虑

int isthispure(int i) {
   if (false) return getchar();
   return i + 42;
}

该函数实际上是纯的,尽管它包含不纯的代码,但无法到达此代码。
现在假设 falseg(i) 替换,但我们非常确定 g(i) 为 false(例如,g 可能会检查其参数是否为 Lychrel 数)。
为了证明 isthispure 确实是纯的,编译器必须证明不存在 Lychrel 数。

(我承认这是一个相当理论化的考虑。人们也可以决定,如果一个函数包含任何不纯的代码,它本身就是不纯的。但恕我直言,C 类型系统并不能证明这一点。)

There is another problem. Consider

int isthispure(int i) {
   if (false) return getchar();
   return i + 42;
}

The function is effectively pure, though it contains impure code, but this code cannot be reached.
Now suppose false is replaced by g(i) but we know quite sure that g(i) is false (for example, g might check if its argument is a Lychrel number).
To prove that isthispure is indeed pure, the compiler would have to prove that no Lychrel numbers exist.

(I admit that this is a quite theoretical consideration. One could also decide that if a function contains any impure code, it is itself impure. But this is not justified by the C type system, IMHO.)

暗喜 2025-01-01 19:56:12

确定一个函数是否是纯函数(即使在 GCC 使用的有限意义上)相当于暂停问题,因此答案是“不适用于任意函数”。可以自动检测某些函数是纯函数,其他函数不是纯函数,并将其余函数标记为“未知”,这在某些情况下允许自动并行化。

根据我的经验,即使是程序员也不太擅长弄清楚这些事情,所以我希望类型系统能够帮助我跟踪它,而不仅仅是优化器。

Determining if a function is pure (even in the limited sense used by GCC) is equivalent to the halting problem, so the answer is "not for arbitrary functions." It is possible to automatically detect that some functions are pure, others are not pure, and flag the rest as "unknown", which allows for automatic parallelization in some cases.

In my experience, even programmers aren't very good at figuring out such things, so I want the type system to help keep track of it for me, not just for the optimizer.

指尖微凉心微凉 2025-01-01 19:56:12

我在写一篇文章 比较 C# 和 C++ 性能时发现 Visual C++ 确实可以检测中等复杂度的纯函数,同时调用计算的函数一个多项式

我在循环中调用多项式函数来消耗时钟上的时间。编译器优化了该调用,使其在循环开始之前运行一次,并在循环内重复使用结果。要做到这一点,就必须知道没有副作用。

不过,我必须说,能够通过将函数标记为纯函数来保证编译器可以进行优化,并且它也可以作为文档的一种形式,这很好。

I discovered while writing an article comparing C# and C++ performance that Visual C++ can indeed detect a pure function of moderate complexity, while calling a function that computed a polynomial.

I called the polynomial function in a loop to eat up time on the clock. The compiler optimized the call to run once before the loop started and re-use the result within the loop. To do that it would have to know that there are no side-effects.

I must say though, is nice to be able to guarantee that the compiler can do an optimization, by marking a function as pure, and it serves as a form of documentation, too.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文