当前位置：文江博客话题详情

C99 预处理器图灵完整吗？

发布于 2024-09-07 13:36:27 字数 178 浏览 14 评论 0原文

在发现 Boost 预处理器的功能之后，我发现自己想知道：C99 预处理器图灵完整吗？

如果没有的话，缺少什么才没有资格呢？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

别挽留 2024-09-14 13:36:28

这里是一个滥用预处理器来实现图灵机的示例。请注意，需要外部构建脚本将预处理器的输出反馈回其输入，因此预处理器本身并不是图灵完整的。尽管如此，这仍然是一个有趣的项目。

从上述链接项目的描述来看：

预处理器不是图灵完备的，至少不是如果
该程序仅预处理一次。即使
允许程序包含其自身。（原因是
对于给定的程序，预处理器只有有限的
状态数，加上由以下位置组成的堆栈
该文件已包含在.这只是一个下推自动机。）

Paul Fultz II 的答案令人印象深刻，而且肯定比我想象的预处理器所能得到的更接近，但它不是真正的图灵机。 C 预处理器有一定的限制，阻止它像图灵机一样执行任意程序，即使你有无限的内存和时间。 C 规范的第 5.2.4.1 节给出了C 编译器的以下最低限制：

完整表达式内带括号的表达式有 63 层嵌套
内部标识符或宏名称中的 63 个重要初始字符
在一个预处理翻译单元中同时定义 4095 个宏标识符
逻辑源代码行中包含 4095 个字符

下面的计数器机制要求每个值都有一个宏定义，因此宏定义限制将限制您可以循环的次数（EVAL(REPEAT(4100, M, ~)) 将产生未定义的行为）。这实质上限制了您可以执行的程序的复杂性。多级扩展的嵌套和复杂性也可能会遇到其他限制之一。

这与“无限内存”限制有根本的不同。在这种情况下，规范特别指出，符合标准的 C 编译器只需遵守这些限制，即使它具有无限的时间、内存等。任何超过这些限制的输入文件都可以以不可预测或未定义的方式进行处理（或直接拒绝）。某些实现可能有更高的限制，或者根本没有限制，但这被认为是“特定于实现的”，而不是标准的一部分。可能可以使用 Paul Fultz II 的方法在某些特定的编译器实现上实现类似图灵机的东西，没有有限的限制，但一般意义上“这可以在任何任意的、符合标准的 C99 预处理器”，答案是否定的。由于这里的限制是语言本身内置的，而不仅仅是我们无法构造无限计算机的副作用，所以我说这破坏了图灵完备性。

Here is an example of abusing the preprocessor to implement a Turing machine. Note that an external build script is needed to feed the preprocessor's output back into its input, so the preprocessor in and of itself isn't Turing complete. Still, it's an interesting project.

From the description of the afore-linked project:

the preprocessor is not Turing complete, at least not if
the program is preprocessed only once. This is true even if
the program is allowed to include itself. (The reason being
that for a given program, the preprocessor has only a finite
number of states, plus a stack consisting of the places which
the file has been included from. This is only a push-down automaton.)

The answer by Paul Fultz II is quite impressive and certainly closer than I thought the preprocessor could ever get, but it's not a true Turing machine. The C preprocessor has certain limits that prevent it from executing an arbitrary program like a Turing machine could, even if you had infinite memory and time. Section 5.2.4.1 of the C spec gives the following minimum limits for a C compiler:

63 nesting levels of parenthesized expressions within a full expression
63 significant initial characters in an internal identifier or a macro name
4095 macro identifiers simultaneously defined in one preprocessing translation unit
4095 characters in a logical source line

The counter mechanism below requires a macro definition per value, so the macro definition limit will limit how many times you can loop (EVAL(REPEAT(4100, M, ~)) would yield undefined behavior). This essentially puts a cap on the complexity of the program that you can execute. The nesting and complexity of the multi-level expansions may hit one of the other limits as well.

This is fundamentally different than the "infinite memory" limitation. In this case, the spec specifically says that a standards-conforming C compiler is only required to conform to these limits, even if it has infinite time, memory, etc. Any input file exceeding these limits can be processed in an unpredictable or undefined manner (or outright rejected). Some implementations may have higher limits, or no limits at all, but that's considered "implementation-specific" and not part of the standard. It may be possible to use Paul Fultz II's method to implement something like a Turing machine on some specific compiler implementation that has no finite limits, but in a general sense of "can this be done on any arbitrary, standards-conforming C99 pre-processor", the answer is no. Since the limit here is built into the language itself and not simply a side-effect of our inability to construct an infinite computer, I say that breaks Turing completeness.

回复收藏 0 原文

季末如歌 2024-09-14 13:36:28

为了实现图灵完备，需要定义可能永远不会完成的递归——我们称之为 mu -递归运算符。

为了定义这样的运算符，需要定义标识符的无限空间（以防每个标识符被评估有限次数），因为人们无法先验地知道结果的时间上限。被发现。由于代码中的运算符数量有限，因此需要能够检查无限数量的可能性。

因此，此类函数无法由 C 预处理器计算，因为在 C 预处理器中定义的宏数量有限，并且每个宏仅扩展一次。

C 预处理器使用 Dave Prosser 算法（由 Dave Prosser 于 1984 年为 WG14 团队编写）。在此算法中，宏在第一次扩展时被涂成蓝色；递归调用（或相互递归调用）不会扩展它，因为它在第一次扩展开始时已经被涂成蓝色。因此，使用有限数量的预处理行，不可能无限调用函数（宏），这正是 mu 递归运算符的特征。

C 预处理器只能计算 sigma-recursive 运算符。

详细信息请参见Marvin L. Minsky (1967) -- Computation: Finite and 的计算过程Infinite Machines，Prentice-Hall, Inc. 新泽西州恩格尔伍德悬崖等。

回复收藏 0 原文

零度℉ 2024-09-14 13:36:28

它在限制范围内是图灵完备的（所有计算机都是如此，因为它们没有无限的 RAM）。查看可以使用 Boost 预处理器执行的操作。

针对问题编辑进行编辑：

Boost 的主要限制是特定于编译器的最大宏扩展深度。另外，实现递归的宏（FOR...、ENUM...等）并不是真正的递归，它们只是由于一堆几乎相同的宏而显得如此。从整体上看，这种限制与实际递归语言中的最大堆栈大小没有什么不同。

对于有限的图灵完整性（图灵兼容性？）真正必要的唯一两件事是迭代/递归（等效构造）和条件分支。

回复收藏 0 原文

一紙繁鸢 2024-09-14 13:36:27

宏不会直接递归扩展，但我们可以通过一些方法来解决这个问题。

在预处理器中执行递归的最简单方法是使用延迟表达式。延迟表达式是需要更多扫描才能完全扩展的表达式：

#define EMPTY()
#define DEFER(id) id EMPTY()
#define OBSTRUCT(...) __VA_ARGS__ DEFER(EMPTY)()
#define EXPAND(...) __VA_ARGS__

#define A() 123
A() // Expands to 123
DEFER(A)() // Expands to A () because it requires one more scan to fully expand
EXPAND(DEFER(A)()) // Expands to 123, because the EXPAND macro forces another scan

为什么这很重要？当宏被扫描和扩展时，它会创建一个禁用上下文。此禁用上下文将导致引用当前扩展宏的标记被涂成蓝色。因此，一旦它被涂成蓝色，宏将不再扩展。这就是宏不递归扩展的原因。然而，禁用上下文仅在一次扫描期间存在，因此通过推迟扩展，我们可以防止宏被涂成蓝色。我们只需要对表达式应用更多扫描。我们可以使用这个 EVAL 宏来做到这一点：

#define EVAL(...)  EVAL1(EVAL1(EVAL1(__VA_ARGS__)))
#define EVAL1(...) EVAL2(EVAL2(EVAL2(__VA_ARGS__)))
#define EVAL2(...) EVAL3(EVAL3(EVAL3(__VA_ARGS__)))
#define EVAL3(...) EVAL4(EVAL4(EVAL4(__VA_ARGS__)))
#define EVAL4(...) EVAL5(EVAL5(EVAL5(__VA_ARGS__)))
#define EVAL5(...) __VA_ARGS__

现在，如果我们想使用递归实现一个 REPEAT 宏，首先我们需要一些递增和递减运算符来处理状态：

#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__

#define INC(x) PRIMITIVE_CAT(INC_, x)
#define INC_0 1
#define INC_1 2
#define INC_2 3
#define INC_3 4
#define INC_4 5
#define INC_5 6
#define INC_6 7
#define INC_7 8
#define INC_8 9
#define INC_9 9

#define DEC(x) PRIMITIVE_CAT(DEC_, x)
#define DEC_0 0
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC_9 8

接下来我们需要还有一些宏来执行逻辑：

#define CHECK_N(x, n, ...) n
#define CHECK(...) CHECK_N(__VA_ARGS__, 0,)

#define NOT(x) CHECK(PRIMITIVE_CAT(NOT_, x))
#define NOT_0 ~, 1,

#define COMPL(b) PRIMITIVE_CAT(COMPL_, b)
#define COMPL_0 1
#define COMPL_1 0

#define BOOL(x) COMPL(NOT(x))

#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t

#define IF(c) IIF(BOOL(c))

#define EAT(...)
#define EXPAND(...) __VA_ARGS__
#define WHEN(c) IF(c)(EXPAND, EAT)

现在有了所有这些宏，我们可以编写一个递归的 REPEAT 宏。我们使用 REPEAT_INDIRECT 宏来递归地引用自身。这可以防止宏被涂成蓝色，因为它将在不同的扫描上扩展（并使用不同的禁用上下文）。我们在这里使用OBSTRUCT，这将推迟两次扩展。这是必要的，因为条件 WHEN 已经应用了一次扫描。

#define REPEAT(count, macro, ...) \
    WHEN(count) \
    ( \
        OBSTRUCT(REPEAT_INDIRECT) () \
        ( \
            DEC(count), macro, __VA_ARGS__ \
        ) \
        OBSTRUCT(macro) \
        ( \
            DEC(count), __VA_ARGS__ \
        ) \
    )
#define REPEAT_INDIRECT() REPEAT

//An example of using this macro
#define M(i, _) i
EVAL(REPEAT(8, M, ~)) // 0 1 2 3 4 5 6 7

现在，由于计数器的限制，该示例仅限于 10 次重复。就像计算机中的重复计数器会受到有限内存的限制一样。可以将多个重复计数器组合在一起来解决此限制，就像在计算机中一样。此外，我们可以定义一个 FOREVER 宏：

#define FOREVER() \
    ? \
    DEFER(FOREVER_INDIRECT) () ()
#define FOREVER_INDIRECT() FOREVER
// Outputs question marks forever
EVAL(FOREVER())

这将尝试永远输出 ?，但最终会停止，因为没有更多的扫描被应用。现在的问题是，如果我们给它无限次扫描，这个算法能否完成？这被称为停机问题，图灵完备性对于证明停机问题的不可判定性是必要的。正如您所看到的，预处理器可以充当图灵完整语言，但它不受计算机有限内存的限制，而是受到所应用扫描的有限数量的限制。

Well macros don't directly expand recursively, but there are ways we can work around this.

The easiest way of doing recursion in the preprocessor is to use a deferred expression. A deferred expression is an expression that requires more scans to fully expand:

#define EMPTY()
#define DEFER(id) id EMPTY()
#define OBSTRUCT(...) __VA_ARGS__ DEFER(EMPTY)()
#define EXPAND(...) __VA_ARGS__

#define A() 123
A() // Expands to 123
DEFER(A)() // Expands to A () because it requires one more scan to fully expand
EXPAND(DEFER(A)()) // Expands to 123, because the EXPAND macro forces another scan

Why is this important? Well when a macro is scanned and expanding, it creates a disabling context. This disabling context will cause a token, that refers to the currently expanding macro, to be painted blue. Thus, once its painted blue, the macro will no longer expand. This is why macros don't expand recursively. However, a disabling context only exists during one scan, so by deferring an expansion we can prevent our macros from becoming painted blue. We will just need to apply more scans to the expression. We can do that using this EVAL macro:

#define EVAL(...)  EVAL1(EVAL1(EVAL1(__VA_ARGS__)))
#define EVAL1(...) EVAL2(EVAL2(EVAL2(__VA_ARGS__)))
#define EVAL2(...) EVAL3(EVAL3(EVAL3(__VA_ARGS__)))
#define EVAL3(...) EVAL4(EVAL4(EVAL4(__VA_ARGS__)))
#define EVAL4(...) EVAL5(EVAL5(EVAL5(__VA_ARGS__)))
#define EVAL5(...) __VA_ARGS__

Now if we want to implement a REPEAT macro using recursion, first we need some increment and decrement operators to handle state:

#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__

#define INC(x) PRIMITIVE_CAT(INC_, x)
#define INC_0 1
#define INC_1 2
#define INC_2 3
#define INC_3 4
#define INC_4 5
#define INC_5 6
#define INC_6 7
#define INC_7 8
#define INC_8 9
#define INC_9 9

#define DEC(x) PRIMITIVE_CAT(DEC_, x)
#define DEC_0 0
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC_9 8

Next we need a few more macros to do logic:

#define CHECK_N(x, n, ...) n
#define CHECK(...) CHECK_N(__VA_ARGS__, 0,)

#define NOT(x) CHECK(PRIMITIVE_CAT(NOT_, x))
#define NOT_0 ~, 1,

#define COMPL(b) PRIMITIVE_CAT(COMPL_, b)
#define COMPL_0 1
#define COMPL_1 0

#define BOOL(x) COMPL(NOT(x))

#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t

#define IF(c) IIF(BOOL(c))

#define EAT(...)
#define EXPAND(...) __VA_ARGS__
#define WHEN(c) IF(c)(EXPAND, EAT)

Now with all these macros we can write a recursive REPEAT macro. We use a REPEAT_INDIRECT macro to refer back to itself recursively. This prevents the macro from being painted blue, since it will expand on a different scan(and using a different disabling context). We use OBSTRUCT here, which will defer the expansion twice. This is necessary because the conditional WHEN applies one scan already.

#define REPEAT(count, macro, ...) \
    WHEN(count) \
    ( \
        OBSTRUCT(REPEAT_INDIRECT) () \
        ( \
            DEC(count), macro, __VA_ARGS__ \
        ) \
        OBSTRUCT(macro) \
        ( \
            DEC(count), __VA_ARGS__ \
        ) \
    )
#define REPEAT_INDIRECT() REPEAT

//An example of using this macro
#define M(i, _) i
EVAL(REPEAT(8, M, ~)) // 0 1 2 3 4 5 6 7

Now this example is limited to 10 repeats, because of limitations of the counter. Just like a repeat counter in a computer would be limited by the finite memory. Multiple repeat counters could be combined together to workaround this limitation, just like in a computer. Furthermore, we could define a FOREVER macro:

#define FOREVER() \
    ? \
    DEFER(FOREVER_INDIRECT) () ()
#define FOREVER_INDIRECT() FOREVER
// Outputs question marks forever
EVAL(FOREVER())

This will try to output ? forever, but will eventually stop because there are no more scans being applied. Now the question is, if we gave it an infinite number of scans would this algorithm complete? This is known as the halting problem, and Turing completeness is necessary to prove the undecidability of the halting problem. So as you can see, the preprocessor can act as a Turing complete language, but instead of being limited to the finite memory of a computer it is instead limited by the finite number of scans applied.

回复收藏 0 原文

~没有更多了~