Why don't other languages support preprocessor directives like those in C and its descendants?

Posted 2024-09-09 01:41:20

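I wonder why other languages do not support this feature. As I understand it, C/C++ code is platform dependent, and preprocessor directives are what make it compile and run across different platforms. The preprocessor has many other uses as well. For example, you can wrap all your debugging printf calls in #if DEBUG ... #endif, so that those lines are not compiled into the binary in a release build.
In other languages, achieving the latter is difficult (or perhaps impossible, I'm not sure). All the code gets compiled into the binary, increasing its size. So my question is: why don't Java and other modern compiled languages support this kind of feature, which lets you include or exclude pieces of code from the binary so conveniently?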

一笑百媚生 2024-09-16 01:41:20

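The major languages that don't have a preprocessor usually have a different, often cleaner, way to achieve the same effects.

Having a textual preprocessor like cpp is a mixed blessing. Since cpp doesn't actually know C, all it does is transform text into other text. This causes many maintenance problems. Take C++ for example, where many uses of the preprocessor have been explicitly discouraged in favor of better features such as:

For constants, const instead of #define
For small functions, inline instead of #define macros

The C++ FAQ calls macros evil and gives multiple reasons to avoid using them.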
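
A minimal sketch of why these replacements tend to be safer. The names SQUARE_MACRO, square, and kBufferSize are made up for illustration and do not come from the FAQ. The macro is pasted in as text, so operator precedence can surprise you, while the inline function behaves like ordinary C++:

```cpp
#include <cstddef>

// Text-substitution macro: the argument is pasted in without parentheses.
#define SQUARE_MACRO(x) x * x

// Language-level alternatives: typed, scoped, and checked by the compiler.
const std::size_t kBufferSize = 256;          // instead of: #define BUFFER_SIZE 256
inline int square(int x) { return x * x; }    // instead of the macro above

// SQUARE_MACRO(1 + 2) expands to 1 + 2 * 1 + 2, which evaluates to 5.
inline int macro_result() { return SQUARE_MACRO(1 + 2); }

// square(1 + 2) evaluates its argument once and returns 9.
inline int function_result() { return square(1 + 2); }
```

Parenthesizing the macro body fixes this particular case, but not double evaluation of arguments with side effects, which is one of the reasons usually given for preferring inline functions.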

贱人配狗天长地久 2024-09-16 01:41:20

The portability benefits of the preprocessor are far outweighed by the possibilities for abuse. Here are some examples from real code I have seen in industry:

A function body becomes so tangled with #ifdef that it is very hard to read the function and figure out what is going on. Remember that the preprocessor works with text, not syntax, so you can do things that are wildly ungrammatical.
Code can become duplicated in different branches of an #ifdef, making it hard to maintain a single point of truth about what's going on.
When an application is intended for multiple platforms, it becomes very hard to compile all the code as opposed to whatever code happens to be selected for the developer's platform. You may need to have multiple machines set up. (It is expensive, say, on a BSD system to set up a cross-compilation environment that accurately simulates GNU headers.) In the days when most varieties of Unix were proprietary and vendors had to support them all, this problem was very serious. Today when so many versions of Unix are free, it's less of a problem, although it's still quite challenging to duplicate native Windows headers in a Unix environment.
Some code is protected by so many #ifdefs that you can't figure out what combination of -D options is needed to select the code. The problem is NP-hard, so the best known solutions require trying exponentially many different combinations of definitions. This is of course impractical, so the real consequence is that your system gradually fills with code that hasn't been compiled. This problem kills refactoring, and of course such code is completely immune to your unit tests and your regression tests, unless you set up a huge, multiplatform testing farm, and maybe not even then.
In the field, I have seen this problem lead to situations where a refactored application is carefully tested and shipped, only to receive immediate bug reports that the application won't even compile on other platforms. If code is hidden by #ifdef and we can't select it, we have no guarantee that it typechecks—or even that it is syntactically correct.
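
The last point can be sketched in a few lines. HYPOTHETICAL_OTHER_PLATFORM is an invented macro that is assumed to be undefined in this build. The file compiles cleanly even though the guarded branch contains both a type error and a missing semicolon, because the compiler proper never sees it:

```cpp
#include <string>

// HYPOTHETICAL_OTHER_PLATFORM is a made-up macro; nothing here defines it.
std::string platform_name() {
#ifdef HYPOTHETICAL_OTHER_PLATFORM
    // This branch is discarded by the preprocessor, so the type error and
    // the missing semicolon below go unnoticed on this platform.
    int n = "not a number"
    return n;
#else
    return "default";
#endif
}
```

The breakage only surfaces when someone finally builds with -DHYPOTHETICAL_OTHER_PLATFORM, which may be long after the change was tested and shipped.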

The flip side of the coin is that more advanced languages and programming techniques have reduced the need for conditional compilation in the preprocessor:

For some languages, like Java, all the platform-dependent code is in the implementation of the JVM and in the associated libraries. People have gone to huge lengths to make JVMs and libraries that are platform-independent.
In many languages, such as Haskell, Lua, Python, Ruby, and many more, the designers have gone to some trouble to reduce the amount of platform-dependent code compared to C.
In a modern language, you can put platform-dependent code in a separate compilation unit behind a compiled interface. Many modern compilers have good facilities for inlining functions across interface boundaries, so that you don't pay much (or any) penalty for this kind of abstraction. This wasn't the case for C because (a) there are no separately compiled interfaces; the separate-compilation model assumes #include and the preprocessor; and (b) C compilers came of age on machines with 64K of code space and 64K of data space; a compiler sophisticated enough to inline across module boundaries was almost unthinkable. Today such compilers are routine. Some advanced compilers inline and specialize methods dynamically.

Summary: by using linguistic mechanisms, rather than textual replacement, to isolate platform-dependent code, you expose all your code to the compiler, everything gets type-checked at least, and you have a chance of doing things like static analysis to ensure suitable test coverage. You also rule out a whole bunch of coding practices that lead to unreadable code.
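
As a rough single-file sketch of that idea in C++: the Platform interface, the two implementations, and the kTarget constant are all hypothetical names. In a real project each implementation would more likely live in its own translation unit, with the build system choosing the target. The point is that both implementations are parsed and type-checked on every build:

```cpp
#include <string>

// The interface the rest of the program codes against.
struct Platform {
    virtual ~Platform() = default;
    virtual char path_separator() const = 0;
    std::string join(const std::string& dir, const std::string& file) const {
        return dir + path_separator() + file;
    }
};

// Both implementations are always compiled, whichever one is selected.
struct PosixPlatform final : Platform {
    char path_separator() const override { return '/'; }
};

struct WindowsPlatform final : Platform {
    char path_separator() const override { return '\\'; }
};

enum class Target { Posix, Windows };

// Assumption: in practice this constant would come from the build configuration.
constexpr Target kTarget = Target::Posix;

inline const Platform& current_platform() {
    static const PosixPlatform posix;
    static const WindowsPlatform windows;
    if (kTarget == Target::Windows) return windows;
    return posix;
}
```

Calling current_platform().join("usr", "lib") yields "usr/lib" with the setting above. A mistake in WindowsPlatform would still be reported as a compile error on a POSIX machine, which is exactly what the #ifdef version cannot offer.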

始终不够爱げ你 2024-09-16 01:41:20

Because modern compilers are smart enough to remove dead code in almost every case, manually feeding the compiler this way is no longer necessary. That is, instead of:

#include <iostream>

#define DEBUG

int main()
{
#ifdef DEBUG
    std::cout << "Debugging...";
#else
    std::cout << "Not debugging.";
#endif
}

you can do:

#include <iostream>

const bool debugging = true;

int main()
{
    if (debugging)
    {
        std::cout << "Debugging...";
    }
    else
    {
        std::cout << "Not debugging.";
    }
}

and you'll probably get the same, or at least similar, code output.

Edit/Note: In C and C++, I'd absolutely never do this. I'd use the preprocessor, if for no other reason than that it makes it instantly clear to the reader of my code that a chunk of it isn't supposed to be compiled under certain conditions. I am saying, however, that this is why many languages eschew the preprocessor.
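
A slightly larger sketch of the same pattern, assuming a compile-time constant named kDebugging and a helper named trace (both invented for this example). Unlike the #ifdef version, the body of the if is always parsed and type-checked. Whether the dead branch is actually removed from the binary depends on the compiler and optimization level:

```cpp
#include <sstream>
#include <string>

// Assumption: a release build sets this to false.
constexpr bool kDebugging = false;

// Appends a trace line only when kDebugging is true. The branch is still
// compiled and type-checked; the optimizer is free to drop it when the
// condition is a compile-time false.
inline void trace(std::ostringstream& log, const std::string& msg) {
    if (kDebugging) {
        log << "[debug] " << msg << '\n';
    }
}

inline int add(std::ostringstream& log, int a, int b) {
    trace(log, "add called");
    return a + b;
}
```

With kDebugging set to false, add still returns the sum and the log stays empty.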

佼人 2024-09-16 01:41:20

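A better question to ask is why C resorted to a preprocessor to implement these sorts of metaprogramming tasks. It isn't a feature so much as a compromise with the technology of the time.

The preprocessor directives in C were developed at a time when machine resources (CPU speed, RAM) were scarce and expensive. The preprocessor provided a way to implement these features on slow machines with limited memory. For example, the first machine I ever owned had 56 KB of RAM and a 2 MHz CPU. It still had a full K&R C compiler available, which pushed the system's resources to the limit but was workable.

More modern languages take advantage of today's more powerful machines to provide better ways of handling the sorts of metaprogramming tasks that the preprocessor used to deal with.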

比忠 2024-09-16 01:41:20

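Other languages do support this feature, by using a generic preprocessor such as m4.

Do we really want each language to have its own implementation of text substitution before execution?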

悟红尘 2024-09-16 01:41:20

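The C preprocessor can be run on any text file; the input need not be C.

Of course, if run on another language, it might tokenize in weird ways. But for simple block structures like #ifdef DEBUG, you can put that in any language, run the C preprocessor on it, then run your language-specific compiler on the result, and it will work.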

公布 2024-09-16 01:41:20

Note that macros/preprocessing/conditionals/etc are usually considered a compiler/interpreter feature, as opposed to a language feature, because they are usually completely independent of the formal language definition, and might vary from compiler to compiler implementation for the same language.

In many languages, conditional compilation directives can be better than if-then-else runtime code when compile-time statements (such as variable declarations) need to be conditional. For example:

$if debug
array x
$endif
...
$if debug
dump x
$endif

only declares/allocates/compiles x when needing x, whereas

array x
boolean debug
...
if debug then dump x

probably has to declare x regardless of whether debug is true.
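
For comparison, C++ can make a declaration conditional without the preprocessor by specializing a template on a compile-time flag. This is only a sketch under assumed names (kDebug, DebugState, Worker), not something the answer itself proposes:

```cpp
#include <cstddef>
#include <vector>

// Assumption: the build sets this flag; false models a release build.
constexpr bool kDebug = false;

// Primary template: the debug variant carries the extra array x.
template <bool Debug>
struct DebugState {
    std::vector<int> x;
    void record(int v) { x.push_back(v); }
    std::size_t recorded() const { return x.size(); }
};

// Specialization for non-debug builds: x is not declared at all.
template <>
struct DebugState<false> {
    void record(int) {}
    std::size_t recorded() const { return 0; }
};

struct Worker {
    DebugState<kDebug> debug;   // holds no vector when kDebug is false
    int total = 0;
    void add(int v) {
        debug.record(v);        // a no-op in the non-debug specialization
        total += v;
    }
};
```

Both variants are ordinary C++ that the compiler can check, and the array only exists in the variant that asks for it.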

长梦不多时 2024-09-16 01:41:20

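Many modern languages actually have syntactic metaprogramming capabilities that go way beyond CPP. Pretty much all modern Lisps (Arc, Clojure, Common Lisp, Scheme, newLISP, Qi, PLOT, MISC, and so on), for example, have extremely powerful (in fact Turing-complete) macro systems. So why should they limit themselves to crappy CPP-style macros, which aren't even real macros, just text snippets?

Other languages with powerful syntactic metaprogramming include Io, Ioke, Perl 6, OMeta, and Converge.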

迷爱 2024-09-16 01:41:20

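Because decreasing the size of the binary:

Can be done in other ways (compare the average size of a C++ executable to that of a C# executable, for example).
Isn't that important when you weigh it against being able to write programs that actually work.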

弱骨蛰伏 2024-09-16 01:41:20

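Other languages also have better dynamic binding. For example, we have some code that we cannot ship to some customers for export reasons. Our C libraries use #ifdef statements and elaborate Makefile tricks (which amount to pretty much the same thing).

The Java code uses plugins (à la Eclipse), so we simply don't ship that code.

You can do the same thing in C through the use of shared libraries... but the preprocessor is a lot simpler.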
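
The plugin idea can be sketched in C++ with a small registry. FeatureRegistry and the feature name "strong_crypto" are invented for this illustration. A real system would typically load the restricted code from a separate shared library rather than register a lambda:

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// A feature is any callable that transforms a string.
using Feature = std::function<std::string(const std::string&)>;

// Core code looks features up by name at run time; a build that does not
// ship a plugin simply never registers it.
class FeatureRegistry {
public:
    void add(const std::string& name, Feature f) {
        features_[name] = std::move(f);
    }

    bool has(const std::string& name) const {
        return features_.count(name) != 0;
    }

    std::string run(const std::string& name, const std::string& input) const {
        auto it = features_.find(name);
        if (it == features_.end()) {
            return std::string("feature not installed");
        }
        return it->second(input);
    }

private:
    std::map<std::string, Feature> features_;
};
```

The core program contains no #ifdef. Whether the restricted feature is available depends only on whether its plugin was shipped and registered.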

丿*梦醉红颜 2024-09-16 01:41:20

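Another point nobody else has mentioned is platform support.

Most modern languages cannot run on the same platforms as C or C++, and are not intended to. Java, Python, and also compiled languages such as C# need a heap. They are designed to run on an operating system with memory management, libraries, and plenty of space, and they do not run in a freestanding environment. There you can use other means to achieve the same thing. C can be used to program controllers with 2 KiB of ROM, and there you need a preprocessor for most applications.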
