Why do some languages not implement bounds checking?

Posted 2024-12-04 13:58:42

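According to Wikipedia (http://en.wikipedia.org/wiki/Buffer_overflow):

Programming languages commonly associated with buffer overflows include C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not automatically check that data written to an array (the built-in buffer type) is within the boundaries of that array. Bounds checking can prevent buffer overflows.

So why do some languages, such as C and C++, not implement bounds checking?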

穿越时光隧道 2024-12-11 13:58:42

Basically, it's because it means every time you change an index, you have to do an if statement.

Let's consider a simple C for loop:

int ary[X] = {...};  // Purposefully leaving size and initializer unknown

for(int ix=0; ix< 23; ix++){
    printf("ary[%d]=%d\n", ix, ary[ix]);
}

If we have bounds checking, the generated code for ary[ix] has to be something like:

LOOP:
    CMP IX, 23      ; while test
    JGE END         ; leave the loop once IX >= 23
    CMP IX, X       ; bounds check: compare IX with the array size X
    JGE ERROR       ; if IX >= X jump to ERROR
    LD  R1, IX      ; put the value of IX into register 1
    LD  R2, ARY+IX  ; put the array value in R2
    LA  R3, Str42   ; Str42 is the format string
    JSR PRINTF      ; now we call the printf routine
    INC IX          ; add 1 to IX
    J   LOOP        ; go back to the top of the loop
END:

;;; somewhere else in the code
ERROR:
    HCF             ; halt and catch fire

If we don't have that bounds check, then we can write instead:

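    LD  R1, IX      ; load IX into register 1 once, before the loop
LOOP:
    CMP R1, 23      ; while test, on the register we actually increment
    JGE END         ; leave the loop once the index reaches 23
    LD  R2, ARY+R1  ; put the array value in R2, with no bounds check
    LA  R3, Str42   ; Str42 is the format string
    JSR PRINTF      ; call the printf routine
    INC R1          ; add 1 to the index
    J   LOOP        ; go back to the top of the loop
END: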

This saves 3-4 instructions in the loop, which (especially in the old days) meant a lot.

In fact, on the PDP-11 machines it was even better, because there was something called "auto-increment addressing" (along with its auto-decrement counterpart). On a PDP, all of the register bookkeeping turned into something like:

CZ  -(IX), END    ; compare IX to zero, then decrement; jump to END if zero

(And anyone who happens to remember the PDP better than I do, don't give me trouble about the precise syntax etc; you're an old fart like me, you know how these things slip away.)
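Back in C itself, the difference between the two loop listings above can be sketched as a pair of accessors. This is only an illustration: the names `get_unchecked` and `get_checked` are made up for this example, and returning `false` stands in for the jump to ERROR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Unchecked read: a plain load, as in the listing without the bounds check. */
static int get_unchecked(const int *ary, size_t ix) {
    return ary[ix];
}

/* Checked read: one extra compare and conditional branch per access.
   Reports failure through the return value instead of halting. */
static bool get_checked(const int *ary, size_t len, size_t ix, int *out) {
    assert(ary != NULL && out != NULL);  /* assumed precondition of this sketch */
    if (ix >= len) {
        return false;                    /* out of range: nothing is read */
    }
    *out = ary[ix];
    return true;
}
```

Because `ix` is unsigned here, the single `ix >= len` test also rejects what would have been a negative index; with a signed index a second comparison would be needed.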

半岛未凉 2024-12-11 13:58:42

It is easier to implement and faster both to compile and at run-time. It also simplifies the language definition (as quite a few things can be left out if this is skipped).

Currently, when you do:

int *p = (int*)malloc(sizeof(int));
*p = 50;

C (and C++) just says, "Okey dokey! I'll put something in that spot in memory".

If bounds checking were required, C would have to say, "Ok, first let's see if I can put something there? Has it been allocated? Yes? Good. I'll insert now." By skipping the test to see whether there is something which can be written there, you are saving a very costly step. On the other hand, (she wore a glove), we now live in an era where "optimization is for those who cannot afford RAM," so the arguments about the speed are getting much weaker.
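One way to picture the "has it been allocated?" test is a pointer that carries its own length. The sketch below is an assumption-laden illustration, not how any particular compiler does it: `int_buf`, `int_buf_alloc`, `int_buf_store` and `int_buf_free` are hypothetical names invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat pointer": the address plus the number of elements. */
typedef struct {
    int   *data;
    size_t len;
} int_buf;

/* Allocate len zero-initialized ints; len is reset to 0 if allocation fails. */
static int_buf int_buf_alloc(size_t len) {
    int_buf b = { calloc(len, sizeof(int)), len };
    if (b.data == NULL) {
        b.len = 0;
    }
    return b;
}

/* Checked store: the extra work a bounds-checked "*p = 50" would imply. */
static bool int_buf_store(int_buf b, size_t ix, int value) {
    if (ix >= b.len) {
        return false;        /* refuse to write outside the allocation */
    }
    b.data[ix] = value;
    return true;
}

/* Release the storage and clear the descriptor. */
static void int_buf_free(int_buf *b) {
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

The cost shows up in two places: every pointer grows by one `size_t`, and every store pays for the comparison.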

萧瑟寒风 2024-12-11 13:58:42

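It's all about performance. However, the assertion that C and C++ have no bounds checking is not entirely correct. It is quite common for a library to come in "debug" and "optimized" versions, and it is not uncommon to find bounds checking enabled in the debug versions of various libraries.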

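This has the advantage of catching out-of-bounds errors quickly and painlessly while the application is being developed, while eliminating the performance hit when the program runs for real.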
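A minimal sketch of that debug/optimized split in plain C, using the standard `assert` macro, which is compiled out when `NDEBUG` is defined. The accessor name `at` is an assumption made for this example, not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>

/* Debug build (NDEBUG not defined): a bad index aborts with a diagnostic.
   Optimized build (compiled with -DNDEBUG): the assert expands to nothing,
   so only the plain load remains. */
static int at(const int *ary, size_t len, size_t ix) {
    assert(ix < len && "index out of bounds");
    (void)len;  /* avoids an unused-parameter warning when NDEBUG is set */
    return ary[ix];
}
```

The same source then serves both builds: the check is present during development and costs nothing in the release binary.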

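I should also add that the performance hit is non-negligible: many languages other than C++ provide various high-level functions that operate on buffers and are implemented directly in C and C++ specifically to avoid bounds checking. For example, in Java, if you compare the speed of copying one array into another using pure Java with using System.arraycopy (which does one bounds check and then copies the array directly, without bounds checking each individual element), you will see a decidedly large difference in performance between the two operations.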
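The "check once, then copy" idea can be sketched in C as follows. This is an analogue written for illustration, not the actual implementation behind System.arraycopy; both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Per-element version: one bounds test for every element copied.
   Note that it may have copied some elements before it fails. */
static bool copy_checked_each(int *dst, size_t dst_len,
                              const int *src, size_t src_len, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (i >= src_len || i >= dst_len) {
            return false;
        }
        dst[i] = src[i];
    }
    return true;
}

/* Bulk version: validate the whole range once, then hand off to memcpy.
   Assumes dst and src do not overlap, as memcpy requires. */
static bool copy_checked_once(int *dst, size_t dst_len,
                              const int *src, size_t src_len, size_t n) {
    assert(dst != NULL && src != NULL);
    if (n > src_len || n > dst_len) {
        return false;                   /* nothing is copied on failure */
    }
    memcpy(dst, src, n * sizeof *src);
    return true;
}
```

For valid ranges the two produce the same result; the bulk version simply moves the test out of the loop.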

江湖彼岸 2024-12-11 13:58:42

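The primary reason is the performance overhead of adding bounds checking to C or C++. While this overhead can be reduced substantially with state-of-the-art techniques (to 20-100% overhead, depending on the application), it is still large enough to make many people hesitate. I'm not sure whether that reaction is rational; I sometimes suspect people focus too much on performance simply because performance is quantifiable and measurable. Regardless, it is a fact of life. This fact reduces the incentive for major compilers to put effort into integrating the latest work on bounds checking.

A second reason involves concerns that bounds checking might break your application. In particular, if you do funky things with pointer arithmetic and casts that violate the standard, bounds checking might block something your application currently does. Large applications sometimes do amazingly crufty and ugly things. If the compiler breaks the application, there is no point in blaming the problem on the crufty code; people are not going to keep using a compiler that breaks their application.

Another major reason is that bounds checking competes with ASLR + DEP. ASLR + DEP are perceived as solving, oh, 80% or so of the problem. That reduces the perceived need for full-fledged bounds checking.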
