Should C++ programmers avoid memset?
I have heard that C++ programmers should avoid memset:
#include <cstring>

class ArrInit {
    //! int a[1024] = { 0 };
    int a[1024];
public:
    ArrInit() { std::memset(a, 0, 1024 * sizeof(int)); }
};
Considering the code above, if you do not use memset, how would you fill a[0..1023] with zeros? What is wrong with memset in C++?
Thanks.
Answers (11)
In C++, std::fill or std::fill_n may be a better choice, because it is generic and can therefore operate on objects as well as PODs. memset, by contrast, operates on a raw sequence of bytes and should therefore never be used to initialize non-PODs. In any case, optimized implementations of std::fill may internally use specialization to call memset if the type is a POD.
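
As a minimal sketch of the std::fill approach applied to the class from the question: the at() accessor and the fill_names helper are hypothetical additions for illustration and are not part of the original code.

```cpp
#include <algorithm>  // std::fill, std::fill_n
#include <cstddef>    // std::size_t
#include <iterator>   // std::begin, std::end
#include <string>

class ArrInit {
    int a[1024];
public:
    // std::fill assigns 0 to each element through the element type.
    ArrInit() { std::fill(std::begin(a), std::end(a), 0); }

    // Hypothetical accessor, added only so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};

// Hypothetical helper: the same algorithm family also works for class types,
// where a byte-wise memset would be wrong.
inline void fill_names(std::string* first, std::size_t count) {
    std::fill_n(first, count, std::string("unset"));
}
```

The second function is the point of the "generic" argument: std::fill_n goes through std::string's assignment operator, so the objects stay valid.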
The issue is not so much using memset() on built-in types as using it on non-POD class types. Doing so will almost always do the wrong thing and frequently do the fatal thing: it may, for example, trample over a virtual function table pointer.
Zero-initializing should look like this:
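
A minimal sketch of one form this could take for the class in the question, assuming that value-initialization in the constructor's initializer list is what is meant; the at() accessor is a hypothetical addition.

```cpp
#include <cstddef>  // std::size_t

class ArrInit {
    int a[1024];
public:
    // a() value-initializes the array, which zero-initializes every int.
    ArrInit() : a() {}

    // Hypothetical accessor, added only so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};
```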
As to using memset, there are a couple of ways to make the usage more robust (as with all such functions): avoid hard-coding the array's size and type:
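
One way to avoid the hard-coded size and type, sketched here under the assumption that a is a true array member (not a pointer), is to let sizeof compute the byte count; the at() accessor is hypothetical.

```cpp
#include <cstddef>  // std::size_t
#include <cstring>  // std::memset

class ArrInit {
    int a[1024];
public:
    // sizeof a is the size of the whole array in bytes, so the call stays
    // correct if the element type or the element count changes later.
    ArrInit() { std::memset(a, 0, sizeof a); }

    // Hypothetical accessor, added only so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};
```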
For extra compile-time checks it is also possible to make sure that a indeed is an array (so sizeof(a) would make sense):
But for non-character types, I'd imagine the only value you'd use it to fill with is 0, and zero-initialization should already be available in one way or another.
What's wrong with memset in C++ is mostly the same thing that's wrong with memset in C. memset fills a memory region with the physical all-zero bit pattern, while in virtually 100% of cases you need to fill an array with logical zero values of the corresponding type. In C, memset is only guaranteed to properly initialize memory for integer types (and its validity for all integer types, as opposed to just char types, is a relatively recent guarantee added to the C language specification). It is not guaranteed to properly set floating-point values to zero, and it is not guaranteed to produce proper null pointers.
Of course, the above might be seen as excessively pedantic, since the additional standards and conventions active on the given platform might (and most certainly will) extend the applicability of memset, but I would still suggest following Occam's razor here: don't rely on any other standards and conventions unless you really have to. C++ (as well as C) offers several language-level features that let you safely initialize your aggregate objects with proper zero values of the proper type. Other answers have already mentioned these features.
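
To illustrate the difference between logical and physical zero, here is a small sketch using a hypothetical aggregate named Sample. Empty braces ask the language for the zero value of each member's own type, whatever its bit pattern is on the platform.

```cpp
// Hypothetical aggregate with an integer, a floating-point, and a pointer
// member, the three kinds of type discussed above.
struct Sample {
    int    count;
    double ratio;
    int*   next;
};

// Empty braces value-initialize the aggregate: count becomes 0, ratio
// becomes 0.0, and next becomes a null pointer, without relying on the
// all-zero bit pattern meaning the same thing for every type.
inline Sample make_zeroed() {
    Sample s = {};
    return s;
}
```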
It is "bad" because you are not implementing your intent.
Your intent is to set each value in the array to zero and what you have programmed is setting an area of raw memory to zero. Yes, the two things have the same effect but it's clearer to simply write code to zero each element.
Also, it's likely no more efficient.
Compiling this with Visual C++ 2008 (32-bit) with optimisations turned on compiles the loop to:
Which is pretty much exactly what the memset would likely compile to anyway. But if you use memset there is no scope for the compiler to perform further optimisations, whereas by writing your intent it's possible that the compiler could perform further optimisations, for example noticing that each element is later set to something else before it is used so the initialisation can be optimised out, which it likely couldn't do nearly as easily if you had used memset.
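
A loop that states the intent directly might look like the sketch below. This is an assumption about the kind of loop being described, not the exact code the author compiled, and the at() accessor is hypothetical.

```cpp
#include <cstddef>  // std::size_t

class ArrInit {
    int a[1024];
public:
    ArrInit() {
        // Set each element to zero, one by one; the element count is derived
        // from the array itself rather than repeated as a literal.
        for (std::size_t i = 0; i < sizeof a / sizeof a[0]; ++i) {
            a[i] = 0;
        }
    }

    // Hypothetical accessor, added only so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};
```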
In addition to badness when applied to classes, memset is also error-prone. It's very easy to get the arguments out of order, or to forget the sizeof portion. The code will usually compile with these errors and quietly do the wrong thing. The symptom of the bug might not manifest until much later, making it difficult to track down.
memset is also problematic with lots of plain types, like pointers and floating point. Some programmers set all bytes to 0, assuming the pointers will then be NULL and floats will be 0.0. That's not a portable assumption.
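
The "forgotten sizeof" mistake can be sketched as follows. Both function names are hypothetical, and the described effect assumes sizeof(int) is greater than 1, which holds on mainstream platforms.

```cpp
#include <cstddef>  // std::size_t
#include <cstring>  // std::memset

// Bug: count is a number of elements, but memset expects a number of bytes,
// so only the first count bytes are cleared and the tail is left untouched.
inline void clear_forgot_sizeof(int* data, std::size_t count) {
    std::memset(data, 0, count);
}

// Intended call: the byte count is the element count times the element size.
inline void clear_all(int* data, std::size_t count) {
    std::memset(data, 0, count * sizeof *data);
}
```

Both versions compile cleanly, and the buggy one even appears to work when only the first few elements are checked.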
This is an OLD thread, but here's an interesting twist:
works PERFECTLY well!
However,
indeed sets the virtuals (i.e. somefunc() above) to NULL.
Given that memset is drastically faster than setting each and every member of a large class to 0, I've been doing the first memset above for ages and never had a problem.
So the really interesting question is: how come it works? I suppose that the compiler actually starts to set the zeros BEYOND the virtual table... any idea?
As of C++11, the simplest way to fill an array with zeros is to zero-initialize it:
To be precise, I believe this is actually aggregate initialization with an empty initializer list, which happens to zero-initialize every item in the array - but the main point is to show that it can be done.
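
For the class in the question, a minimal sketch of this could use a default member initializer with empty braces, which requires C++11 or later; the at() accessor is a hypothetical addition.

```cpp
#include <cstddef>  // std::size_t

class ArrInit {
    // Empty braces: aggregate initialization of the array with no
    // initializers, which leaves every element zero-initialized.
    int a[1024] = {};
public:
    // Hypothetical accessor, added only so the result can be inspected.
    int at(std::size_t i) const { return a[i]; }
};
```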
See https://en.cppreference.com/w/cpp/language/zero_initialization and https://en.cppreference.com/w/cpp/language/aggregate_initialization for more details.
You could also zero-initialize the container object itself, which might be more efficient if you have several members to zero out:
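
Zeroing the whole object at once could be sketched like this, using a hypothetical aggregate named Buffers with several members of different types.

```cpp
// Hypothetical aggregate with several members that all need to start at zero.
struct Buffers {
    int    a[1024];
    double weights[16];
    int*   cursor;
};

// Buffers{} initializes the aggregate from an empty list, so every member,
// including each array element, receives the zero value of its type.
inline Buffers make_buffers() {
    return Buffers{};
}
```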
That said, zero-initialization is a tricky beast: as the cppreference page indicates, it has no dedicated syntax, so you have to be careful that the compiler doesn't pick another type of initialization instead. And as the following question shows, getting compilers to zero-initialize your objects, including padding, can be tricky:
Does C++ standard guarantee the initialization of padding bytes to zero for non-static aggregate objects?
With all this in mind, if you're writing high-performance code with a targeted set of platforms in mind (something common for instance in the video games industry), and you know the exact memory layout of the whole object you'll be overwriting with zeroes, memset can be a good tool if used carefully. It can enable things like reliable and fast comparison of POD objects with memcmp.
Your code is fine. I thought the only time memset is dangerous in C++ is when you do something along the lines of:
YourClass instance; memset(&instance, 0, sizeof(YourClass));
I believe it might zero out internal data in your instance that the compiler created.
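
If memset on whole objects is wanted anyway, one possible safeguard is a wrapper that refuses unsuitable types at compile time. zero_object and Point are hypothetical names used only for this sketch.

```cpp
#include <cstring>      // std::memset
#include <type_traits>  // std::is_trivially_copyable, std::is_standard_layout

// Hypothetical helper. The static_assert rejects types with virtual
// functions or non-trivial copy operations, so the compiler-created data
// mentioned above cannot be overwritten by accident.
template <typename T>
void zero_object(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_standard_layout<T>::value,
                  "zero_object requires a POD-like type");
    std::memset(&obj, 0, sizeof obj);
}

// Hypothetical plain struct that satisfies the check.
struct Point {
    int x;
    int y;
};
```

Using the object's own size (sizeof obj) also removes the need to repeat the class name in the call.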
There's no real reason to not use it except for the few cases people pointed out that no one would use anyway, but there's no real benefit to using it either unless you are filling memguards or something.
The short answer would be to use a std::vector with an initial size of 1024.
The initial value of all elements of "a" would be 0, as the std::vector(size) constructor (as well as vector::resize) fills the vector with value-initialized elements. For built-in types (a.k.a. intrinsic types), the initial value is guaranteed to be 0:
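
A minimal sketch of the class from the question rewritten around std::vector; the size() and at() accessors are hypothetical additions.

```cpp
#include <cstddef>  // std::size_t
#include <vector>

class ArrInit {
    std::vector<int> a;
public:
    // vector(n) creates n value-initialized elements, so every int is 0.
    ArrInit() : a(1024) {}

    // Hypothetical accessors, added only so the result can be inspected.
    std::size_t size() const { return a.size(); }
    int at(std::size_t i) const { return a[i]; }
};
```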
This would allow the type that "a" uses to change with minimal fuss, even to that of a class.
Most functions that take a void pointer (void*) as a parameter, such as memset, are not type safe. Ignoring an object's type, in this way, removes all C++ style semantics objects tend to rely on, such as construction, destruction and copying. memset makes assumptions about a class, which violates abstraction (not knowing or caring what is inside a class). While this violation isn't always immediately obvious, especially with intrinsic types, it can potentially lead to hard to locate bugs, especially as the code base grows and changes hands. If the type that is memset is a class with a vtable (virtual functions) it will also overwrite that data.