A strange way of initializing a std::string with a C string
While I was reading nVidia CUDA source code, I stumbled upon these two lines:
std::string stdDevString;
stdDevString = std::string(device_string);
Note that device_string is a char[1024]. The question is: why construct an empty std::string, then construct it again with a C string as an argument? Why didn't they write std::string stdDevString = std::string(device_string); in just one line?
Is there a hidden string initialization behavior that this code tries to evade or use? Is it to ensure that the C string inside stdDevString remains null-terminated no matter what? As far as I know, initializing a std::string from a C string that's not null-terminated will still exhibit problems.
Comments (4)
There is no good reason for what they did. Given the std::string::string(const char*) constructor, you can simply construct the string from device_string in a single statement.
The two-step default construction followed by assignment is just (bad) programmer style or oversight. Without optimisation it does a little unnecessary construction, but that is still pretty cheap, and it is likely removed by optimisation. Not a biggie: I doubt I'd bother to mention it in a code review unless it was in an extremely performance-sensitive area. Still, it's definitely best to defer declaring variables until a useful initial value is available to construct them with, localising it all in one place. That is less error-prone, easier to cross-reference, and it minimises the scope of the variable, which simplifies reasoning about its use.
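As a minimal sketch of the forms being compared, the functions below build the same string the two-step way and in three single-statement ways. The function names and the sample buffer contents are made up for illustration; only the declarations inside them correspond to the code under discussion.

```cpp
#include <string>

// Two-step form from the question: default-construct, then assign.
std::string make_two_step(const char* device_string) {
    std::string stdDevString;
    stdDevString = std::string(device_string);
    return stdDevString;
}

// Single-statement alternatives; each one uses the const char* constructor.
std::string make_copy_init(const char* device_string) {
    std::string stdDevString = std::string(device_string);
    return stdDevString;
}

std::string make_direct_init(const char* device_string) {
    std::string stdDevString(device_string);
    return stdDevString;
}

std::string make_brace_init(const char* device_string) {
    std::string stdDevString{device_string};  // needs C++11 or later
    return stdDevString;
}
```

All four functions return equal strings for the same NUL-terminated input, so the choice between them is about style and scope rather than about the resulting value.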
As for hidden initialization behavior or NUL termination: no, it made no difference to that. Since C++11 the internal buffer in stdDevString is kept NUL-terminated regardless of which constructor is used. In C++03 it isn't necessarily terminated (see the C++03 heading below), and there are no guarantees regardless of how construction or assignment is done.
You're right that any of the construction options you've listed will only copy ASCIIZ text into the std::string, treating the first NUL ('\0') as the terminator. If the char array isn't NUL-terminated there will be problems. (That's a separate issue from whether the buffer inside the std::string is kept NUL-terminated, discussed above.)
Note that there's a separate string(const char*, size_type) constructor that can create strings with embedded NULs and won't try to read further than told.
C++03 std::strings were not guaranteed NUL-terminated internally
Whichever way the std::string is constructed and initialised, before C++11 the Standard did not require it to be NUL-terminated within the string's buffer. std::string was best imagined as containing a bunch of potentially non-printable (loosely speaking, binary in the ftp/file I/O sense) characters starting at address data() and extending for size() characters. So reading the element just past the last character through the pointer returned by data() (data[5] for a five-character string) was not guaranteed to be safe.
Note that the std::string API requires c_str() to return a pointer to a NUL-terminated value. To do so, an implementation can either:
- proactively keep an extra NUL on the end of the string buffer at all times (in which case data[5] would happen to be safe on that implementation, but the code could break if the implementation changed or the code was ported to another Standard library implementation, etc.)
- reactively wait until c_str() is called, then:
  - if the current buffer has spare capacity beyond the size() characters starting at data(), append a NUL and return the same pointer value that data() would return
  - otherwise, allocate a new, larger buffer, copy the data across, NUL-terminate it, and return a pointer to it (typically, but optionally, this buffer would replace the old buffer, which would be deleted, such that calling data() immediately afterwards would return the same pointer returned by c_str())
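To make the data()/size() view and the string(const char*, size_type) constructor more concrete, here is a small sketch. The helper names from_bytes and c_length are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a string from exactly `count` bytes, embedded NULs included.
// Unlike the const char* constructor, this never scans for a terminator.
std::string from_bytes(const char* bytes, std::size_t count) {
    return std::string(bytes, count);
}

// Length as a C API would see it: counts only up to the first NUL.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the five bytes 'a', 'b', '\0', 'c', 'd', from_bytes yields a string whose size() is 5, while c_length reports 2, because C-style functions stop at the embedded NUL. Constructing from the same bytes with the plain const char* constructor would likewise keep only the first two characters.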
I would say that it's equivalent to constructing stdDevString from device_string in a single statement.
Once the std::string has been created, it contains a private copy of the data in the C string.
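The "private copy" point can be checked with a short sketch. The function name copy_then_clobber is made up for this example; it overwrites the source buffer after the string has been built.

```cpp
#include <cstring>
#include <string>

// Copies the C string into a std::string, then overwrites the source buffer
// with 'X' characters. The returned string still holds the original text,
// because std::string owns its own copy of the characters.
std::string copy_then_clobber(char* buffer) {
    std::string copy(buffer);
    std::memset(buffer, 'X', std::strlen(buffer));
    return copy;
}
```

After the call, the char array contains only 'X' characters, while the returned std::string is unchanged.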
I think it is ignorant to dismiss this as poor coding. If we assume that this string was allocated at file scope or as a static variable, it could be good coding.
When programming C++ for embedded systems with non-volatile memory present, there are many reasons why you wish to avoid static initialization: the main reason is that it adds lots of overhead code at the beginning of the program, where all such variables must be initialized. If they are instances of classes, constructors will be called.
This will lead to a delay peak at the beginning of the program execution. You don't want this workload peak there, because there are much more important tasks to do when starting up the program, like setting up various hardware.
To avoid this, you typically enable a compiler option that removes such static initialization, and then write your code in such a manner that no static/global variables are given an initial value at their declaration, but are instead set at run time.
On such a system, the code posted by the OP is the correct way to do it.
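The declare-then-assign shape this answer has in mind might look like the sketch below. It assumes the string lives at file scope; init_device_name and device_name are hypothetical names, and which startup work a particular embedded toolchain actually skips is compiler-specific and not modelled here.

```cpp
#include <string>

// File-scope string declared without an initial value.
static std::string stdDevString;

// Hypothetical startup hook: gives the string its value at run time,
// at a point the program chooses, instead of at the declaration.
void init_device_name(const char* device_string) {
    stdDevString = std::string(device_string);
}

// Read-only access for the rest of the program.
const std::string& device_name() {
    return stdDevString;
}
```

On an ordinary hosted implementation the string is simply empty until init_device_name is called, after which device_name returns the copied text.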
Looks like an artefact to me. Perhaps there was some other code in between, then it got removed, and someone was too lazy to join those two remaining lines into a single one.