What should I watch out for when converting a std::string to a char* for a C function?
I have read many posts asking how to convert a C++ std::string or const std::string& to a char* in order to pass it to a C function, and there seem to be quite a few caveats in doing this. One has to beware of whether the string is contiguous, and of a lot of other things. The point is that I've never really understood all the points one needs to be aware of, or why they matter.
Could someone sum up the caveats and pitfalls of converting a std::string to the char* that is needed to pass it to a C function?
I am interested in the cases where the std::string is a const reference and where it is a non-const reference, and where the C function will alter the char* and where it will not.
Comments (5)
First, whether it is a const reference or a value doesn't change anything.
You then have to consider what the function is expecting. There are different things a function can do with a char* or a char const*. The original versions of memcpy, for example, used these types, and it's possible that such code is still around. It is, hopefully, rare, and in what follows I will assume that the char* in the C function refers to a '\0'-terminated string.
If the C function takes a char const*, you can pass it the result of std::string::c_str(). If it takes a char*, it depends. If it takes a char* simply because it dates from the pre-const days of C and in fact modifies nothing, std::string::c_str() followed by a const_cast is appropriate. If the C function uses the char* as an out parameter, however, things become more difficult. I personally prefer declaring a char[] buffer, passing that, and then converting the result to std::string. But all known implementations of std::string use a contiguous buffer, and the next version of the standard will require it, so you can also dimension the std::string correctly first (using std::string::resize()), then pass &s[0], and afterwards resize the string to the resulting length (determined using strlen(s.c_str()), if necessary).
Finally (but this is also an issue for C programs using char[]), you have to consider lifetime issues. Most functions taking char* or char const* simply use the pointer and forget it, but if the function saves the pointer somewhere for later use, the string object must live at least as long, and its size must not be modified during that period. (Again, in such cases, I prefer using a char[].)
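The two out-parameter approaches described in this answer can be sketched roughly as follows. This is only an illustration: c_fill_greeting is a made-up stand-in for a C function that writes a '\0'-terminated string into a caller-supplied buffer, and the buffer size of 64 is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API (not from the original post): writes a
// '\0'-terminated string into `out`, using at most `cap` bytes.
void c_fill_greeting(char* out, std::size_t cap) {
    assert(out != nullptr && cap > 0);
    std::snprintf(out, cap, "hello from C");
}

// Approach 1: a plain char[] buffer, converted to std::string afterwards.
std::string greeting_via_array() {
    char buf[64];                        // assumed to be large enough
    c_fill_greeting(buf, sizeof buf);
    return std::string(buf);             // copies up to the first '\0'
}

// Approach 2: let the std::string own the buffer.
std::string greeting_via_string() {
    std::string s;
    s.resize(64);                        // dimension the string first
    c_fill_greeting(&s[0], s.size());    // relies on contiguous storage
    s.resize(std::strlen(s.c_str()));    // shrink to what was actually written
    return s;
}
```

Both functions should return the same text. The second avoids one copy but depends on the contiguity of std::string, which is the assumption the answer calls out.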
Basically, there are three points that are important:
1. According to the still-current standard, std::string isn't actually guaranteed to use contiguous storage (as far as I know, this is due to change), although in practice all current implementations probably use contiguous storage anyway. Because that guarantee is missing, c_str() (and data()) may actually create a copy of the string internally.
2. The pointer returned by c_str() (and data()) is valid only as long as no non-const methods are invoked on the original string. This makes it unsuitable when the C function hangs on to the pointer (as opposed to using it only for the duration of the actual function call).
3. If there is any chance at all that the string is going to be modified, casting away constness from the result of c_str() is not a good idea. You must create a buffer holding a copy of the string and pass that to the C function. If you create a buffer, remember to add a null terminator.
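As a rough sketch of the second and third points, here is one way to hand a C function a stable, null-terminated copy when it keeps the pointer. The functions c_set_name and c_get_name are invented for this example, and std::vector<char> is just one possible choice of buffer.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style API (an assumption for illustration): it stores the
// pointer it receives instead of using it only during the call.
static const char* g_saved_name = nullptr;
void c_set_name(const char* name) { g_saved_name = name; }
const char* c_get_name() { return g_saved_name; }

// Builds an independent copy of `s` with an explicit null terminator.
std::vector<char> make_c_buffer(const std::string& s) {
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0');                 // do not forget the terminator
    return buf;
}

// Returns true if the retained pointer still sees the original text after
// the std::string has been modified.
bool demo_retained_pointer() {
    std::string name = "alpha";
    std::vector<char> stable = make_c_buffer(name);
    c_set_name(stable.data());           // the C side keeps this pointer

    // A non-const operation: a pointer taken earlier from name.c_str() could
    // be invalid now, but `stable` is unaffected.
    name += " plus a suffix long enough to force a reallocation";

    bool ok = std::strcmp(c_get_name(), "alpha") == 0;
    c_set_name(nullptr);                 // `stable` is about to be destroyed
    return ok;
}
```

The buffer must outlive every use of the pointer on the C side, which is why the sketch clears the saved pointer before the vector goes out of scope.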
[I would add a comment, but I don't have enough rep for that, so sorry for adding (yet) another answer.]
While it is true that the current standard does not guarantee the internal buffer of std::string to be contiguous, it appears that practically all implementations use contiguous buffers. Furthermore, the new C++0x standard (which is about to be approved by ISO) requires contiguous internal buffers in std::string, and even the current C++03 standard requires returning a contiguous buffer when you call data() or &str[0] (though it won't be necessarily null-terminated). See here for more details.
That still doesn't make it safe to write to the string, though, since the standard doesn't force implementations to actually return their internal buffer when you call data(), c_str() or operator[], and neither are they prevented from using optimizations like copy-on-write, which may complicate things further (it appears that the new C++0x will ban copy-on-write, though). That being said, if you don't care about maximum portability, you can check your target implementation and see what it actually does inside. AFAIK, Visual C++ 2008/2010 always returns the real internal buffer pointer and doesn't do copy-on-write (it does have the Small String Optimization, but that's probably not a concern).
When the C function does not alter the string behind the char*, you can use std::string::c_str() for both const and non-const std::string instances. Ideally the parameter would be a const char*, but if it's not (because of a legacy API) you may legally use a const_cast.
But you may only use the pointer from c_str() as long as you're not modifying the string!
When the C function does alter the string behind the char*, your only safe and portable way to use the std::string is to copy it to a temporary buffer yourself (for example from c_str())! Make sure you free the temporary memory afterwards, or use std::vector, which is guaranteed to have contiguous memory.
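A minimal sketch of both cases from this answer, assuming two made-up C-style functions: c_legacy_count_spaces takes a char* but only reads, and c_upper_in_place really does modify its argument. Neither name comes from the original post.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical legacy function: non-const parameter, but it never writes.
std::size_t c_legacy_count_spaces(char* text) {
    std::size_t n = 0;
    for (; *text != '\0'; ++text) {
        if (*text == ' ') { ++n; }
    }
    return n;
}

// Hypothetical function that modifies the string in place.
void c_upper_in_place(char* text) {
    for (; *text != '\0'; ++text) {
        *text = static_cast<char>(std::toupper(static_cast<unsigned char>(*text)));
    }
}

// Read-only callee: c_str() plus const_cast, justified only because the
// callee is known not to write through the pointer.
std::size_t count_spaces(const std::string& s) {
    return c_legacy_count_spaces(const_cast<char*>(s.c_str()));
}

// Modifying callee: work on a temporary std::vector<char> copy instead.
std::string to_upper_via_c(const std::string& s) {
    // Copy the characters plus the terminating '\0' supplied by c_str().
    std::vector<char> buf(s.c_str(), s.c_str() + s.size() + 1);
    c_upper_in_place(buf.data());
    return std::string(buf.data(), buf.size() - 1);  // drop the terminator
}
```

Because the vector owns the temporary memory, nothing has to be freed by hand, and the original std::string is never written to.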
std::string can store zero bytes. This means that when it is passed to a C function it can be truncated prematurely, because C functions stop at the first zero byte. This can have security implications if, for example, you use a C function to filter out or escape unwanted characters.
The result of std::string::c_str() will sometimes be invalidated by operations that change the string (non-const member functions). Using the pointer after you have called c_str() and then modified the string causes bugs that are very hard to diagnose ("Heisenbugs").
Do not use const_cast, ever. goto is less troublesome.
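The truncation problem from the first point is easy to reproduce. In this small sketch the helper names length_seen_by_c and survives_as_c_string are invented for illustration; the check simply compares what strlen sees with the real size of the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Number of characters a C function would see before it stops at the first
// zero byte.
std::size_t length_seen_by_c(const std::string& s) {
    return std::strlen(s.c_str());
}

// True when the string contains no embedded zero bytes, so that a C function
// reading s.c_str() would process all of it.
bool survives_as_c_string(const std::string& s) {
    return length_seen_by_c(s) == s.size();
}
```

For a string built as std::string("abc\0def", 7), size() is 7 but a C function would only see "abc", so a check like this before calling into C may be worth considering when the data comes from an untrusted source.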