How do I convert char* to wchar_t*?
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
    const size_t cSize = strlen(c)+1;
    wchar_t wc[cSize];
    mbstowcs (wc, c, cSize);
    return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory. The simple fix is to allocate the buffer on the heap instead and return that pointer.
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. Note that C++ does not support C99 variable length arrays, so if you compiled your code as pure C++, it would not even compile.
With that change, your function's return type should also be std::wstring.
Remember to set the relevant locale in main, e.g. setlocale( LC_ALL, "" ).
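One way the std::wstring approach could look is sketched below. The function name ToWide and the choice to throw on an invalid sequence are assumptions made for illustration. The sketch relies on the contiguous buffer mentioned above and converts according to whatever locale is currently set:

```cpp
#include <cstdlib>    // std::mbstowcs
#include <cstring>    // std::strlen
#include <stdexcept>  // std::runtime_error
#include <string>     // std::wstring

// Hypothetical helper: converts a narrow string using the current C locale.
std::wstring ToWide(const char* c)
{
    // At most one wide character is produced per input byte.
    std::wstring ws(std::strlen(c), L'\0');
    const std::size_t n = std::mbstowcs(&ws[0], c, ws.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");
    ws.resize(n);   // drop the unused tail when multibyte input shrank
    return ws;
}
```

Because the result is returned by value, there is no buffer for the caller to free.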
Example of using "mbstowcs":
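A small sketch of basic mbstowcs usage with a fixed-size array follows. The function name DemoMbstowcs, the sample text, and the buffer size are all made up for illustration:

```cpp
#include <clocale>   // std::setlocale
#include <cstdio>    // std::printf
#include <cstdlib>   // std::mbstowcs

// Hypothetical demo: converts a fixed narrow string and prints the result.
// Returns the number of wide characters produced, or (size_t)-1 on error.
std::size_t DemoMbstowcs()
{
    std::setlocale(LC_ALL, "");          // convert according to the user's locale
    const char* narrow = "Hello, world";
    wchar_t wide[32];                    // large enough for this input plus terminator
    const std::size_t n =
        std::mbstowcs(wide, narrow, sizeof wide / sizeof wide[0]);
    if (n != static_cast<std::size_t>(-1))
        std::printf("%ls\n", wide);      // %ls prints a wide string via narrow printf
    return n;
}
```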
Example of using "mbstowcs_s":
Microsoft suggests using "mbstowcs_s" instead of "mbstowcs".
Links:
mbstowcs example
mbstowcs_s, _mbstowcs_s_l
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can have the caller pass a buffer and its size to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
I did something like this. The first two zeros are there because I don't know what kind of ASCII-type arguments this call wants from me. The general idea was to create a temporary char array and pass in the wide char array. Boom, it works. The +1 ensures that the null terminator ends up in the right place.
Andrew Shepherd's answer worked well for me. I added a couple of fixes:
1. Remove the terminating L'\0', because it sometimes causes trouble.
2. Use mbstowcs_s.
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
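The two-function pattern described above could be sketched roughly as follows. All three function names are hypothetical, and std::vector is swapped in for a manual new[]/delete[] pair so that the allocation and the release both happen inside the calling function:

```cpp
#include <cstdlib>   // std::mbstowcs
#include <cstring>   // std::strlen
#include <cwchar>    // std::wcslen
#include <vector>

// Hypothetical helper 1: how many wchar_t elements (terminator included)
// the caller should allocate. One per input byte is a safe upper bound.
std::size_t WideBufferSize(const char* src)
{
    return std::strlen(src) + 1;
}

// Hypothetical helper 2: fills caller-owned storage; false on failure.
bool FillWide(const char* src, wchar_t* dst, std::size_t dstSize)
{
    const std::size_t n = std::mbstowcs(dst, src, dstSize);
    return n != static_cast<std::size_t>(-1) && n < dstSize;
}

// Example caller: the buffer is allocated and released in this one function.
// Returns the converted length, or 0 if the conversion failed.
std::size_t CountWideChars(const char* src)
{
    std::vector<wchar_t> buf(WideBufferSize(src));   // allocated here
    if (!FillWide(src, buf.data(), buf.size()))
        return 0;
    return std::wcslen(buf.data());                  // stand-in for real use of the string
}                                                    // released here
```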
Your problem has nothing to do with encodings; it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion, passing the input and output as C++ std::string and std::wstring objects.