C++: convert string (or char*) to wstring (or wchar_t*)

Posted on 2024-08-27 10:29:03 · 141 words · 12 views · 0 comments

string s = "おはよう";
wstring ws = FUNCTION(s, ws);

How would I assign the contents of s to ws?

I searched Google and tried some techniques, but they can't assign the exact content. The content is distorted.

樱娆 2024-09-03 10:29:04

NOTE! See Note (2023-10-05) at the bottom!

Assuming that the input string in your example (おはよう) is a UTF-8 encoded (which it isn't, by the looks of it, but let's assume it is for the sake of this explanation :-)) representation of a Unicode string of your interest, then your problem can be fully solved with the standard library (C++11 and newer) alone.

The TL;DR version:

#include <locale>
#include <codecvt>
#include <string>

std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
std::string narrow = converter.to_bytes(wide_utf16_source_string);
std::wstring wide = converter.from_bytes(narrow_utf8_source_string);
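
As a minimal, self-contained sketch of the TL;DR above, the converter can be wrapped in two small functions. The names utf8_to_wide and wide_to_utf8 are illustrative and not part of the answer, and the sketch assumes that the narrow string really is valid UTF-8 and that <codecvt> is still available in your toolchain:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Illustrative helper names; they are not part of the standard library.
// Assumes `narrow` holds valid UTF-8; from_bytes throws std::range_error otherwise.
std::wstring utf8_to_wide(const std::string& narrow)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    return converter.from_bytes(narrow);
}

// The reverse direction: wide (UTF-16 code units) back to UTF-8 bytes.
std::string wide_to_utf8(const std::wstring& wide)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    return converter.to_bytes(wide);
}
```

When trying this out, spelling the input as explicit bytes (for example "\xE3\x81\x8A" for お) avoids depending on how the compiler interprets the source file encoding.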

Longer online compilable and runnable example:

(They all show the same example. There are just many for redundancy...)

Note (old):

As pointed out in the comments and explained in https://stackoverflow.com/a/17106065/6345 there are cases when using the standard library to convert between UTF-8 and UTF-16 might give unexpected differences in the results on different platforms. For a better conversion, consider std::codecvt_utf8 as described on http://en.cppreference.com/w/cpp/locale/codecvt_utf8

Note (new):

Since the codecvt header is deprecated in C++17, some concerns about the solution presented in this answer were raised. However, the C++ standards committee added an important statement in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0618r0.html saying

this library component should be retired to Annex D, along side <codecvt>, until a suitable replacement is standardized.

So in the foreseeable future, the codecvt solution in this answer is safe and portable.

Note (2023-10-05):

Proposal to remove the deprecated codecvt and wstring_convert in C++26:

时光是把杀猪刀 2024-09-03 10:29:04
int StringToWString(std::wstring &ws, const std::string &s)
{
    std::wstring wsTmp(s.begin(), s.end());

    ws = wsTmp;

    return 0;
}
掩耳倾听 2024-09-03 10:29:04

Your question is underspecified. Strictly, that example is a syntax error. However, std::mbstowcs is probably what you're looking for.

It is a C-library function and operates on buffers, but here's an easy-to-use idiom, courtesy of Mooing Duck:

std::wstring ws(s.size(), L' '); // Overestimate number of code points.
ws.resize(std::mbstowcs(&ws[0], s.c_str(), s.size())); // Shrink to fit.
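
One thing the idiom above does not handle is failure: std::mbstowcs returns (size_t)-1 when it meets a byte sequence that is invalid in the current C locale, and passing that value to resize would throw. A hedged sketch of the same idiom with that check added follows; mb_to_wide is a hypothetical name, and the result still depends on the locale selected with std::setlocale:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: same idiom as above, but returns false instead of
// resizing to (size_t)-1 when mbstowcs reports an invalid multibyte sequence.
bool mb_to_wide(const std::string& s, std::wstring& out)
{
    std::wstring ws(s.size(), L' ');   // one wchar_t per input byte is always enough
    const std::size_t n = std::mbstowcs(&ws[0], s.c_str(), s.size());
    if (n == static_cast<std::size_t>(-1))
        return false;                  // input is not valid in the current locale
    ws.resize(n);                      // shrink to the number of characters written
    out = ws;
    return true;
}
```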
因为看清所以看轻 2024-09-03 10:29:04

If you are using Windows/Visual Studio and need to convert a string to wstring you could use:

#include <AtlBase.h>
#include <atlconv.h>
...
string s = "some string";
CA2W ca2w(s.c_str());
wstring w = ca2w;
printf("%s = %ls", s.c_str(), w.c_str());

Same procedure for converting a wstring to string (sometimes you will need to specify a codepage):

#include <AtlBase.h>
#include <atlconv.h>
...
wstring w = L"some wstring";
CW2A cw2a(w.c_str());
string s = cw2a;
printf("%s = %ls", s.c_str(), w.c_str());

You could specify a codepage and even UTF8 (that's pretty nice when working with JNI/Java). A standard way of converting a std::wstring to a UTF-8 std::string is shown in this answer.

// 
// using ATL
CA2W ca2w(str, CP_UTF8);

// 
// or the standard way taken from the answer above
#include <codecvt>
#include <string>

// convert UTF-8 string to wstring
std::wstring utf8_to_wstring (const std::string& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
    return myconv.from_bytes(str);
}

// convert wstring to UTF-8 string
std::string wstring_to_utf8 (const std::wstring& str) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> myconv;
    return myconv.to_bytes(str);
}

If you want to know more about codepages there is an interesting article on Joel on Software: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.

These CA2W (Convert Ansi to Wide=unicode) macros are part of ATL and MFC String Conversion Macros, samples included.

Sometimes you will need to disable warning 4995; I don't know of another workaround (for me it happened when I compiled for Windows XP in VS2012).

#pragma warning(push)
#pragma warning(disable: 4995)
#include <AtlBase.h>
#include <atlconv.h>
#pragma warning(pop)

Edit:
Well, according to this article, Joel's article can be summed up as: "while entertaining, it is pretty light on actual technical details". Article: What Every Programmer Absolutely, Positively Needs To Know About Encoding And Character Sets To Work With Text.

你在我安 2024-09-03 10:29:04

Windows API only, pre C++11 implementation, in case someone needs it:

#include <stdexcept>
#include <string>
#include <vector>
#include <windows.h>

using std::runtime_error;
using std::string;
using std::vector;
using std::wstring;

wstring utf8toUtf16(const string & str)
{
   if (str.empty())
      return wstring();

   // MultiByteToWideChar returns int; with a NULL buffer it only reports the size needed.
   int charsNeeded = ::MultiByteToWideChar(CP_UTF8, 0,
      str.data(), (int)str.size(), NULL, 0);
   if (charsNeeded == 0)
      throw runtime_error("Failed converting UTF-8 string to UTF-16");

   vector<wchar_t> buffer(charsNeeded);
   int charsConverted = ::MultiByteToWideChar(CP_UTF8, 0,
      str.data(), (int)str.size(), &buffer[0], (int)buffer.size());
   if (charsConverted == 0)
      throw runtime_error("Failed converting UTF-8 string to UTF-16");

   return wstring(&buffer[0], charsConverted);
}
最近可好 2024-09-03 10:29:04

Here's a way to combine string, wstring and mixed string constants into a wstring: use the wstringstream class.

This does NOT work for multi-byte character encodings. This is just a dumb way of throwing away type safety and expanding 7-bit characters from std::string into the lower 7 bits of each character of std::wstring. This is only useful if you have 7-bit ASCII strings and you need to call an API that requires wide strings.

#include <sstream>

std::string narrow = "narrow";
std::wstring wide = L"wide";

std::wstringstream cls;
cls << " abc " << narrow.c_str() << L" def " << wide.c_str();
std::wstring total= cls.str();
何以畏孤独 2024-09-03 10:29:04

From char* to wstring:

const char* str = "hello worlddd"; // string literals are const in C++
wstring wstr (str, str+strlen(str));

From string to wstring:

string str = "hello worlddd";
wstring wstr (str.begin(), str.end());

Note this only works well if the string being converted contains only ASCII characters.
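
To make that ASCII-only restriction explicit rather than silently producing garbage, the iterator-range construction can be guarded. This is only a sketch; widen_ascii is a hypothetical name, and throwing is just one possible way to report the problem:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: widens byte-for-byte, but refuses input that contains
// anything outside 7-bit ASCII instead of producing mangled wide characters.
std::wstring widen_ascii(const std::string& str)
{
    for (unsigned char c : str) {
        if (c > 0x7F)
            throw std::invalid_argument("widen_ascii: non-ASCII byte in input");
    }
    return std::wstring(str.begin(), str.end());
}
```

For input such as the UTF-8 bytes of おはよう this throws, which signals that one of the real encoding conversions from the other answers is needed.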

一念一轮回 2024-09-03 10:29:04

This variant of it is my favourite in real life. It converts the input, if it is valid UTF-8, to the respective wstring. If the input is corrupted, the wstring is constructed out of the single bytes. This is extremely helpful if you cannot really be sure about the quality of your input data.

std::wstring convert(const std::string& input)
{
    try
    {
        std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
        return converter.from_bytes(input);
    }
    catch(std::range_error& e)
    {
        size_t length = input.length();
        std::wstring result;
        result.reserve(length);
        for(size_t i = 0; i < length; i++)
        {
            result.push_back(input[i] & 0xFF);
        }
        return result;
    }
}
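
The "valid UTF-8 or raw bytes" decision can also be made up front instead of by catching an exception. The following is only a sketch of such a check: is_valid_utf8 is a hypothetical helper, not something from the answer or the standard library, and it follows the usual UTF-8 rules (correct continuation bytes, no overlong forms, no surrogates, nothing above U+10FFFF):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `input` is well-formed UTF-8.
bool is_valid_utf8(const std::string& input)
{
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char b0 = static_cast<unsigned char>(input[i]);
        std::size_t len;
        char32_t cp;        // code point being assembled
        char32_t smallest;  // smallest code point allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        smallest = 0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; smallest = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; smallest = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; smallest = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (len > input.size() - i)
            return false;   // sequence is cut off at the end of the input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(input[i + k]);
            if ((b & 0xC0) != 0x80)
                return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong encodings, UTF-16 surrogates and out-of-range values.
        if (cp < smallest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}
```

A caller could then choose between the real conversion and the byte-by-byte fallback based on the result, without relying on std::range_error being thrown.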
谜泪 2024-09-03 10:29:04

Using Boost.Locale:

ws = boost::locale::conv::utf_to_utf<wchar_t>(s);
爱的十字路口 2024-09-03 10:29:04

You can use boost path or std path, which is a lot easier.
boost path is easier for cross-platform applications.

#include <boost/filesystem/path.hpp>

namespace fs = boost::filesystem;

//s to w
std::string s = "xxx";
auto w = fs::path(s).wstring();

//w to s
std::wstring w = L"xxx";
auto s = fs::path(w).string();

If you prefer to use std:

#include <filesystem>
namespace fs = std::filesystem;

//The same

For older C++ versions:

#include <experimental/filesystem>
namespace fs = std::experimental::filesystem;

//The same

The library code still implements a converter internally; you just don't have to deal with the details.

醉梦枕江山 2024-09-03 10:29:04

For me, the simplest option without much overhead is:

Include:

#include <atlbase.h>
#include <atlconv.h>

Convert:

const char* whatever = "test1234"; // string literals are const in C++
std::wstring lwhatever = std::wstring(CA2W(std::string(whatever).c_str()));

If needed:

lwhatever.c_str();
百善笑为先 2024-09-03 10:29:04

String to wstring

std::wstring Str2Wstr(const std::string& str)
{
    int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
    std::wstring wstrTo(size_needed, 0);
    MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
    return wstrTo;
}

wstring to string

std::string Wstr2Str(const std::wstring& wstr)
{
    typedef std::codecvt_utf8<wchar_t> convert_typeX;
    std::wstring_convert<convert_typeX, wchar_t> converterX;
    return converterX.to_bytes(wstr);
}
分開簡單 2024-09-03 10:29:04

If you have Qt and don't want to write a conversion function yourself, you can use

std::string str;
// QString has no constructor taking std::string, so go through fromStdString
QString::fromStdString(str).toStdWString()
他夏了夏天 2024-09-03 10:29:04

Here is my super basic solution. It might not work for everyone, but it would work for a lot of people.

It requires the Guidelines Support Library (GSL),
a fairly official C++ library that was designed by many C++ committee authors:

    std::string to_string(std::wstring const & wStr)
    {
        std::string temp = {};

        for (wchar_t const & wCh : wStr)
        {
            // If the string can't be converted gsl::narrow will throw
            temp.push_back(gsl::narrow<char>(wCh));
        }

        return temp;
    }

All my function does is perform the conversion when it is possible and throw an exception otherwise, by using gsl::narrow (https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es49-if-you-must-use-a-cast-use-a-named-cast).
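
For readers without the GSL at hand, the core idea can be sketched with a plain round-trip check: cast to char, cast back, and reject the value if it changed. This is a simplified stand-in, not the GSL implementation (the real gsl::narrow also compares signs), and the names checked_narrow and narrow_checked are hypothetical:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for gsl::narrow<char>: cast, cast back, and throw
// if the value did not survive the round trip.
char checked_narrow(wchar_t wc)
{
    const char c = static_cast<char>(wc);
    if (static_cast<wchar_t>(c) != wc)
        throw std::range_error("checked_narrow: value does not fit in char");
    return c;
}

// Same shape as to_string above, using the stand-in instead of the GSL.
std::string narrow_checked(std::wstring const& wStr)
{
    std::string temp;
    for (wchar_t const wCh : wStr)
        temp.push_back(checked_narrow(wCh));
    return temp;
}
```

Like the original, this only succeeds for characters that fit in a single char, so in practice it is limited to ASCII text.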

断桥再见 2024-09-03 10:29:04

The s2ws function below works well. Hope it helps.

std::wstring s2ws(const std::string& s) {
    std::string curLocale = setlocale(LC_ALL, ""); 
    const char* _Source = s.c_str();
    size_t _Dsize = mbstowcs(NULL, _Source, 0) + 1;
    wchar_t *_Dest = new wchar_t[_Dsize];
    wmemset(_Dest, 0, _Dsize);
    mbstowcs(_Dest,_Source,_Dsize);
    std::wstring result = _Dest;
    delete []_Dest;
    setlocale(LC_ALL, curLocale.c_str());
    return result;
}
清风无影 2024-09-03 10:29:04

Based on my own testing (on Windows 8, VS2010), mbstowcs can actually damage the original string; it works only with the ANSI code page. MultiByteToWideChar/WideCharToMultiByte can also corrupt strings, but they tend to replace characters they don't know with '?' question marks, whereas mbstowcs tends to stop at the first unknown character and cut the string at that point. (I tested Vietnamese characters on a Finnish Windows.)

So prefer the Multi* Windows API functions over the analogous ANSI C functions.

I've also noticed that the shortest way to convert a string from one code page to another is not the MultiByteToWideChar/WideCharToMultiByte API calls but their ATL macro analogues: W2A / A2W.

So the analogue of the function mentioned above would look like:

wstring utf8toUtf16(const string & str)
{
   USES_CONVERSION;
   _acp = CP_UTF8;
   return A2W( str.c_str() );
}

_acp is declared in the USES_CONVERSION macro.

Or here is another function that I often find missing when converting old data to a new format:

string ansi2utf8( const string& s )
{
   USES_CONVERSION;
   _acp = CP_ACP;
   wchar_t* pw = A2W( s.c_str() );

   _acp = CP_UTF8;
   return W2A( pw );
}

But please note that these macros use the stack heavily: don't call them in for loops or in recursive functions. After using the W2A or A2W macro, it is better to return as soon as possible so that the stack memory used for the temporary conversion is freed.

余厌 2024-09-03 10:29:04

utf-8 implementation

Assuming that your std::string is utf8-encoded, this is a platform-independent implementation of wstring-string conversion functions:

#include <codecvt>
#include <locale>
#include <string>
#include <type_traits>

std::string wstring_to_utf8(std::wstring const& str)
{
  std::wstring_convert<std::conditional<
        sizeof(wchar_t) == 4,
        std::codecvt_utf8<wchar_t>,
        std::codecvt_utf8_utf16<wchar_t>>::type> converter;
  return converter.to_bytes(str);
}

std::wstring utf8_to_wstring(std::string const& str)
{
  std::wstring_convert<std::conditional<
        sizeof(wchar_t) == 4,
        std::codecvt_utf8<wchar_t>,
        std::codecvt_utf8_utf16<wchar_t>>::type> converter;
  return converter.from_bytes(str);
}

The currently most upvoted answer looks similar, but produces incorrect results for non-BMP characters (e.g. emoji) on non-Windows platforms. wchar_t is UTF-16 on Windows, but UTF-32 on most other platforms. The std::conditional takes care of that distinction.
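
To illustrate why the width of wchar_t matters for non-BMP characters, here is a small sketch that appends a single code point to a wstring. The helper name append_code_point is hypothetical; it assumes a 2-byte wchar_t holds UTF-16 and a wider wchar_t holds UTF-32:

```cpp
#include <string>

// Hypothetical helper: appends one Unicode code point to a wstring.
// With a 16-bit wchar_t (as on Windows) a code point above U+FFFF needs
// a surrogate pair; with a 32-bit wchar_t it fits in a single element.
void append_code_point(std::wstring& out, char32_t cp)
{
    if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
        cp -= 0x10000;
        out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    } else {
        out.push_back(static_cast<wchar_t>(cp));
    }
}
```

Appending U+1F600 this way yields two elements where wchar_t is 16 bits wide and one element where it is 32 bits wide, which is the distinction the std::conditional above selects on.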

MSVC Deprecation Warning

On msvc this might generate some deprecation warnings. You can disable these by wrapping the functions in

#pragma warning(push)
#pragma warning(disable : 4996)
<the two functions>
#pragma warning(pop)

Johann Gerell's answer explains why it's ok to disable that warning.

Getting utf-8 on msvc

Note that when you write a normal string in your source (like std::string s = "おはよう";), it won't be UTF-8 encoded by default on MSVC. I would strongly recommend setting your MSVC character set to UTF-8 to address this:
https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-170

长亭外,古道边 2024-09-03 10:29:04

std::string -> wchar_t[] with safe mbstowcs_s function:

auto ws = std::make_unique<wchar_t[]>(s.size() + 1);
mbstowcs_s(nullptr, ws.get(), s.size() + 1, s.c_str(), s.size());

This is from my sample code

物价感观 2024-09-03 10:29:04

Use this code to convert your string to a wstring:

std::wstring string2wString(const std::string& s){
    int len;
    int slength = (int)s.length() + 1;
    len = MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, 0, 0); 
    wchar_t* buf = new wchar_t[len];
    MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, buf, len);
    std::wstring r(buf);
    delete[] buf;
    return r;
}

int main(){
    std::string str="your string"; // the input is a narrow string
    std::wstring wStr=string2wString(str);
    return 0;
}
浅唱々樱花落 2024-09-03 10:29:04

string s = "おはよう"; is an error.

You should use wstring directly:

wstring ws = L"おはよう";