UTF8 conversion

Posted on 2024-10-31 18:28:44

I need to generate a UTF8 string to pass to a 3rd party library and I'm having trouble figuring out the right gymnastics... Also, to make matters worse, I'm stuck using C++ Builder 6, and every example I found talks about using std::string, which CBuilder6 evidently has no support for. I'd like to accomplish this without using STL whatsoever.

Here is my code so far, which I can't seem to get working.

wchar_t *SS1;
char *SS2;

SS1 = L"select * from mnemonics;";

int strsize = WideCharToMultiByte(CP_UTF8, 0, SS1, wcslen(SS1), NULL, 0, NULL, NULL);

SS2 = new char[strsize+1];

WideCharToMultiByte(CP_UTF8, 0, SS1, wcslen(SS1), SS2, strsize, NULL, NULL);

The 3rd party library chokes when I pass it SS2 as a parameter. Obviously, I'm on a Windows platform using Microsoft's WideCharToMultiByte, but eventually I would like to not need this function call, as this code must also compile on an embedded platform as well as under Linux. I'll cross that bridge when I get to it.

For now, I just need to be able to convert either a wchar_t or a char string to a UTF8-encoded string, preferably without using any STL. I won't have STL on the embedded platform.

Thanks!

Comments (2)

半葬歌 2024-11-07 18:28:44

Something like this:

extern void someFunctionThatAcceptsUTF8(const char* utf8);

const char* ss1 = "string in system default multibyte encoding";

someFunctionThatAcceptsUTF8( w2u( a2w(ss1) ) ); // that conversion you need:
                                                 // a2w: "ansi" -> widechar string
                                                 // w2u: widechar string -> utf8 string.

You just need to grab and include this file:
http://code.google.com/p/tiscript/source/browse/trunk/sdk/include/aux-cvt.h

It should work on Builder just fine.
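
On Windows, a helper in the spirit of w2u can also be built directly on WideCharToMultiByte. The sketch below is an assumption about how such a helper might look, not the code from aux-cvt.h, and the name wide_to_utf8_win32 is hypothetical. It passes -1 as the input length so that the terminating null is converted too. When an explicit length such as wcslen(SS1) is passed instead, the output is not null-terminated unless the caller writes the '\0' itself, which is one likely reason a library expecting a C string would choke on it.

```cpp
#ifdef _WIN32
#include <windows.h>

// Hypothetical helper (not taken from aux-cvt.h): converts a null-terminated
// wide string to a newly allocated, null-terminated UTF-8 string.
// The caller releases the result with delete[]. Returns NULL on failure.
char *wide_to_utf8_win32(const wchar_t *wide)
{
    // Passing -1 as the length makes the API process the terminating null
    // as well, so the reported size already includes room for '\0'.
    int size = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    if (size <= 0)
        return NULL;

    char *utf8 = new char[size];
    if (WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, size, NULL, NULL) == 0)
    {
        delete[] utf8;
        return NULL;
    }
    return utf8; // already null-terminated
}
#endif
```

The #ifdef guard keeps the Windows-only call out of the embedded and Linux builds mentioned in the question, where a portable encoder would be needed instead.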

峩卟喜欢 2024-11-07 18:28:44

If you're still looking for an answer, here's a simple implementation of a UTF-8 converter in C:

/*
** Encodes one wide character as UTF-8 into buffer and null-terminates it.
** buffer must have room for at least 5 bytes (up to 4 encoded bytes + '\0').
*/

void            ft_to_utf8(wchar_t c, unsigned char *buffer)
{
    if (c < (1 << 7))
        *buffer++ = (unsigned char)(c);
    else if (c < (1 << 11))
    {
        *buffer++ = (unsigned char)((c >> 6) | 0xC0);
        *buffer++ = (unsigned char)((c & 0x3F) | 0x80);
    }
    else if (c < (1 << 16))
    {
        *buffer++ = (unsigned char)((c >> 12) | 0xE0);
        *buffer++ = (unsigned char)(((c >> 6) & 0x3F) | 0x80);
        *buffer++ = (unsigned char)((c & 0x3F) | 0x80);
    }
    else if (c < (1 << 21))
    {
        *buffer++ = (unsigned char)((c >> 18) | 0xF0);
        *buffer++ = (unsigned char)(((c >> 12) & 0x3F) | 0x80);
        *buffer++ = (unsigned char)(((c >> 6) & 0x3F) | 0x80);
        *buffer++ = (unsigned char)((c & 0x3F) | 0x80);
    }
    *buffer = '\0';
}