Unicode ICU4C issue converting between UTF-16 and WCS (Mac OS X only)

Posted on 2024-12-10 21:39:26

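I use ICU4C in C++ software built for Windows, Linux, and Mac OS X. I have a problem only on Mac OS X, and only when converting between UTF-16 and WCS (calling u_strToWCS): the non-ASCII Unicode characters are simply replaced with a fixed character (value 26 in the test below).

The ICU4C version does not seem to matter: I tried the latest release yesterday and got the same result.

My Mac OS X version is 10.6.6 (Snow Leopard), and the compiler is GCC i686-apple-darwin10-gcc-4.2.1.

I also tried switching between shared and static libraries; nothing changed.

The code below reproduces the issue. Look at the variables "c1", "c2", and "c3": Windows and Linux give the same results, but Mac OS X does not.

I can't tell whether this is a compilation problem, an ICU bug, or something else.

I hope someone can point me in a direction, or at least confirm my test results.

Thanks.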

    #include <cstdio>
    #include <unicode/ustring.h>

    int main()
    {
        // UTF-16 code units of this string: http://pastebin.com/HW06TaA9
        // (the same data as the original byte-by-byte little-endian buffer,
        // including the terminating NUL).
        static const UChar source[] = {
            0x0054, 0x0065, 0x0073, 0x0074, 0x0020,                  // "Test "
            0x6FB3, 0x9580, 0x7279, 0x522B, 0x884C, 0x653F, 0x5340,  // CJK characters
            0x007D, 0x0000                                           // '}' and NUL
        };

        const int32_t capacity = 100;
        wchar_t dest[capacity] = {0};
        int32_t destLength = 0;

        UErrorCode status = U_ZERO_ERROR;
        // A srcLength of -1 tells ICU that the source is NUL-terminated.
        u_strToWCS(dest, capacity, &destLength, source, -1, &status);
        if (U_FAILURE(status))
        {
            std::fprintf(stderr, "u_strToWCS failed: %s\n", u_errorName(status));
            return 1;
        }

        wchar_t c1 = dest[2];  // ASCII char. Win: 115, Linux: 115, OS-X: 115
        wchar_t c2 = dest[5];  // CJK char. Win: 28595, Linux: 28595, OS-X: 26
        wchar_t c3 = dest[6];  // CJK char. Win: 38272, Linux: 38272, OS-X: 26
        std::printf("c1=%ld c2=%ld c3=%ld\n", (long) c1, (long) c2, (long) c3);
        return 0;
    }

Comments (1)

伴梦长久 2024-12-17 21:39:26

See this bug. It has already been fixed in the latest ICU, and I've included a workaround: https://ssl.icu-project.org/trac/ticket/8894#comment:4
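
The value 26 reported on OS X is 0x1A, the ASCII SUB control character that charset converters commonly emit as a substitution character, which suggests the text passed through a non-Unicode default codepage on the way to wchar_t. As an illustration only (this is not the workaround from the ticket, and `utf16ToWide` is a hypothetical helper, not an ICU function), here is a minimal sketch of converting UTF-16 code units to wchar_t directly, assuming wchar_t holds UTF-16 when it is 2 bytes wide and UTF-32 when it is 4 bytes wide:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ICU): converts UTF-16 code units to a
// std::wstring without going through any platform charset converter.
// Assumption: wchar_t is UTF-16 when 2 bytes wide, UTF-32 when 4 bytes wide.
std::wstring utf16ToWide(const std::uint16_t* src, std::size_t length)
{
    std::wstring out;
    out.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        const std::uint32_t unit = src[i];
        if (sizeof(wchar_t) == 2) {
            // 16-bit wchar_t: code units map one to one.
            out.push_back(static_cast<wchar_t>(unit));
            continue;
        }
        // 32-bit wchar_t: combine a valid surrogate pair into one code point.
        const bool isLead = unit >= 0xD800 && unit <= 0xDBFF;
        if (isLead && i + 1 < length) {
            const std::uint32_t trail = src[i + 1];
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                const std::uint32_t cp =
                    0x10000 + ((unit - 0xD800) << 10) + (trail - 0xDC00);
                out.push_back(static_cast<wchar_t>(cp));
                ++i;  // the trail surrogate has been consumed
                continue;
            }
        }
        // BMP code unit is copied as-is; an unpaired surrogate becomes U+FFFD.
        const bool isSurrogate = unit >= 0xD800 && unit <= 0xDFFF;
        out.push_back(static_cast<wchar_t>(isSurrogate ? 0xFFFD : unit));
    }
    return out;
}
```

Fed the code units from the question, this should yield 115 at index 2, 28595 at index 5, and 38272 at index 6 regardless of the platform's wchar_t width, since all of those characters are in the Basic Multilingual Plane.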
