Testing wchar_t* for convertible characters

Posted on 2024-12-01 18:39:45

I'm working on talking to a library that handles strings as wchar_t arrays. I need to convert these to char arrays so that I can hand them over to Python (using SWIG and Python's PyString_FromString function). Obviously not all wide characters can be converted to chars. According to the documentation for wcstombs, I ought to be able to do something like

wcstombs(NULL, wideString, wcslen(wideString))

to test the string for unconvertible characters -- it's supposed to return -1 if there are any. However, in my test case it's always returning -1. Here's my test function:

void getString(wchar_t* target, int size) {
    int i;
    for(i = 0; i < size; ++i) {
        target[i] = L'a' + i;
    }
    printf("Generated %d characters, nominal length %d, compare %d\n", size, 
            wcslen(target), wcstombs(NULL, target, size));
}    

This is generating output like this:

Generated 32 characters, nominal length 39, compare -1
Generated 16 characters, nominal length 20, compare -1
Generated 4 characters, nominal length 6, compare -1

Any idea what I'm doing wrong?

On a related note, if you know of a way to convert directly from wchar_t*s to Python unicode strings, that'd be welcome. :) Thanks!

Comments (1)

冷…雨湿花 2024-12-08 18:39:45

Clearly, as you found, it's essential to zero-terminate your input data.
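
A minimal sketch of a terminated version of the test, assuming a POSIX-style wcstombs that accepts a NULL destination and then ignores the length argument. The helper names fill_letters and is_convertible are made up for illustration and are not from the original post. Note also that on ASCII-based systems L'a' + 31 is 128, so the 32-character case generates a character outside ASCII, which the default "C" locale may refuse to convert even once the string is terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from the original post): writes `count`
   consecutive characters starting at L'a' into target and terminates
   the string. target must have room for count + 1 wide characters. */
static void fill_letters(wchar_t *target, size_t count) {
    size_t i;
    assert(target != NULL);
    for (i = 0; i < count; ++i) {
        target[i] = (wchar_t)(L'a' + (wchar_t)i);
    }
    target[count] = L'\0';  /* the step missing from the original test */
}

/* Hypothetical helper: returns 1 if every character of the terminated
   wide string converts in the current locale, 0 otherwise. */
static int is_convertible(const wchar_t *wide) {
    assert(wide != NULL);
    /* With a NULL destination, a POSIX wcstombs only measures the result.
       Failure is reported as (size_t)-1, so compare against that value
       instead of printing the return value with %d. */
    return wcstombs(NULL, wide, 0) != (size_t)-1;
}
```

With a buffer of 27 wide characters, fill_letters(buf, 26) followed by is_convertible(buf) should report success in the default locale, since only the letters a to z are generated. When printing the results of wcslen or wcstombs, %zu is the matching printf conversion for size_t.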

Regarding the final paragraph, I would convert from wide to UTF-8 and call PyUnicode_FromString.
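
One way the wide-to-UTF-8 step might look is sketched below. It assumes each wchar_t holds a single Unicode code point (UTF-32, as on typical Linux systems); on platforms where wchar_t is UTF-16, surrogate pairs would have to be combined first, which this sketch does not do. The function name wide_to_utf8 is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: encodes a terminated wide string as UTF-8.
   Assumes one Unicode code point per wchar_t. Returns a malloc'd,
   NUL-terminated buffer that the caller must free, or NULL on an
   invalid code point or allocation failure. */
static char *wide_to_utf8(const wchar_t *wide) {
    size_t len, i;
    char *out, *p;

    assert(wide != NULL);
    len = wcslen(wide);
    out = malloc(len * 4 + 1);  /* at most 4 bytes per code point */
    if (out == NULL) {
        return NULL;
    }
    p = out;
    for (i = 0; i < len; ++i) {
        unsigned long c = (unsigned long)wide[i];
        if (c < 0x80) {
            *p++ = (char)c;
        } else if (c < 0x800) {
            *p++ = (char)(0xC0 | (c >> 6));
            *p++ = (char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            if (c >= 0xD800 && c <= 0xDFFF) {  /* lone surrogate */
                free(out);
                return NULL;
            }
            *p++ = (char)(0xE0 | (c >> 12));
            *p++ = (char)(0x80 | ((c >> 6) & 0x3F));
            *p++ = (char)(0x80 | (c & 0x3F));
        } else if (c <= 0x10FFFF) {
            *p++ = (char)(0xF0 | (c >> 18));
            *p++ = (char)(0x80 | ((c >> 12) & 0x3F));
            *p++ = (char)(0x80 | ((c >> 6) & 0x3F));
            *p++ = (char)(0x80 | (c & 0x3F));
        } else {  /* outside the Unicode range */
            free(out);
            return NULL;
        }
    }
    *p = '\0';
    return out;
}
```

The returned buffer is the kind of NUL-terminated UTF-8 string that PyUnicode_FromString expects. The CPython C API also has PyUnicode_FromWideChar, which takes a wchar_t pointer and a length directly and may make the intermediate buffer unnecessary.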

Note that I am assuming you are using Python 2.x; it's presumably all different in Python 3.x.
