Converting a wide-character string to lowercase in C++

Posted 2024-08-08 06:32:06

How do I convert a wchar_t string from upper case to lower case in C++?

The string contains a mixture of Japanese, Chinese, German and Greek characters.

I thought about using towlower...

http://msdn.microsoft.com/en-us/library/8h19t214%28VS.80%29.aspx

...but the documentation says:

The case conversion of towlower is locale-specific. Only the characters relevant to the current locale are changed in case.

Edit: Maybe I should describe what I'm doing. I receive a Unicode search query from a user. It is originally UTF-8 encoded, but I convert it to a wide-character string (I may have the wording wrong). My debugger (VS2008) correctly shows the Japanese, German, etc. characters in the "variable quick watch". I need to go through another set of Unicode data and find matches for the search string. This is no problem when the search is case sensitive, but a case-insensitive search is harder. My (maybe naive) approach would be to convert both the search query and the data being searched to lower case and then compare them.

Comments (4)

子栖 2024-08-15 06:32:06

If your string contains all those characters, the codeset must be Unicode-based. If implemented properly, Unicode (Chapter 4 'Character Properties') defines character properties including whether the character is upper case and the lower case mapping, and so on.

Given that preamble, the towlower() function from <wctype.h> is the correct tool to use. If it doesn't do the job, you have a QoI (Quality of Implementation) problem to discuss with your vendor. If you find the vendor unresponsive, then look at alternative libraries. In this case, you might consider ICU (International Components for Unicode).
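As a minimal sketch of the towlower() approach: the helper name to_lower_wide is hypothetical, and the code assumes that each wchar_t holds one complete character (no surrogate pairs) and that no mapping changes the string length. Which non-ASCII characters actually get mapped depends on the C library and on the current locale.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper: lowercases each wchar_t with std::towlower,
// which consults the current global C locale.
// Assumption: one wchar_t per character, one-to-one case mappings only.
std::wstring to_lower_wide(std::wstring text) {
    for (wchar_t& ch : text) {
        ch = static_cast<wchar_t>(
            std::towlower(static_cast<std::wint_t>(ch)));
    }
    return text;
}
```

A program would typically call std::setlocale(LC_ALL, "") from <clocale> once at startup before using the helper. Characters without a case distinction, such as Japanese or Chinese ideographs, are returned unchanged.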

深巷少女 2024-08-15 06:32:06

You have a nasty problem on your hands. A Japanese locale will not help with converting German, and vice versa. Some languages have no concept of capitalization at all (toupper and friends would be a no-op there, I suppose). So, can you break your string up into chunks of words from the same language? If you can, then you can convert the pieces and string them back together.

仅一夜美梦 2024-08-15 06:32:06

This SO answer shows how to use facets to handle several locales. If this is on Windows, you can consider using win32 API functions; if you can work with C++.NET (managed C++), you can use the char.ToLower and string.ToLower functions, which are Unicode compliant.
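A rough sketch of the facet idea, not the code from the linked answer: the hypothetical helper to_lower_with_locales applies the std::ctype<wchar_t> facet of each supplied locale in turn. It assumes the caller chooses the locales and that one wchar_t corresponds to one character.

```cpp
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: runs the wide-character ctype facet of every
// given locale over the string, one locale after another.
std::wstring to_lower_with_locales(std::wstring text,
                                   const std::vector<std::locale>& locales) {
    if (text.empty()) {
        return text;
    }
    for (const std::locale& loc : locales) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t>>(loc);
        // Lowercases the range [begin, end) in place.
        facet.tolower(&text[0], &text[0] + text.size());
    }
    return text;
}
```

Locale names are platform specific, and constructing a named locale such as std::locale("de_DE.UTF-8") throws std::runtime_error when that locale is not installed, so a real program would need to handle that case.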

古镇旧梦 2024-08-15 06:32:06

Have a look at _wcslwr_l in <wchar.h> (MSDN).

You should be able to run the function on the input for each of the locales.
