Case-insensitive string comparison in C
I have two postcodes, each a char*, that I want to compare, ignoring case.
Is there a function to do this?
Or do I have to loop through each string, apply tolower to every character, and then do the comparison?
Any idea how this function will react to numbers in the string?
Thanks
If the strings are null-terminated:
Or with this version that uses bitwise operations:
I'm not sure whether this works with symbols, since I haven't tested that, but it works fine with letters.
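As a minimal sketch of the bitwise idea, assuming ASCII: setting bit 0x20 turns an upper-case letter into its lower-case form. Applied blindly, the same bit would also turn '@' into '`' and '[' into '{', which is why symbols are a concern, so this sketch folds a byte only when the result is a letter. The names fold_ascii and strcicmp_bitwise are made up for this illustration.

```c
/* Assumption: ASCII encoding. Names are hypothetical, for illustration only. */

/* Fold an ASCII letter to lower case by setting bit 0x20.
   Bytes that are not letters (digits, spaces, symbols) are returned unchanged. */
static int fold_ascii(unsigned char c)
{
    unsigned char lowered = (unsigned char)(c | 0x20u);
    return (lowered >= 'a' && lowered <= 'z') ? lowered : c;
}

/* Returns <0, 0 or >0, like strcmp(), ignoring the case of ASCII letters. */
int strcicmp_bitwise(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    /* Stop at the end of a, or at the first folded mismatch. */
    while (*ua != '\0' && fold_ascii(*ua) == fold_ascii(*ub)) {
        ua++;
        ub++;
    }
    return fold_ascii(*ua) - fold_ascii(*ub);
}
```

Digits pass through unchanged, so postcodes such as "SW1A 1AA" and "sw1a 1aa" compare equal.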
Good luck.
Edit: the lowerCaseWord function takes a char* variable and returns its lower-case value. For example, for the char* value "AbCdE" it returns "abcde".
Basically, the function takes the two char* variables, converts them to lower case, and calls strcmp on them.
For example, if we call the strcmpInsensitive function with the values "AbCdE" and "ABCDE", it first converts both to lower case ("abcde") and then runs strcmp on them.
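The description above could be realized roughly as follows. This is a sketch under assumptions, not necessarily the answer's own code: the names lowerCaseWord and strcmpInsensitive come from the text, while the decision to return a freshly allocated copy (which the caller frees) and the handling of allocation failure are choices made for this illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated lower-case copy of s, or NULL if allocation fails.
   Assumption: the caller owns the result and must free() it. */
char *lowerCaseWord(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    /* i <= n so that the terminating '\0' is copied as well. */
    for (size_t i = 0; i <= n; i++)
        out[i] = (char)tolower((unsigned char)s[i]);
    return out;
}

/* Lower-cases both inputs, then compares the copies with strcmp().
   Simplification: reports 0 if an allocation fails; real code should signal an error. */
int strcmpInsensitive(const char *a, const char *b)
{
    char *la = lowerCaseWord(a);
    char *lb = lowerCaseWord(b);
    int result = 0;

    if (la != NULL && lb != NULL)
        result = strcmp(la, lb);
    free(la);
    free(lb);
    return result;
}
```

This approach allocates and walks each string twice, so a character-by-character loop is usually cheaper when the lower-case copies are not needed afterwards.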
There is no function that does this in the C standard. Unix systems that comply with POSIX are required to have strcasecmp in the header strings.h; Microsoft systems have stricmp. To be on the portable side, write your own.
But note that none of these solutions will work with UTF-8 strings, only ASCII ones.
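A portable hand-written version can be quite short. The following is a sketch using only the standard header ctype.h; the name strcicmp is an illustrative choice, and the comparison goes through tolower(), so in the default "C" locale only ASCII letters are folded.

```c
#include <ctype.h>

/* Returns <0, 0 or >0, like strcmp(), comparing characters after tolower().
   The casts to unsigned char avoid undefined behavior when char is signed
   and a byte has a negative value. */
int strcicmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        int d = tolower((unsigned char)*a) - tolower((unsigned char)*b);
        /* Stop at the first difference, or when both strings end together. */
        if (d != 0 || *a == '\0')
            return d;
    }
}
```

For the postcode case, strcicmp("sw1a 1aa", "SW1A 1AA") returns 0; digits and spaces are left untouched by tolower().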
Take a look at strcasecmp() in strings.h.
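On a POSIX system, using it could look like the sketch below. strcasecmp() and the header strings.h are real POSIX interfaces; the wrapper name postcodes_equal is made up for this example.

```c
#include <strings.h>  /* POSIX header, not part of the C standard */

/* Returns nonzero when the two postcodes match, ignoring letter case.
   postcodes_equal is a hypothetical helper name. */
int postcodes_equal(const char *a, const char *b)
{
    return strcasecmp(a, b) == 0;
}
```

Digits and spaces have no case, so they are compared as they are.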
Additional pitfalls to watch out for when doing case-insensitive compares:
Comparing as lower or as upper case? (common enough issue)
Both below will return 0 with strcicmpL("A", "a") and strcicmpU("A", "a"). Yet strcicmpL("A", "_") and strcicmpU("A", "_") can return different signed results, as '_' is often between the upper and lower case letters.
This affects the sort order when used with qsort(..., ..., ..., strcicmp). Non-standard library C functions like the commonly available stricmp() or strcasecmp() tend to be well defined and favor comparing via lowercase. Yet variations exist.
char can have a negative value. (not rare)
toupper(int) and tolower(int) are specified for unsigned char values and the negative EOF. Further, strcmp() returns results as if each char was converted to unsigned char, regardless of whether char is signed or unsigned.
char can have a negative value and not 2's complement. (rare)
The above does not handle -0 nor other negative values properly, as the bit pattern should be interpreted as unsigned char. To properly handle all integer encodings, change the pointer type first.
Locale (less common)
Although character sets using ASCII code (0-127) are ubiquitous, the remaining codes tend to have locale-specific issues. So strcasecmp("\xE4", "a") might return 0 on one system and non-zero on another.
Unicode (the way of the future)
If a solution needs to handle more than ASCII, consider a unicode_strcicmp(). As the C library does not provide such a function, a pre-coded function from some alternate library is recommended. Writing your own unicode_strcicmp() is a daunting task.
Do all letters map one lower to one upper? (pedantic)
[A-Z] maps one-to-one with [a-z], yet various locales map various lower case characters to one upper and vice versa. Further, some uppercase characters may lack a lower case equivalent and, again, vice versa.
This obliges code to convert through both toupper() and tolower().
Again, there are potentially different results for sorting if code did tolower(toupper(*a)) vs. toupper(tolower(*a)).
Portability
@B. Nadolson recommends avoiding rolling your own strcicmp(), and this is reasonable, except when code needs highly equivalent portable functionality.
Below is an approach that even performed faster than some system-provided functions. It does a single compare per loop rather than two by using 2 different tables that differ with '\0'. Your results may vary.
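The two-table idea can be sketched as follows. This is an illustration under assumptions rather than the answer's own listing: it assumes ASCII, builds the tables at run time instead of spelling out 256 initializers, and uses the made-up names low1, low2, init_tables and strcicmp_tables. The second table maps '\0' to 'A', a value the first table never produces, so a single equality test fails as soon as either string ends.

```c
#include <limits.h>

/* Hypothetical names; ASCII assumed. low1 and low2 are identical lower-casing
   tables except for entry 0: low1[0] == 0 while low2[0] == 'A'. */
static unsigned char low1[UCHAR_MAX + 1];
static unsigned char low2[UCHAR_MAX + 1];
static int tables_ready = 0;

/* Not thread-safe; call once up front if several threads compare strings. */
static void init_tables(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++) {
        unsigned char lc = (unsigned char)c;
        if (c >= 'A' && c <= 'Z')
            lc = (unsigned char)(c - 'A' + 'a');
        low1[c] = lc;
        low2[c] = lc;
    }
    /* 'A' never appears in low1 (it maps to 'a'), so low1[x] == low2[0] is
       never true, and low1[0] == 0 never equals low2[y] for any y. */
    low2[0] = 'A';
    tables_ready = 1;
}

int strcicmp_tables(const char *a, const char *b)
{
    /* Change the pointer type first so bytes are read as unsigned char. */
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    if (!tables_ready)
        init_tables();

    /* One comparison per iteration: it fails on a mismatch or at either terminator. */
    while (low1[*ua] == low2[*ub]) {
        ua++;
        ub++;
    }
    /* Compute the result with the same table on both sides. */
    return low1[*ua] - low1[*ub];
}
```

Whether this is actually faster than a library routine depends on the platform, so it is worth measuring before adopting it.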
I've found a built-in method for this, in a header that contains string functions additional to those in the standard header.
Here are the relevant signatures:
I also found its synonym in the xnu kernel (osfmk/device/subrs.c), where it is implemented in the following code, so you wouldn't expect any change in the behavior of the returned number compared to the original strcmp function.
I would use stricmp(). It compares two strings without regard to case.
Note that, in some cases, converting the string to lower case can be faster.
As others have stated, there is no portable function that works on all systems. You can partially circumvent this with a simple ifdef:
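One way such an ifdef could look is sketched below. _WIN32, _stricmp() and strcasecmp() are real identifiers; the macro name compare_nocase is a hypothetical choice for this example, and only the two platform families shown are covered.

```c
#include <string.h>

#if defined(_WIN32)
/* Microsoft C runtime: _stricmp() is declared in <string.h>. */
#define compare_nocase(a, b) _stricmp((a), (b))
#else
/* POSIX systems: strcasecmp() is declared in <strings.h>. */
#include <strings.h>
#define compare_nocase(a, b) strcasecmp((a), (b))
#endif
```

Code elsewhere then calls compare_nocase(first, second) and treats a result of 0 as a match. A platform that offers neither function would still need a hand-written fallback.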
POSIX <strings.h> header file replacement for strcasecmp() and strncasecmp() in C
Update 25 July 2024:
My latest work on this is now here:
The above library contains my implementations of my_strcasecmp() and my_strncasecmp(), and uses Gtest to test them directly against the POSIX functions strcasecmp() and strncasecmp(), which are contained in the non-C-standard POSIX header file named strings.h.
To test and run:
If in Linux, use your regular Bash terminal. If in Windows, use the MSYS2 terminal. See my instructions here: Installing & setting up MSYS2 from scratch, including adding all 7 profiles to Windows Terminal
First, install Gtest by following my instructions here:
How do I build and use googletest (gtest) and googlemock (gmock) with gcc/g++ or clang?
Then, clone my repo and run the unit test file as a Bash script:
Example run and output:
strncmpci(), a direct, drop-in case-insensitive string comparison replacement for strncmp() and strcmp()
I'm not really a fan of the most-upvoted answer here (in part because it seems like it isn't correct since it should continue if it reads a null terminator in either string--but not both strings at once--and it doesn't do this), so I wrote my own.
This is a direct drop-in replacement for strncmp(), and has been tested with numerous test cases, as shown below.
It is identical to strncmp() except:
strncmp() has undefined behavior if either string is a null ptr (see: https://en.cppreference.com/w/cpp/string/byte/strncmp).
It returns INT_MIN as a special sentinel error value if either input string is a NULL ptr.
LIMITATIONS: Note that this code works on the original 7-bit ASCII character set only (decimal values 0 to 127, inclusive), NOT on Unicode characters, such as the Unicode character encodings UTF-8 (the most popular), UTF-16, and UTF-32.
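A function with the behavior described above could look like the following minimal sketch. It is an illustration, not necessarily the author's implementation: the name strncmpci and the INT_MIN sentinel come from the text, while the parameter names and the use of tolower() (which folds only ASCII letters in the default "C" locale) are assumptions.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Case-insensitive counterpart of strncmp(): compares at most num characters.
   Returns INT_MIN if either pointer is NULL, otherwise <0, 0 or >0. */
int strncmpci(const char *str1, const char *str2, size_t num)
{
    if (str1 == NULL || str2 == NULL)
        return INT_MIN;  /* sentinel error value */

    for (size_t i = 0; i < num; i++) {
        int c1 = tolower((unsigned char)str1[i]);
        int c2 = tolower((unsigned char)str2[i]);

        if (c1 != c2)
            return c1 - c2;  /* first difference decides the result */
        if (c1 == '\0')
            break;           /* both strings ended at the same position */
    }
    return 0;
}
```

Because a legitimate difference between two character values can never reach INT_MIN, the sentinel cannot be confused with an ordinary comparison result.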
Here is the code only (no comments):
Fully-commented version:
Test code:
Download the entire sample code, with unit tests, from my eRCaGuy_hello_world repository here: "strncmpci.c":
(this is just a snippet)
Sample output:
You can get an idea of how to implement an efficient one, if you don't have any in the library, from here.
It uses a table for all 256 chars.
Then we just need to traverse the strings and compare the table cells for the given chars:
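A single-table version of this idea might look like the sketch below. It is an assumption-laden illustration, not the linked implementation: the table is filled at run time from tolower() rather than written out as 256 constants, and the names lower_table, build_lower_table and table_strcasecmp are invented here.

```c
#include <ctype.h>
#include <limits.h>

/* One entry per possible unsigned char value (256 on typical platforms). */
static unsigned char lower_table[UCHAR_MAX + 1];
static int lower_table_ready = 0;

/* Not thread-safe; call once at startup if needed from several threads. */
static void build_lower_table(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
        lower_table[c] = (unsigned char)tolower(c);
    lower_table_ready = 1;
}

/* Returns <0, 0 or >0, like strcmp(), comparing table cells instead of raw chars. */
int table_strcasecmp(const char *a, const char *b)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;

    if (!lower_table_ready)
        build_lower_table();

    /* Walk both strings while a has not ended and the table cells match. */
    while (*ua != '\0' && lower_table[*ua] == lower_table[*ub]) {
        ua++;
        ub++;
    }
    return lower_table[*ua] - lower_table[*ub];
}
```

The table lookup replaces a function call per character, which is where the efficiency gain is expected to come from.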
Simple solution: