How do you reverse a string in C or C++?
How do you reverse a string in C or C++ without requiring a separate buffer to hold the reversed string?
This is the simplest way in C++.
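A minimal sketch of what that probably looks like, assuming the answer refers to std::reverse from <algorithm> (the helper name reverse_in_place is made up for illustration):

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: reverses s in place.
// std::reverse swaps elements pairwise from both ends,
// so no second buffer is allocated.
void reverse_in_place(std::string& s) {
    std::reverse(s.begin(), s.end());
}
```

Calling reverse_in_place on a std::string holding "hello" leaves "olleh" in the same object.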
Read Kernighan and Ritchie
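For readers without the book at hand, here is a sketch in the spirit of the index-based reverse routine from K&R. It is written from memory of the idea, not copied from the book, and the name reverse_kr is an assumption:

```cpp
#include <cstring>

// Index-based in-place reversal: i walks up from the front,
// j walks down from the back, swapping until they meet.
void reverse_kr(char s[]) {
    if (s[0] == '\0') return;  // empty string: strlen(s) - 1 would wrap around
    for (std::size_t i = 0, j = std::strlen(s) - 1; i < j; ++i, --j) {
        char c = s[i];
        s[i] = s[j];
        s[j] = c;
    }
}
```

The explicit empty-string check is only needed because this sketch uses unsigned indices.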
The standard algorithm is to use pointers to the start / end, and walk them inward until they meet or cross in the middle. Swap as you go.
Reverse ASCII string, i.e. a 0-terminated array where every character fits in 1 char (or other non-multibyte character sets). The same algorithm works for integer arrays with known length; just use tail = start + length - 1 instead of the end-finding loop.
(Editor's note: this answer originally used XOR-swap for this simple version, too. It was fixed for the benefit of future readers of this popular question. XOR-swap is strongly discouraged: it is hard to read and makes your code compile less efficiently. On the Godbolt compiler explorer you can see how much more complicated the asm loop body is when xor-swap is compiled for x86-64 with gcc -O3.)
Ok, fine, let's fix the UTF-8 chars...
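One way to do that, sketched under the assumption that the input is valid UTF-8: reverse all the bytes first, then walk the result and re-reverse each multi-byte sequence so its lead byte comes first again. The name utf8_reverse is illustrative, and this version swaps via std::reverse instead of a swap macro. It treats each code point as a unit, so combining sequences will still come out in the wrong order.

```cpp
#include <algorithm>
#include <cstring>

// UTF-8 continuation bytes have the bit pattern 10xxxxxx.
static bool is_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Reverse a 0-terminated UTF-8 string in place, code point by code point.
// Assumes well-formed UTF-8; malformed input is not validated.
void utf8_reverse(char* s) {
    std::size_t n = std::strlen(s);
    std::reverse(s, s + n);  // step 1: plain byte reversal

    // Step 2: after step 1 every multi-byte character appears as
    // continuation bytes followed by its lead byte; flip each such run back.
    std::size_t i = 0;
    while (i < n) {
        std::size_t j = i;
        while (j < n && is_continuation(static_cast<unsigned char>(s[j]))) ++j;
        if (j > i && j < n) {
            std::reverse(s + i, s + j + 1);  // bytes i..j inclusive
        }
        i = j + 1;
    }
}
```

For the bytes 61 C3 B1 6F ("año"), step 1 gives 6F B1 C3 61 and step 2 restores the pair to C3 B1, leaving "oña".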
(This is the XOR-swap thing. Take care to note that you must avoid swapping with self, because if *p and *q are the same location you'll zero it with a^a==0. XOR-swap depends on having two distinct locations, using them each as temporary storage.)
Editor's note: you can replace SWP with a safe inline function using a tmp variable.
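To make the self-swap hazard concrete, here is a small sketch that puts an XOR-swap next to a tmp-based swap. Both function names are invented for this illustration; the original SWP macro is not reproduced here.

```cpp
// XOR-swap: only correct when a and b point to two distinct objects.
// If a == b, the first statement computes *a ^ *a == 0 and the value is lost.
inline void xor_swap(char* a, char* b) {
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

// Swap through a temporary: safe even when a == b.
inline void swap_chars(char* a, char* b) {
    char tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Calling xor_swap(&c, &c) leaves c equal to 0, while swap_chars(&c, &c) leaves it unchanged. This matters in a reversal loop that lets the two pointers meet on the middle character.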
Examples:
Non-evil C, assuming the common case where the string is a null-terminated char array:
It's been a while and I don't remember which book taught me this algorithm, but I thought it was quite ingenious and simple to understand:
A visualization of this algorithm, courtesy of slashdottir:
Note that the beauty of std::reverse is that it works with char * strings and std::wstrings just as well as std::strings.
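A short sketch of that flexibility: std::reverse only needs a pair of bidirectional iterators, and raw pointers qualify. The helper names reverse_c_string and reverse_any are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// char* case: raw pointers serve as the iterator pair.
void reverse_c_string(char* s) {
    std::reverse(s, s + std::strlen(s));
}

// Works for std::string, std::wstring, or anything else with begin()/end().
template <typename String>
void reverse_any(String& s) {
    std::reverse(s.begin(), s.end());
}
```

In every case the unit being swapped is one element of the sequence (a char or a wchar_t), not necessarily one user-visible character.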
If you're looking to reverse null-terminated buffers, most solutions posted here are OK. But, as Tim Farley already pointed out, these algorithms work only if it's valid to assume that a string is semantically an array of bytes (i.e. a single-byte string), which I think is a wrong assumption.
Take, for example, the string "año" (year in Spanish).
The Unicode code points are 0x61, 0xf1, 0x6f.
Consider some of the most used encodings:
Latin1 / iso-8859-1 (single byte encoding, 1 character is 1 byte and vice versa):
UTF-8:
UTF-16 Big Endian:
UTF-16 Little Endian:
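The byte sequences for each of these encodings can be worked out from the three code points. The sketch below does so with hand-rolled encoders. All names are made up for the example, and the encoders deliberately handle only code points below 0x800 (below 0x100 for Latin1), which is enough for "año".

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using CodePoints = std::vector<char32_t>;

// Latin1: one byte per character (assumes every code point is < 0x100).
Bytes to_latin1(const CodePoints& cps) {
    Bytes out;
    for (char32_t c : cps) out.push_back(static_cast<std::uint8_t>(c));
    return out;
}

// UTF-8, restricted to code points < 0x800 (one- and two-byte forms only).
Bytes to_utf8(const CodePoints& cps) {
    Bytes out;
    for (char32_t c : cps) {
        if (c < 0x80) {
            out.push_back(static_cast<std::uint8_t>(c));
        } else {
            out.push_back(static_cast<std::uint8_t>(0xC0 | (c >> 6)));    // lead byte
            out.push_back(static_cast<std::uint8_t>(0x80 | (c & 0x3F)));  // continuation
        }
    }
    return out;
}

// UTF-16 without surrogates (assumes code points < 0x10000).
Bytes to_utf16(const CodePoints& cps, bool big_endian) {
    Bytes out;
    for (char32_t c : cps) {
        std::uint8_t hi = static_cast<std::uint8_t>(c >> 8);
        std::uint8_t lo = static_cast<std::uint8_t>(c & 0xFF);
        out.push_back(big_endian ? hi : lo);
        out.push_back(big_endian ? lo : hi);
    }
    return out;
}
```

For the code points 0x61, 0xf1, 0x6f this gives 61 F1 6F in Latin1, 61 C3 B1 6F in UTF-8, 00 61 00 F1 00 6F in UTF-16 big endian, and 61 00 F1 00 6F 00 in little endian. A byte-wise reversal is only harmless for Latin1. In UTF-8 it produces 6F B1 C3 61, where a continuation byte precedes its lead byte, and in UTF-16 it flips the byte order inside every code unit.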
In the interest of completeness, it should be pointed out that there are representations of strings on various platforms in which the number of bytes per character varies depending on the character. Old-school programmers would refer to this as DBCS (Double Byte Character Set). Modern programmers more commonly encounter this in UTF-8 (as well as UTF-16 and others). There are other such encodings as well.
In any of these variable-width encoding schemes, the simple algorithms posted here (evil, non-evil or otherwise) would not work correctly at all! In fact, they could even cause the string to become illegible or even an illegal string in that encoding scheme. See Juan Pablo Califano's answer for some good examples.
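std::reverse() does not rescue you here by itself: std::string iterators step over individual char elements, not characters, so on a UTF-8 string it reverses bytes just like the hand-written loops. It gives the expected result only when each element of the sequence is already a whole character, for example a std::u32string holding one code point per element.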
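As a hedged illustration of how std::reverse treats the standard string types, the sketch below contrasts a UTF-8 std::string with a std::u32string. The function names are assumptions for the example, and combining sequences are ignored.

```cpp
#include <algorithm>
#include <string>

// Reverses the char elements of s; for UTF-8 that means individual bytes.
std::string reverse_bytes(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}

// Reverses char32_t elements; each element is one whole code point.
std::u32string reverse_code_points(std::u32string s) {
    std::reverse(s.begin(), s.end());
    return s;
}
```

Feeding the UTF-8 bytes 61 C3 B1 6F to reverse_bytes returns 6F B1 C3 61, which is no longer valid UTF-8, whereas reverse_code_points on U"a\u00F1o" returns the code points of "oña".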
Another C++ way (though I would probably use std::reverse() myself :) as being more expressive and faster)
The C way (more or less :) )
And please be careful with the XOR trick for swapping: compilers sometimes cannot optimize it, and in that case it is usually much slower.
This code produces this output:
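In case you are using GLib, it has two functions for that: g_strreverse(), which reverses the bytes of a string in place, and g_utf8_strreverse(), which is UTF-8 aware but returns a newly allocated string rather than working in place.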
I like Evgeny's K&R answer. However, it is nice to see a version using pointers. Otherwise, it's essentially the same:
Recursive function to reverse a string in place (no extra buffer, malloc).
Short, sexy code. Bad, bad stack usage.
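A sketch of what such a recursive reversal could look like (the names reverse_range and reverse_recursive are placeholders, not necessarily the answer's own code):

```cpp
#include <cstring>

// Swap the outermost pair, then recurse on the range between them.
static void reverse_range(char* first, char* last) {
    if (first >= last) return;  // zero or one character left: done
    char tmp = *first;
    *first = *last;
    *last = tmp;
    reverse_range(first + 1, last - 1);
}

// Entry point for a 0-terminated single-byte string.
void reverse_recursive(char* s) {
    std::size_t n = std::strlen(s);
    if (n > 1) reverse_range(s, s + n - 1);
}
```

No heap buffer is used, but the recursion goes about n/2 calls deep. A compiler may turn the tail call into a loop, but nothing guarantees it, so a long string can exhaust the stack.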
If you are using ATL/MFC CString, simply call CString::MakeReverse().
Yet another:
C++ multi-byte UTF-8 reverser
My thought is that you can never just swap ends, you must always move from beginning-to-end, move through the string and look for "how many bytes will this character require?" I attach the character starting at the original end position, and remove the character from the front of the string.
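Here is a rough sketch of that idea on a std::string, assuming well-formed UTF-8. The names utf8_char_len and utf8_reverse_by_moving are invented here. It keeps a one-character temporary and is quadratic, because every step erases from the front.

```cpp
#include <algorithm>
#include <string>

// Number of bytes in the character that starts with this lead byte.
static std::size_t utf8_char_len(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;  // stray continuation or invalid byte: treat as a single byte
}

// Repeatedly move the first character to the original end position.
// 'end' marks where the not-yet-moved prefix stops.
void utf8_reverse_by_moving(std::string& s) {
    std::size_t end = s.size();
    while (end > 0) {
        std::size_t n = std::min(utf8_char_len(static_cast<unsigned char>(s[0])), end);
        std::string ch = s.substr(0, n);  // the front character, whole
        s.insert(end, ch);                // attach it at the end position
        s.erase(0, n);                    // remove it from the front
        end -= n;
    }
}
```

Each character is copied as a unit, so its bytes never get reordered. The cost is the repeated shifting of the remaining string.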
In C++ the reverse can be done in a function:
Input string, return string; no other library required.
If you don't need to store it, you can reduce the time spent like this: