C++: removing punctuation from a string, erase()/iterator problem
I know I'm not the first person to bring up the problem of calling the erase() method on a string with a reverse iterator. However, I wasn't able to find a good way around it.
I'm reading the contents of a file, which contains a bunch of words. When I read in a word, I want to pass it to a function I have called stripPunct. However, I ONLY want to strip punctuation at the beginning and end of a string, not in the middle.
So for instance:
(word) should strip '(' and ')' resulting in just word
don't! should strip '!' resulting in just don't
So my logic (which I'm sure could be improved) was to have two while loops, one starting at the end and one at the beginning, traversing and erasing until it hits a non-punctuation char.
void stripPunct(string & str) {
    string::iterator itr1 = str.begin();
    string::reverse_iterator itr2 = str.rbegin();
    while ( ispunct(*itr1) ) {
        str.erase(itr1);
        itr1++;
    }
    while ( ispunct(*itr2) ) {
        str.erase(itr2);
        itr2--;
    }
}
However, it obviously doesn't work, because erase() requires a regular iterator and not a reverse_iterator. In any case, I feel like that logic is pretty inefficient.
I also tried using a regular iterator instead of a reverse_iterator, starting it at str.end() and then decrementing it, but I get an error saying I cannot dereference the iterator if I start it at str.end().
Can anyone help me with a good way to do this? Or maybe point out a workaround for what I already have?
Thank you so much in advance!
------------------ [ EDIT ] ----------------------------
Found a solution, although it may not be the best solution:
// Call the stripPunct method:
stripPunct(str);
if ( !str.empty() ) { // make sure string is still valid
    // perform other code
}
And here is the stripPunct method:
void stripPunct(string & str) {
    string::iterator itr1 = str.begin();
    string::iterator itr2 = str.end();
    while ( !(str.empty()) && ispunct(*itr1) )
        itr1 = str.erase(itr1);
    itr2--;
    if ( itr2 != str.begin() ) {
        while ( !(str.empty()) && ispunct(*itr2) ) {
            itr2 = str.erase(itr2);
            itr2--;
        }
    }
}
Comments (3)
First, note a couple of problems with your code:
- Once you've called erase() using itr1, you've invalidated itr2.
- When using a reverse_iterator to go backwards through a sequence, you want to use ++, not -- (that's kind of the reason reverse iterators exist).
Now, to improve the logic, you can avoid erasing each character individually by finding the first character you don't want to erase and erasing everything up to that point. find_if() can be used to help with that.
Note the use of base() to get the 'regular' iterator corresponding to the reverse_iterator. I find the logic for whether base() needs to be adjusted confusing (reverse iterators in general confuse me). In this case it doesn't need adjusting, because we happen to want to start the erase after the character that's found.
This article by Scott Meyers, http://drdobbs.com/cpp/184401406, has a good treatment of reverse_iterator::base() in the section "Guideline 3: Understand How to Use a reverse_iterator's Base iterator". The information in that article has also been incorporated into Meyers' "Effective STL" book.
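The find_if() approach described in the answer above can be sketched roughly as follows. This is a minimal sketch, not necessarily the answerer's exact code: stripPunct is the name from the question, while the helper notPunct is a hypothetical name introduced here for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from the original post): true for characters
// that should be kept. The cast to unsigned char avoids undefined behavior
// when char is signed and holds a negative value.
static bool notPunct(char c) {
    return !std::ispunct(static_cast<unsigned char>(c));
}

void stripPunct(std::string& str) {
    // Erase all leading punctuation with a single range erase.
    str.erase(str.begin(), std::find_if(str.begin(), str.end(), notPunct));

    // Search from the back with reverse iterators. base() of the found
    // reverse iterator refers to the element just after the last kept
    // character, which is exactly where the trailing erase should start.
    str.erase(std::find_if(str.rbegin(), str.rend(), notPunct).base(),
              str.end());
}
```

Each erase is a single range operation on freshly obtained iterators, so no iterator is used after it has been invalidated. A string that is empty or made up entirely of punctuation simply ends up empty.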
You can't dereference str.end() because it refers to the position one past the last character, not to a valid element, so you have to decrement it first.
And one final note: if the word consists only of punctuation, your program will fail, so be sure to handle that case.
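Both points can be illustrated with a small sketch that trims trailing punctuation using only regular iterators. The function name stripTrailingPunct is hypothetical, chosen for this example. The loop looks at the character before the iterator instead of dereferencing end(), and it stops at begin() so that an all-punctuation word is handled.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper for illustration: removes trailing punctuation only.
void stripTrailingPunct(std::string& str) {
    std::string::iterator it = str.end();

    // Never dereference end() itself: inspect the character just before
    // `it`, and stop once `it` reaches begin() (empty or all-punctuation
    // input).
    while (it != str.begin() &&
           std::ispunct(static_cast<unsigned char>(*(it - 1)))) {
        --it;
    }

    // One range erase; `it` is still valid because nothing was erased
    // inside the loop.
    str.erase(it, str.end());
}
```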
If you don't mind negative logic, you can do the following:
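One possible reading of "negative logic" here, offered as an assumption: instead of looking for punctuation to remove, look for the first and last characters that are not punctuation and keep what lies between them. A minimal sketch using std::string::find_first_not_of and find_last_not_of follows. The function name stripPunctCopy and the hard-coded ASCII punctuation list are both illustrative choices, not taken from the post.

```cpp
#include <string>

// Hypothetical helper for illustration: returns a trimmed copy rather than
// modifying the argument. The punctuation set is an assumed ASCII list.
std::string stripPunctCopy(const std::string& str) {
    static const char* const kPunct = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

    // Index of the first character that is NOT punctuation.
    const std::string::size_type first = str.find_first_not_of(kPunct);
    if (first == std::string::npos) {
        return std::string();  // empty or all-punctuation input
    }

    // Index of the last character that is NOT punctuation.
    const std::string::size_type last = str.find_last_not_of(kPunct);
    return str.substr(first, last - first + 1);
}
```

Unlike ispunct(), a fixed character list ignores the current locale, which may or may not be what you want.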