C++: stripping punctuation from a string, erase()/iterator problem

Posted 2024-11-06 22:22:13

I know I'm not the first person to ask about calling the erase() method on a string with a reverse iterator. However, I wasn't able to find a good way around the problem.

I'm reading the contents of a file, which contains a bunch of words. When I read in a word, I want to pass it to a function I have called stripPunct. However, I ONLY want to strip punctuation at the beginning and end of a string, not in the middle.

So for instance:

(word) should strip '(' and ')' resulting in just word

don't! should strip '!' resulting in just don't

So my logic (which I'm sure could be improved) was to have two while loops, one starting at the end and one at the beginning, traversing and erasing until it hits a non-punctuation char.

void stripPunct(string & str) {
    string::iterator itr1 = str.begin();
    string::reverse_iterator itr2 = str.rbegin();

    while ( ispunct(*itr1) ) {
        str.erase(itr1);
        itr1++;
    }

    while ( ispunct(*itr2) ) {
        str.erase(itr2);
        itr2--;
    }
}

However, obviously it's not working because erase() requires a regular iterator and not a reverse_iterator. But anyways, I feel like that logic is pretty inefficient.

I also tried using a regular iterator instead of a reverse_iterator, starting it at str.end() and then decrementing it, but I get an error saying I cannot dereference the iterator if I start it at str.end().

Can anyone help me with a good way to do this? Or maybe point out a workaround for what I already have?

Thank you so much in advance!

------------------ [ EDIT ] ----------------------------

Found a solution, although it may not be the best one:

// Call the stripPunct method:

stripPunct(str);
if ( !str.empty() ) { // make sure string is still valid
  // perform other code
}

And here is the stripPunct method:

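void stripPunct(string & str) {
   // Erase leading punctuation; erase() returns the iterator that follows
   // the removed character, so itr1 stays valid.
   string::iterator itr1 = str.begin();
   while ( itr1 != str.end() && ispunct((unsigned char) *itr1) )
       itr1 = str.erase(itr1);

   // Erase trailing punctuation. str.end() is re-read on every pass because
   // erase() invalidates iterators obtained earlier, and the empty() check
   // keeps us from stepping back from end() on an empty string.
   while ( !str.empty() && ispunct((unsigned char) *(str.end() - 1)) )
       str.erase(str.end() - 1);
}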

Comments (3)

她如夕阳 2024-11-13 22:22:13

First, note a couple problems with your code:

  • after you call erase() using itr1, you've invalidated itr2.
  • when using a reverse_iterator to go backwards through a sequence, you want to use ++, not -- (that's kind of the reason reverse iterators exist).

Now, to improve the logic, you can avoid erasing each character individually by finding the first character you don't want to erase and erasing everything up to that point. find_if() can be used to help with that:

int not_punct(char c) {
    return !ispunct((unsigned char) c);
}

void stripPunct(string & str) {
    string::iterator itr = find_if( str.begin(), str.end(), not_punct);

    str.erase( str.begin(), itr);

    string::reverse_iterator ritr = find_if( str.rbegin(), str.rend(), not_punct);

    str.erase( ritr.base(), str.end());
}

Note that I've used base() to get the 'regular' iterator corresponding to the reverse_iterator. I find the logic for whether base() needs to be adjusted confusing (reverse iterators in general confuse me). In this case it doesn't need adjusting, because base() refers to the position just after the character that was found, which is exactly where we want the erase to start.
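To make the base() relationship concrete, here is a minimal sketch. The helper name trailingPunctCount is made up for illustration and is not part of the answer above. It walks a reverse iterator with ++ and then uses base() to measure the trailing run of punctuation:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (name invented for this sketch): returns how many
// trailing characters of s are punctuation.
std::size_t trailingPunctCount(const std::string& s) {
    std::string::const_reverse_iterator rit = s.rbegin();
    // ++ on a reverse iterator moves toward the front of the string.
    while (rit != s.rend() && std::ispunct(static_cast<unsigned char>(*rit))) {
        ++rit;
    }
    // rit.base() refers to the position one past the element *rit, so the
    // range [rit.base(), s.end()) is exactly the trailing punctuation.
    return static_cast<std::size_t>(s.end() - rit.base());
}
```

For a string made up entirely of punctuation, rit ends up equal to s.rend(), whose base() is s.begin(), so the count is the whole length and nothing is dereferenced past either end.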

This article by Scott Meyers, http://drdobbs.com/cpp/184401406, has a good treatment of reverse_iterator::base() in the section "Guideline 3: Understand How to Use a reverse_iterator's Base iterator". The information in that article has also been incorporated into Meyers's "Effective STL" book.

紫轩蝶泪 2024-11-13 22:22:13

You can't dereference str.end() because it is a past-the-end iterator: it marks the position right after the last character and does not refer to any element, so you have to decrement it first.

And one final note: if the word consists only of punctuation, your program will fail, so be sure to handle that case.
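Both points can be handled with plain indices, as in the rough sketch below. The function name stripPunctByIndex is invented here, and this is one possible approach rather than the asker's code:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical variant of stripPunct (name invented for this sketch) that
// trims with indices instead of erasing one character at a time.
void stripPunctByIndex(std::string& str) {
    std::size_t first = 0;
    std::size_t last = str.size();  // one past the last character, like str.end()

    // Skip leading punctuation.
    while (first < last && std::ispunct(static_cast<unsigned char>(str[first]))) {
        ++first;
    }
    // Look at str[last - 1], never str[last]: the position "last" itself is
    // past the end, so we step back before reading a character.
    while (last > first && std::ispunct(static_cast<unsigned char>(str[last - 1]))) {
        --last;
    }
    // If the word was all punctuation, first == last and the result is empty.
    str = str.substr(first, last - first);
}
```

Because both loops stop once first meets last, a word made only of punctuation becomes an empty string instead of running off either end, and the caller can check str.empty() afterwards.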

云淡月浅 2024-11-13 22:22:13

If you don't mind negative logic, you can do the following:

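string tmp_str;
tmp_str.reserve(str.length());
for (string::iterator itr1 = str.begin(); itr1 != str.end(); ++itr1)
{
   // Copy every character that is not punctuation; the cast avoids
   // undefined behavior when char is signed and the value is negative.
   if (!ispunct((unsigned char) *itr1))
   {
      tmp_str.push_back(*itr1);
   }
}
str = tmp_str;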
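The same filtering can be written in place with the erase-remove idiom. This is only a sketch: it assumes C++11 for the lambda, and the name removeAllPunct is invented here. Like the loop above, it removes every punctuation character, including ones in the middle of a word, so "don't!" becomes "dont" rather than "don't":

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (name invented for this sketch): removes ALL
// punctuation from str, not only the leading and trailing characters.
void removeAllPunct(std::string& str) {
    // remove_if shifts the kept characters to the front and returns the new
    // logical end; erase() then drops the leftover tail.
    str.erase(std::remove_if(str.begin(), str.end(),
                             [](unsigned char c) { return std::ispunct(c) != 0; }),
              str.end());
}
```

If interior punctuation must survive, as the question asks, a trim-only approach such as find_if from both ends is the better fit.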