How do I store intermediate values of circular buffer iterators?

Posted on 2024-11-29 17:54:50

I am using a Boost regex on a Boost circular buffer and would like to "remember" the positions where matches occur. What is the best way to do this? I tried the code below, but "end" seems to store the same value every time. For example, when I try to traverse from a previous "end" to the most recent "end", it doesn't work.

  boost::circular_buffer<char> cb(2048);
  typedef boost::circular_buffer<char>::iterator ccb_iterator;
  boost::circular_buffer<ccb_iterator> cbi(4);

  // just fill the whole cbi with cb.begin()
  cbi.push_back(cb.begin());
  cbi.push_back(cb.begin());
  cbi.push_back(cb.begin());
  cbi.push_back(cb.begin());

  typedef boost::regex_iterator<ccb_iterator> circular_regex_iterator;

  while (1)
  {
    // insert new data in circular buffer (omitted)
    // basically reads data from file and pushes it back to cb

    ccb_iterator start, end;

    circular_regex_iterator regexItr(
        cb.begin(),
        cb.end(),
        re, // the regular expression
        boost::match_default | boost::match_partial);
    circular_regex_iterator last;

    while (regexItr != last)
    {
      if ((*regexItr)[0].matched == false)
      {
        // partial match
        break;
      }
      else
      {
        // full match:
        start = (*regexItr)[0].first;
        end = (*regexItr)[0].second;

        // I want to store these "end" positions to use later so that I can
        // traverse the buffer between these positions (matches).

        // cbi stores positions of these matches, but this does not seem to work!
        cbi.push_back(end);

        // for example, cbi[2] --> cbi[3] traversal works only the first time
        // this loop is run!
      }

      ++regexItr;
    }
  }

Comments (1)

溇涏 2024-12-06 17:54:50

This is less an answer than an attempt to reconstruct what you're doing. I build a simple circular buffer initialized from a string, iterate over the regex matches in that buffer, and print the matched ranges. Everything seems to work fine.

I would not recommend storing the ranges themselves in a circular buffer; or at the very least the ranges should be stored in pairs.

Here's my test code:

#include <iostream>
#include <iterator>   // std::distance
#include <string>
#include <boost/circular_buffer.hpp>
#include <boost/regex.hpp>
#include "prettyprint.hpp"

typedef boost::circular_buffer<char> cb_char;
typedef boost::regex_iterator<cb_char::iterator> cb_char_regex_it;

int main()
{
  std::string sample = "Hello 12 Worlds 34 ! 56";
  cb_char cbc(8, sample.begin(), sample.end());

  std::cout << cbc << std::endl;    // (*)

  boost::regex expression("\\d+");  // just match numbers

  for (cb_char_regex_it m2, m1(cbc.begin(), cbc.end(), expression); m1 != m2; ++m1)
  {
    const auto & mr = *m1;
    std::cout << "--> " << mr << ", range ["
              << std::distance(cbc.begin(), mr[0].first) << ", "
              << std::distance(cbc.begin(), mr[0].second) << "]" << std::endl;
  }
}

(This uses the pretty printer to print the raw circular buffer; you can remove the line marked (*).)
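
The offsets printed by the test code can be turned back into iterators later by adding them to begin(), which is one way to walk from a previous "end" to the most recent one. The following is only a minimal sketch of that idea, not the code from the question: std::deque<char> and std::regex are stand-ins chosen so that it builds without Boost, and the names char_buffer, match_ends and slice are hypothetical.

```cpp
#include <cstddef>
#include <deque>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Assumption: std::deque<char> stands in for boost::circular_buffer<char>,
// and std::regex for boost::regex, so the sketch needs no Boost headers.
typedef std::deque<char> char_buffer;

// Hypothetical helper: offset (from buf.begin()) of the end of each full match.
std::vector<std::size_t> match_ends(const char_buffer& buf, const std::regex& re)
{
  typedef std::regex_iterator<char_buffer::const_iterator> regex_it;
  std::vector<std::size_t> ends;
  for (regex_it it(buf.begin(), buf.end(), re), last; it != last; ++it)
  {
    ends.push_back(static_cast<std::size_t>(
        std::distance(buf.begin(), (*it)[0].second)));
  }
  return ends;
}

// Hypothetical helper: copy the characters in [from, to),
// where both values are offsets from buf.begin().
std::string slice(const char_buffer& buf, std::size_t from, std::size_t to)
{
  return std::string(buf.begin() + static_cast<std::ptrdiff_t>(from),
                     buf.begin() + static_cast<std::ptrdiff_t>(to));
}
```

With the stored ends in a vector, slice(buf, ends[i], ends[i + 1]) yields the text between two consecutive match ends. The offsets stay meaningful only as long as nothing is removed from the front of the buffer.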


Update: Here's a possible way to store the matches:

#include <cstddef>   // std::size_t
#include <utility>   // std::pair
#include <vector>

typedef std::pair<std::size_t, std::size_t> match_range;
typedef std::vector<match_range>            match_ranges;

/* ... as before ... */

  match_ranges ranges;

  for (cb_char_regex_it m2, m1(cbc.begin(), cbc.end(), expression); m1 != m2; ++m1)
  {
    const auto & mr = *m1;

    ranges.push_back(match_range(std::distance(cbc.begin(), mr[0].first), std::distance(cbc.begin(), mr[0].second)));

    std::cout << "--> " << mr << ", range " << ranges.back() << std::endl;
  }

  std::cout << "All matching ranges: " << ranges << std::endl;
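
Offsets measured from begin() shift whenever old characters fall out of the buffer, which happens in the question's loop as new data is pushed in. One possible way to keep stored ranges usable across iterations is to record them as positions in the whole input stream and count how many characters have been discarded. This is a sketch under assumptions, not part of the original answer: the window type and its members are hypothetical, and std::deque<char> again stands in for boost::circular_buffer<char>.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

typedef std::pair<std::size_t, std::size_t> match_range;  // [first, second)

// Hypothetical wrapper: a bounded character window that remembers how many
// characters have already been discarded from its front.
struct window
{
  std::deque<char> data;   // stand-in for boost::circular_buffer<char>
  std::size_t capacity;    // maximum number of characters kept
  std::size_t dropped;     // characters discarded so far

  explicit window(std::size_t cap) : capacity(cap), dropped(0) {}

  // Append one character, discarding the oldest one when the window is full.
  void push(char c)
  {
    if (capacity == 0) { ++dropped; return; }
    if (data.size() == capacity) { data.pop_front(); ++dropped; }
    data.push_back(c);
  }

  // Convert an offset inside the current window to a stream position.
  std::size_t to_absolute(std::size_t offset) const { return dropped + offset; }

  // True if the stream range r still lies completely inside the window.
  bool holds(const match_range& r) const
  {
    return r.first >= dropped && r.second <= dropped + data.size();
  }

  // Copy out the text of a stream range; call holds(r) first.
  std::string text(const match_range& r) const
  {
    return std::string(
        data.begin() + static_cast<std::ptrdiff_t>(r.first - dropped),
        data.begin() + static_cast<std::ptrdiff_t>(r.second - dropped));
  }
};
```

A range would be stored as match_range(w.to_absolute(first_offset), w.to_absolute(second_offset)) at the time of the match, and checked with holds() before it is read back. A range whose text has been overwritten is then detected rather than silently pointing at newer data.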