Identifying the position in the original string from a given Boost token_iterator

Posted on 2024-11-04 11:53:01

After a string has been processed with a Boost tokenizer, is it possible to get the position in the original string that a given token iterator points to?

std::string s = "this is the original string";
boost::tokenizer<> tok(s);
for (boost::tokenizer<>::iterator it = tok.begin(); it != tok.end(); ++it)
{
    std::string strToken = *it;
    int charPos = it.?                /* IS THERE A METHOD? */
}

I realize I could create a specific char_separator with a defined list of kept delimiters and specify keep_empty_tokens, then track the progression of the iterator myself, but I was hoping for an easier way that uses just the iterator itself.

无妨# 2024-11-11 11:53:04

If you only need the end of the current token, the base() member function might serve the purpose:

std::string s = "this is the original string";
boost::tokenizer<> tok(s);
for (boost::tokenizer<>::iterator it = tok.begin(); it != tok.end(); ++it)
{
    // Offset in s just past the last character of the current token.
    int charPos = it.base() - s.begin();
}

Unfortunately, there does not seem to be a direct way to retrieve the beginning of the current token from the boost::tokenizer iterator.

一个人的旅程 2024-11-11 11:53:04

How about:

int charPos = it - tok.begin();

爱你是孤单的心事 2024-11-11 11:53:03

This appears to be what you're looking for:

#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>

int main()
{
  typedef boost::tokenizer<> tok_t;

  std::string const s = "this is the original string";
  tok_t const tok(s);
  for (tok_t::const_iterator it = tok.begin(), it_end = tok.end(); it != it_end; ++it)
  {
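    // With the default tokenizer, base() sits just past the current token,
    // so subtracting the token length gives the token's start offset.
    std::string::difference_type const offset = it.base() - s.begin() - it->size();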
    std::cout << offset << "\t::\t" << *it << '\n';
  }
}
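
The subtraction above relies on the token's length matching the number of characters consumed from the input, which may not hold for token functions that rewrite the token (escaped_list_separator, for example, drops quotes and escape characters). If that is a concern, the start offsets can also be recorded by scanning the string by hand. The following is only a minimal sketch, not Boost code: token_offsets is a hypothetical helper, and it assumes a token is a run of characters that are neither whitespace nor punctuation, which approximates how the default boost::tokenizer<> splits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Boost): returns (start offset, token) pairs.
// Assumption: a token is a maximal run of characters that are neither
// whitespace nor punctuation, approximating the default boost::tokenizer<>.
std::vector<std::pair<std::size_t, std::string>> token_offsets(const std::string& s)
{
    // Treat whitespace and punctuation as delimiters.
    auto is_delim = [](char c) {
        unsigned char const uc = static_cast<unsigned char>(c);
        return std::isspace(uc) != 0 || std::ispunct(uc) != 0;
    };

    std::vector<std::pair<std::size_t, std::string>> result;
    std::size_t i = 0;
    while (i < s.size()) {
        // Skip delimiters preceding the next token.
        while (i < s.size() && is_delim(s[i])) ++i;
        if (i == s.size()) break;

        // Record where the token starts, then scan to its end.
        std::size_t const start = i;
        while (i < s.size() && !is_delim(s[i])) ++i;
        result.emplace_back(start, s.substr(start, i - start));
    }
    return result;
}
```

Called on "this is the original string", this returns the tokens with start offsets 0, 5, 8, 12 and 21. The end offset of each token, if needed, is the start offset plus the token's size.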
