Matching any number of words in a string with a PCRE regex and collecting them into a string vector

Posted on 2024-07-23 12:04:18

I am using PCRE for some regex parsing and I need to search a string for words in a specific pattern (let's say all words in a string of words separated by commas) and put them into a string vector.

How would I go about doing that?

Comments (2)

陈独秀 2024-07-30 12:04:18

Sorry for the rough code, but I am in a hurry...

  #include <pcre.h>
  #include <cstdio>
  #include <cstring>
  #include <string>
  #include <vector>

  std::vector<std::string> find_words(const char* txt) {
    std::vector<std::string> words;
    const char* error;
    int erroffset;
    int ovector[3];       /* no capture groups, so one triple is enough */
    int subject_length = (int)strlen(txt);

    pcre* re = pcre_compile(
      "\\w+",                          /* the pattern */
      PCRE_CASELESS|PCRE_MULTILINE,    /* options (neither changes how \w+ matches) */
      &error,                          /* for error message */
      &erroffset,                      /* for error offset */
      NULL);                           /* use default character tables */
    if (re == NULL) {
      fprintf(stderr, "Compile failed at offset %d: %s\n", erroffset, error);
      return words;
    }

    int offset = 0;       /* where the next search starts in the subject */
    while (offset < subject_length) {
      int rc = pcre_exec(
        re,               /* the compiled pattern */
        NULL,             /* no extra data - we didn't study the pattern */
        txt,              /* the subject string */
        subject_length,   /* the length of the subject */
        offset,           /* start just after the previous match */
        0,                /* default options */
        ovector,          /* output vector for substring information */
        3);               /* number of elements in the output vector */

      if (rc < 0) {
        if (rc != PCRE_ERROR_NOMATCH) fprintf(stderr, "Matching error %d\n", rc);
        break;            /* no more matches, or an error */
      }

      /* Match succeeded: ovector[0] and ovector[1] delimit the word */
      words.push_back(std::string(txt + ovector[0], ovector[1] - ovector[0]));
      offset = ovector[1];
    }

    pcre_free(re);        /* release memory used for the compiled pattern */
    return words;
  }

故人如初 2024-07-30 12:04:18

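#include <pcrecpp.h>
#include <string>
#include <vector>

std::string wordstring = "w1, w2, w3";
std::string word;
pcrecpp::StringPiece inp_w(wordstring);
// [^\s,]+ rather than \S+: \S also matches the comma, so the capture would include it
pcrecpp::RE w_re("([^\\s,]+),?\\s*");
std::vector<std::string> outwords;

// FindAndConsume advances inp_w past each match until none is left
while (w_re.FindAndConsume(&inp_w, &word)) {
    outwords.push_back(word);
}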
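
The word pattern can also be tried out without linking against PCRE. The following is only a sketch using `std::regex` from the standard library instead of PCRE (its default ECMAScript syntax treats `\s` and negated classes the same way for this simple case). The helper name `split_words` is hypothetical and not part of either library. It matches runs of characters that are neither whitespace nor commas; a class such as `\S` would also swallow the separating comma.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): collects every run of
// characters that is neither whitespace nor a comma.
std::vector<std::string> split_words(const std::string& text) {
    static const std::regex word_re(R"([^\s,]+)");
    std::vector<std::string> words;
    // sregex_iterator walks over all non-overlapping matches in order.
    for (std::sregex_iterator it(text.begin(), text.end(), word_re), end;
         it != end; ++it) {
        words.push_back(it->str());
    }
    return words;
}
```

Calling `split_words("w1, w2, w3")` yields the three words `w1`, `w2` and `w3` with no trailing commas.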
