I can't count more than one character

Posted on 2025-02-06 11:45:24

I'm trying to build a function that counts how many sentences there are in a text, using the characters ?, ! and . to mark the end of a sentence.

For some reason, no matter how many of them there are, the function never counts more than one, and it counts that one only if it is at the very end of the text.

This is the function:

int count_sentences(string text) {
    int count = 0;
    for (int i = 0; i < strlen(text); i++) {
        if (strcmp(&text[i], "?") == 0 || strcmp(&text[i], "!") == 0
            || strcmp(&text[i], ".") == 0) {
            count += 1;
        }
    }
    return count;
}

Comments (5)

如梦 2025-02-13 11:45:24

strcmp is for comparing null-terminated strings. You just want to compare characters.

Let's assume text contains ".ABC".

During the first iteration (when i is 0), &text[i] points to the string ".ABC", so strcmp(&text[i], ".") actually compares the string ".ABC" to the string ".", and they are, of course, not equal.

Your if statement should be like this:

if (text[i] == '?' || text[i] == '!' || text[i] == '.')
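
To make the difference concrete, here is a minimal sketch that puts the two kinds of test side by side. The helper names matches_with_strcmp and matches_with_char are hypothetical, invented only for this illustration; they are not part of the question's code.

```c
#include <string.h>

/* Hypothetical helper: mimics the test used in the question.
   strcmp compares everything from text[i] up to the terminating '\0',
   so this is true only when text[i] is '.' AND it is the last character. */
int matches_with_strcmp(const char *text, size_t i)
{
    return strcmp(&text[i], ".") == 0;
}

/* Hypothetical helper: compares only the single character at index i. */
int matches_with_char(const char *text, size_t i)
{
    return text[i] == '.';
}
```

With the text ".ABC" and index 0, the strcmp version reports no match while the character version reports a match. With "ABC." and index 3, both report a match, which is why the original function only ever counted a terminator at the very end of the text.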
ぃ双果 2025-02-13 11:45:24

Using the function strcmp

if (strcmp(&text[i], "?") == 0 || strcmp(&text[i], "!") == 0 || strcmp(&text[i], ".") == 0)

does not make sense. The expression in the if statement evaluates to true only for the last character in the string text, and only if that last character is one of '?', '!' or '.'.

Instead, you should use the standard C functions strspn and strcspn. For example:

size_t count_sentences( string text )
{
    const char *delim = "?!.";

    size_t count = 0;

    for ( text += strcspn( text, delim ); *text != '\0'; text += strcspn( text, delim ) )
    {
        ++count;
        text += strspn( text, delim );
    }

    return count;
}

The function will return 1 for the string "Executing...", for example. Indeed, there is only one sentence even though there are three '.' characters.

Here is a demonstration program.

#include <stdio.h>
#include <string.h>

typedef char *string;

size_t count_sentences( string text )
{
    const char *delim = "?!.";

    size_t count = 0;

    for ( text += strcspn( text, delim ); *text != '\0'; text += strcspn( text, delim ) )
    {
        ++count;
        text += strspn( text, delim );
    }

    return count;
}

int main(void) 
{
    string text = "Strange... Why are you using strcmp?! Use strspn and strcspn!!!";

    printf( "The text\n\"%s\"\ncontains %zu sentences.\n",
        text, count_sentences( text ) );

    return 0;
}

The program output is

The text
"Strange... Why are you using strcmp?! Use strspn and strcspn!!!"
contains 3 sentences.
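
As a rough illustration of what the two calls in the loop above each contribute, the sketch below measures a single sentence. The helper first_sentence_length is hypothetical and exists only for this example: strcspn gives the number of characters before the first terminator, and strspn gives the length of the run of terminators that follows.

```c
#include <string.h>

/* Hypothetical helper: length of the first sentence in text,
   including the run of terminators that closes it. */
size_t first_sentence_length(const char *text)
{
    const char *delim = "?!.";

    /* Number of leading characters that are NOT terminators. */
    size_t body = strcspn(text, delim);

    /* Number of consecutive terminators right after the body. */
    size_t tail = strspn(text + body, delim);

    return body + tail;
}
```

For "Hi... Bye!" the body is 2 characters and the tail is 3, so the helper returns 5. For a text with no terminator at all, such as "Bye", the tail is 0 and the whole length is returned; this is the case where the loop in count_sentences stops without incrementing the count.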
晨敛清荷 2025-02-13 11:45:24

Your code does not work because &text[i] is not a one-character string, but a pointer to the part of the string in text starting at offset i. As you observed, only the last character is tested correctly.

Instead of comparing strings, you should compare individual characters, this way:

int count_sentences(const char *text) {
    int count = 0;
    for (int i = 0; text[i] != '\0'; i++) {
        if (text[i] == '?' || text[i] == '!' || text[i] == '.') {
            count += 1;
        }
    }
    return count;
}

Note, however, that this code will not work as expected for strings that do not end with a sentence separator, nor when multiple separators occur together: "hello", "Hello!!!", "Sorry...".

The trick to counting such sentences is to count the transitions from separators to non-separators. The same method works for counting words, lines, etc.

Here is a modified version:

int isterminator(int c) {
    return (c == '?' || c == '!' || c == '.');
}
        
int count_sentences(const char *text) {
    char last = '.';
    int count = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (isterminator(last) && !isterminator(text[i])) {
            count += 1;
        }
        last = text[i];
    }
    return count;
}
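
The same transition-counting idea carries over to words. The following is a minimal sketch, not part of the original answer: count_words is a hypothetical name, and it assumes a word is any run of characters that isspace does not classify as whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical example: count words by counting transitions
   from whitespace to non-whitespace. */
size_t count_words(const char *text)
{
    int last_was_space = 1;  /* treat the start of the text as following a separator */
    size_t count = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        /* Cast to unsigned char: isspace is undefined for negative char values. */
        int is_space = isspace((unsigned char)text[i]) != 0;

        if (last_was_space && !is_space) {
            count += 1;      /* a new word starts here */
        }
        last_was_space = is_space;
    }
    return count;
}
```

Starting with last_was_space set to 1 plays the same role as initializing last to '.' in count_sentences: the first non-separator character in the text counts as the start of a unit.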
分开我的手 2025-02-13 11:45:24
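#include <iostream>
#include <string>

using namespace std;

// C++ version using std::string: counts every '?', '!' and '.' in the text.
int count_sentences(const string &text)
{
    int count = 0;
    for (size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '?' || text[i] == '!' || text[i] == '.')
            ++count;
    }

    return count;
}

int main()
{
    string sentence = "abc. def? ghi!";
    cout << count_sentences(sentence) << endl;
}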
远山浅 2025-02-13 11:45:24

The pointer you pass in makes strcmp compare all the characters that follow it, not just the one it points to. Use text[index] == '?' to compare individual characters. The strcmp implementation from the link below shows why:

/* Compare S1 and S2, returning less than, equal to or
   greater than zero if S1 is lexicographically less than,
   equal to or greater than S2.  */
int
STRCMP (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);
  return c1 - c2;
}

https://code.woboq.org/userspace/glibc/string/strcmp.c.html
