C++: counting words with a map

Posted 2024-12-14 13:13:21

I am counting the number of times each word occurs in a text file. I want the count to ignore case, so I apply tolower to my input before counting. I have a map from string to int to keep the counts. However, when I output a word and its count, I don't want the word in lowercase; I want it in its original case. So for counting, all words should be lowercased, but in the output they should appear in their original case. Is there any way to achieve this using only one map?

Comments (5)

只是一片海 2024-12-21 13:13:21

The third template parameter of std::map is a comparator type. You can provide your own comparison operation, in your case a case-insensitive one.

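#include <algorithm>  // std::min
#include <cctype>     // std::tolower
#include <cstddef>    // std::size_t
#include <string>

struct CaseInsensitive {
  bool operator()(std::string const& left, std::string const& right) const {
    std::size_t const size = std::min(left.size(), right.size());

    for (std::size_t i = 0; i != size; ++i) {
      // std::tolower requires a value representable as unsigned char,
      // so cast first to avoid undefined behaviour for negative chars
      int const lowerLeft = std::tolower(static_cast<unsigned char>(left[i]));
      int const lowerRight = std::tolower(static_cast<unsigned char>(right[i]));

      if (lowerLeft < lowerRight) { return true; }
      if (lowerLeft > lowerRight) { return false; }

      // if equal? continue!
    }

    // same prefix? then we compare the length
    return left.size() < right.size();
  }
};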

Then instantiate your map:

typedef std::map<std::string, unsigned, CaseInsensitive> MyWordCountingMap;

Note: only the first spelling encountered is kept as the key; later spellings of the same word just increment its count (which seems to be fine for your use case).
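
To illustrate how such a map might be filled, here is a minimal sketch. The comparator and typedef repeat the ones from this answer; countWords is a hypothetical helper (not part of the original answer), and it assumes words are separated by whitespace with no punctuation handling.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Same ordering as the comparator in the answer: compare lowercased
// characters, then fall back to length when one word prefixes the other.
struct CaseInsensitive {
  bool operator()(std::string const& left, std::string const& right) const {
    std::size_t const size = std::min(left.size(), right.size());
    for (std::size_t i = 0; i != size; ++i) {
      int const lowerLeft = std::tolower(static_cast<unsigned char>(left[i]));
      int const lowerRight = std::tolower(static_cast<unsigned char>(right[i]));
      if (lowerLeft < lowerRight) { return true; }
      if (lowerLeft > lowerRight) { return false; }
    }
    return left.size() < right.size();
  }
};

typedef std::map<std::string, unsigned, CaseInsensitive> MyWordCountingMap;

// Hypothetical helper: counts whitespace-separated words read from `in`.
MyWordCountingMap countWords(std::istream& in) {
  MyWordCountingMap counts;
  std::string word;
  while (in >> word) {
    // operator[] inserts the word only if no equivalent key exists yet,
    // so the stored key keeps the spelling of the first occurrence.
    ++counts[word];
  }
  return counts;
}
```

Passing a std::ifstream opened on the text file (or a std::istringstream for testing) to countWords and iterating over the result yields each word in its first-seen spelling together with its case-insensitive count.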

時窥 2024-12-21 13:13:21

This should work. When a word appears with several different casings, the first casing encountered is the one stored in the map, not a lowercased version. The solution also uses only one map, as you wanted.

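#include <map>
#include <string>
#include <string.h>  // _stricmp is Microsoft-specific; POSIX offers strcasecmp in <strings.h>

using namespace std;

struct StrCaseInsensitive
{
    // const, so the comparator can also be called on a const map
    bool operator() (const string& left , const string& right ) const
    {
        return _stricmp( left.c_str() , right.c_str() ) < 0;
    }
};

int main(void)
{
    // string literals are const, so the array must hold const char*
    const char* input[] = { "Foo" , "bar" , "Bar" , "FOO" };
    std::map<string, int , StrCaseInsensitive> CountMap;

    for( int i = 0 ; i < 4; ++i )
    {
        CountMap[ input[i] ] += 1;  // ends up as { "bar": 2, "Foo": 2 }
    }
    return 0;
}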
猫卆 2024-12-21 13:13:21

You can use map<string, vector<string> >.

The key is the lowercase word. The value is the vector of all the given cases of this word.

(You can also use multimap<string, string>, which is basically the same, but I usually prefer a map of vectors.)

map<string, vector<string> > m;
m.size(); // number of distinct words, ignoring case
m["abc"].size(); // number of occurrences of the word "abc" in any casing
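
As a rough sketch of how this map could be populated: toLower and groupWords below are hypothetical helper names (not from the original answer), and the input is assumed to be already split into words.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns a lowercase copy of `word`.
std::string toLower(std::string word) {
  for (std::size_t i = 0; i != word.size(); ++i) {
    word[i] = static_cast<char>(
        std::tolower(static_cast<unsigned char>(word[i])));
  }
  return word;
}

typedef std::map<std::string, std::vector<std::string> > SpellingsByWord;

// Hypothetical helper: files every word under its lowercase form,
// keeping each original spelling in the vector.
SpellingsByWord groupWords(std::vector<std::string> const& words) {
  SpellingsByWord m;
  for (std::size_t i = 0; i != words.size(); ++i) {
    m[toLower(words[i])].push_back(words[i]);
  }
  return m;
}
```

The count for a word is then the size of its vector, and any element of the vector (for example the first) can be printed as the original spelling. Note that operator[] inserts an empty vector for a missing key, so find() is the safer choice for read-only lookups.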
分分钟 2024-12-21 13:13:21

What do you want to happen with different case variants of the same word?

One possibility is to use std::multiset with a caseless comparator as its Compare template parameter. In this case, all variants of each word will be preserved in the set. The number of occurrences of each word can be obtained via the set's count() member function.
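
A minimal sketch of the multiset idea follows. CaselessLess, WordBag and collectWords are illustrative names chosen here, not taken from the answer, and the comparator is just one possible caseless ordering built on std::lexicographical_compare.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One possible caseless comparator: lexicographic order on lowercased chars.
struct CaselessLess {
  static bool charLess(char a, char b) {
    return std::tolower(static_cast<unsigned char>(a)) <
           std::tolower(static_cast<unsigned char>(b));
  }
  bool operator()(std::string const& left, std::string const& right) const {
    return std::lexicographical_compare(left.begin(), left.end(),
                                        right.begin(), right.end(), charLess);
  }
};

typedef std::multiset<std::string, CaselessLess> WordBag;

// Hypothetical helper: stores every occurrence, original spelling intact.
WordBag collectWords(std::vector<std::string> const& words) {
  WordBag bag;
  for (std::size_t i = 0; i != words.size(); ++i) {
    bag.insert(words[i]);
  }
  return bag;
}
```

With this setup, bag.count("foo") returns the number of stored occurrences in any casing, and bag.equal_range("foo") gives access to the individual spellings. The trade-off is memory: every occurrence is stored, not just one entry per word.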

指尖上得阳光 2024-12-21 13:13:21

You can use a structure or std::pair to keep both the original case and the number of occurrences. Your type would then look like this: map < string, pair <string, int> >
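
One way this could look in practice is sketched below, assuming the map key is the lowercased word and the pair holds the first-seen spelling plus the count. toLower, CountsWithSpelling and countWords are hypothetical names introduced for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns a lowercase copy of `word`.
std::string toLower(std::string word) {
  for (std::size_t i = 0; i != word.size(); ++i) {
    word[i] = static_cast<char>(
        std::tolower(static_cast<unsigned char>(word[i])));
  }
  return word;
}

// key: lowercase word; value: (first original spelling, occurrence count)
typedef std::map<std::string, std::pair<std::string, int> > CountsWithSpelling;

// Hypothetical helper: counts words case-insensitively while remembering
// the spelling of the first occurrence.
CountsWithSpelling countWords(std::vector<std::string> const& words) {
  CountsWithSpelling counts;
  for (std::size_t i = 0; i != words.size(); ++i) {
    std::pair<std::string, int>& entry = counts[toLower(words[i])];
    if (entry.second == 0) {
      entry.first = words[i];  // first time this word is seen
    }
    ++entry.second;
  }
  return counts;
}
```

When printing, output the pair's first member (the original spelling) and second member (the count) instead of the lowercase key.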
