I am counting the number of times each word occurs in a text file. I want the count to ignore case, so I apply tolower to my input before counting. I keep the counts in a map from string to int. When I output a word and its count, however, I don't want the word in lowercase; I want it in its original case. So all words should be lowercased for counting but printed in their original case. Is there any way to achieve this using only one map?
Comments (5)
The third template parameter of std::map is a comparator type. You can provide your own comparison operation, in your case a case-insensitive one, and then instantiate your map with that comparator type.
Note: only the first spelling is preserved (which seems okay with you)
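The snippet that went with this answer is not shown here, so the following is only a minimal sketch of the idea, not the original code. The names CaseInsensitiveLess, WordCounts and count_words are made up for illustration, and the comparison assumes plain ASCII text.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical comparator: orders strings while ignoring ASCII case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// One map only: the comparator is passed as the third template parameter.
using WordCounts = std::map<std::string, int, CaseInsensitiveLess>;

// Counts whitespace-separated words. The stored key is the first
// spelling encountered; later spellings only bump the counter.
WordCounts count_words(const std::string& text) {
    WordCounts counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++counts[word];
    }
    return counts;
}
```

Because operator[] never replaces an existing key, "Hello hello HELLO" ends up as a single entry whose key is "Hello".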
This should work. If a word appears with several capitalizations, the map keeps the first spelling it saw rather than the lowercase form. The solution also uses only one map, as you wanted.
You can use map<string, vector<string> >. The key is the lowercase word. The value is the vector of all the given cases of this word.
(You can also use multimap<string, string>, which is basically the same, but I usually prefer a map of vectors.)
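A rough sketch of the map-of-vectors idea, in case it helps. The helper names to_lower, Spellings and collect_spellings are hypothetical, and lowercasing is ASCII-only here.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns an ASCII-lowercased copy of a word.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Key: lowercase word. Value: every spelling seen, in input order.
using Spellings = std::map<std::string, std::vector<std::string>>;

Spellings collect_spellings(const std::string& text) {
    Spellings result;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        result[to_lower(word)].push_back(word);
    }
    return result;
}
```

The count for a word is then just the size of its vector, and any of the stored spellings can be printed.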
What do you want to happen with different case variants of the same word?
One possibility is to use std::multiset with a caseless comparator as its Compare template parameter. In this case, all variants of each word will be preserved in the set. The number of occurrences of each word can be obtained via the count() member function of the set.
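A small sketch of the multiset variant, under the assumption that words are ASCII. CaselessLess, WordBag and collect_words are illustrative names, not anything from the question.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Hypothetical caseless comparator (ASCII only).
struct CaselessLess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Every occurrence is stored with its original spelling; spellings that
// differ only in case compare as equivalent and sit next to each other.
using WordBag = std::multiset<std::string, CaselessLess>;

WordBag collect_words(const std::string& text) {
    WordBag bag;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        bag.insert(word);
    }
    return bag;
}
```

With this, bag.count("tea") returns the number of occurrences regardless of case. Note that the set stores one element per occurrence, so it uses more memory than a map of counters.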
You can use a structure or std::pair to keep both the original case and the number of occurrences. Your type would then look like this:
map < string, pair <string, int> >
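Something along these lines, as a sketch only. The names to_lower, Entry, WordTable and count_words are invented for the example, and the first spelling seen is the one that gets kept.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: returns an ASCII-lowercased copy of a word.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Key: lowercase word. Value: (first original spelling, count).
using Entry = std::pair<std::string, int>;
using WordTable = std::map<std::string, Entry>;

WordTable count_words(const std::string& text) {
    WordTable table;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        // emplace leaves an existing entry untouched, so the first
        // spelling is kept and only the counter changes afterwards.
        auto it = table.emplace(to_lower(word), Entry(word, 0)).first;
        ++it->second.second;
    }
    return table;
}
```

When printing, use the first member of the pair for the word and the second for its count.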