Need help in C++ using a map to keep track of words in an input file
Let's say I have a text file containing:
today is today but
tomorrow is today tomorrow
Then, using a map, how can I keep track of the words that are repeated, and of the line on which each repetition occurs?
So far I read each string in the file into a temporary and store it in the following way:
map<string,int> storage;
string line;
int count = 1; // for the first line of the file
if(infile.is_open()){
    while( !infile.eof() ){
        getline(infile, line);
        istringstream my_string(line);
        while(my_string.good()){
            string temp;
            my_string >> temp;
            storage[temp] = count;
        }
        count++; // so that every string read from the next line is recorded with that line number
    }
}
map<string,int>::iterator m;
for(m = storage.begin(); m != storage.end(); m++){
    out<<m->first<<": "<<"line "<<m->second<<endl;
}
Right now the output is just:
but: line 1
is: line 2
today: line 2
tomorrow: line 2
But instead it should print out (with no repeated strings):
today: line 1 occurred 2 times, line 2 occurred 1 time.
is: line 1 occurred 1 time, line 2 occurred 1 time.
but: line 1 occurred 1 time.
tomorrow: line 2 occurred 2 times.
Note: the order of the strings does not matter.
Any help would be appreciated. Thanks.
Comments (4)
A map stores (key, value) pairs with unique keys. That means that if you assign to the same key more than once, only the last value you assigned is kept.
It sounds like, instead of storing the line as the value, you want to store another map of line -> occurrences.
So you could make your map a map<string, map<int,int> >.
Then to insert, increment storage[temp][count] for each word you read.
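The nested-map idea can be sketched roughly as follows. This is only an illustration under some assumptions: the names WordLines, countWords and describe are made up for the example, and the wording of the output follows the format asked for in the question.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// word -> (line number -> occurrences on that line)
// WordLines is a hypothetical name chosen for this sketch.
typedef std::map<std::string, std::map<int, int> > WordLines;

// Reads every line from `in` and counts each word per line.
WordLines countWords(std::istream& in) {
    WordLines storage;
    std::string line;
    int lineNo = 1;  // the first line of the file is line 1
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            // operator[] creates a missing entry with value 0, then we bump it
            ++storage[word][lineNo];
        }
        ++lineNo;
    }
    return storage;
}

// Formats one word's entry in the style requested in the question.
std::string describe(const std::string& word, const std::map<int, int>& lines) {
    std::ostringstream out;
    out << word << ": ";
    for (std::map<int, int>::const_iterator it = lines.begin(); it != lines.end(); ++it) {
        if (it != lines.begin()) out << ", ";
        out << "line " << it->first << " occurred " << it->second
            << (it->second == 1 ? " time" : " times");
    }
    out << ".";
    return out.str();
}
```

To print the report you would iterate over the outer map and write describe(it->first, it->second) for each entry. Since std::map keeps its keys sorted, the words come out alphabetically, which is fine because the question says the order does not matter.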
You're trying to get two items of information out of the collection when you only store one item of information in it.
The easiest way to extend your current implementation is to store a struct instead of an int.
So instead of:
you'd do:
where the map is defined:
print the results using:
Edit: use a typedef for the definitions, e.g.:
then
MYMAP::iterator iter;
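One possible reading of the struct-plus-typedef suggestion is sketched below. The layout of the struct is an assumption: WordInfo, its members, buildIndex and repeatedWords are hypothetical names invented for this example, and only MYMAP comes from the answer.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record holding everything we want to report about one word.
struct WordInfo {
    int total;                   // occurrences across the whole file
    std::map<int, int> perLine;  // line number -> occurrences on that line
    WordInfo() : total(0) {}
};

// The typedef keeps the iterator declarations short.
typedef std::map<std::string, WordInfo> MYMAP;

// Fills the map from a stream, one WordInfo per distinct word.
MYMAP buildIndex(std::istream& in) {
    MYMAP storage;
    std::string line;
    int lineNo = 1;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            WordInfo& info = storage[word];  // created on first use
            ++info.total;
            ++info.perLine[lineNo];
        }
        ++lineNo;
    }
    return storage;
}

// Returns the words that occur more than once in the whole input.
std::vector<std::string> repeatedWords(const MYMAP& storage) {
    std::vector<std::string> result;
    MYMAP::const_iterator iter;
    for (iter = storage.begin(); iter != storage.end(); ++iter) {
        if (iter->second.total > 1) result.push_back(iter->first);
    }
    return result;
}
```

Keeping a running total next to the per-line counts is a design choice for this sketch: it makes "which words are repeated" a simple check at the cost of storing a number that could also be recomputed from perLine.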
Your storage data type is insufficient to store all the information you want to report. You could get there by using a vector for count storage but you'd have to do a lot of book-keeping to make sure you actually insert a 0 when a word is not encountered and create the vector with the right size when a new word is encountered. Not a trivial task.
You could switch your count part to a map of numbers, first being line and second being count... That would reduce the complexity of your code but wouldn't exactly be the most efficient method.
At any rate, you can't do what you need to do with just a std::map<string,int>.
Edit: just thought of an alternative version that would be easier to generate but harder to report with: std::vector< std::map<std::string, unsigned int> >. For each new line in a file you'd generate a new map<string,int> and push it onto the vector. You could create a helper type set<string> to contain all the words that appear in a file to use in your reporting.
That's probably how I'd do it anyway except I'd encapsulate all that crap in a class so that I'd just do something like:
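A rough sketch of such a class is given below, assuming the vector-of-maps layout described above. The class name WordTracker and its member functions are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper around std::vector< std::map<std::string, unsigned int> >.
class WordTracker {
public:
    // Counts the words of one line and stores them as a new per-line map.
    void addLine(const std::string& line) {
        std::map<std::string, unsigned int> counts;
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            ++counts[word];
            allWords_.insert(word);  // helper set used for reporting
        }
        lines_.push_back(counts);
    }

    // Occurrences of `word` on the 1-based line `lineNo`; 0 if absent or out of range.
    unsigned int count(const std::string& word, std::size_t lineNo) const {
        if (lineNo == 0 || lineNo > lines_.size()) return 0;
        const std::map<std::string, unsigned int>& m = lines_[lineNo - 1];
        std::map<std::string, unsigned int>::const_iterator it = m.find(word);
        return it == m.end() ? 0 : it->second;
    }

    // Every distinct word seen so far, in sorted order.
    const std::set<std::string>& words() const { return allWords_; }

    // Number of lines added so far.
    std::size_t lineCount() const { return lines_.size(); }

private:
    std::vector<std::map<std::string, unsigned int> > lines_;
    std::set<std::string> allWords_;
};
```

A report would then loop over words() and, for each word, over line numbers 1 to lineCount(), printing only the lines where count(word, lineNo) is non-zero. The find() lookup in count() avoids inserting zero entries, which is the bookkeeping burden mentioned above.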
Apart from anything else, your loops are all wrong. You should never loop on the eof or good flags, but on the success of the read operation. You want something like:
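A minimal sketch of loops driven by the success of each read is shown below. The function readWords and its return type are assumptions made so the example is self-contained; the variable names my_string and temp are taken from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Collects (line number, word) pairs; readWords is a hypothetical helper.
std::vector<std::pair<int, std::string> > readWords(std::istream& infile) {
    std::vector<std::pair<int, std::string> > result;
    std::string line;
    int lineNo = 1;
    // The loop body runs only if getline actually read a line.
    while (std::getline(infile, line)) {
        std::istringstream my_string(line);
        std::string temp;
        // Likewise, the body runs only if a word was actually extracted.
        while (my_string >> temp) {
            result.push_back(std::make_pair(lineNo, temp));
        }
        ++lineNo;
    }
    return result;
}
```

Testing eof() or good() before the read can let the body run once more after the last successful read, so a stale or empty string gets processed. Putting the read itself in the loop condition avoids that.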