Finding characters of a string in a vector

Posted 2024-08-29 05:30:47

Judging from the title, I kinda did my program in a fairly complicated way. BUT! I might as well ask anyway xD

This is a simple program I did in response to question 3-3 of Accelerated C++, which is an awesome book in my opinion.

I created a vector:

vector<string> countEm;

That accepts all valid strings. Therefore, I have a vector that contains elements of strings.

Next, I created a function

void toLowerWords( vector<string> &vec )
{
    // Lowercase every string in the vector, in place
    for( size_t loop = 0; loop < vec.size(); loop++ )
        transform( vec[loop].begin(), vec[loop].end(),
            vec[loop].begin(), ::tolower );
}

that converts the input to all lowercase characters for easier counting. So far, so good.

I created a third and final function to actually count the words, and that's where I'm stuck.

int counter( vector<string> &vec )
{

for( int loop = 0; loop < vec.size(); loop++ )
    for( int secLoop = 0; secLoop < vec[loop].size(); secLoop++ )
    {
        if( vec[loop][secLoop] == ' ' )

That just looks ridiculous: indexing the vector like a two-dimensional array to walk through each string's characters until I find a space. I don't believe this is an elegant or even viable solution. If it were viable, I would then backtrack from the space, copy all the characters I'd found into a separate vector, and count those.

My question, then, is: how can I dissect a vector of strings into separate words so that I can actually count them? I thought about using strchr, but it didn't give me any epiphanies.

Solution via Neil:

stringstream ss( input );
while( ss >> buffer )
    countEm.push_back( buffer );

From that I could easily count the (recurring) words.

Then I did a solution via Wilhelm that I will post once I re-write it since I accidentally deleted that solution! Stupid of me, but I will post that once I have it written again ^^

I want to thank all of you for your input! The solutions worked and I've become a slightly better programmer. If I could vote up your stuff, then I would :P Once I can, I will! And thanks again!

您的好友蓝忘机已上羡 2024-09-05 05:30:47

If the words are always space separated, the easiest way to split them is to use a stringstream:

string words = ....   // populate
istringstream is( words );

string word;
while( is >> word ) {
   cout << "word is " << word << endl;
}

You'd want to write a function to do this, of course, and apply it to your strings. Or it may be better not to store the strings at all, but to split into words on initial input.


幸福不弃 2024-09-05 05:30:47

You can use std::istringstream to extract the words one by one and count them, but that approach uses O(n) extra space. The words can instead be counted in place:

string text("So many words!");
size_t count =  0;
for( size_t pos(text.find_first_not_of(" \t\n"));
    pos != string::npos;
    pos = text.find_first_not_of(" \t\n", text.find_first_of(" \t\n", ++pos)) )
    ++count;

Perhaps not as short as Neil's solution, but it needs no extra space or allocation beyond what's already in use.


夜血缘 2024-09-05 05:30:47

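Use a tokenizer, such as the one listed here in section 7.3, to split the strings in your vector into single words (or rewrite it so that it just returns the number of tokens), and loop over your vector to count the total number of tokens you encounter.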


旧人 2024-09-05 05:30:47

Since C++11 there is a dedicated and very powerful iterator for iterating over patterns (for example, words) in a string: std::sregex_token_iterator.

With that and the iterator function std::distance, we can count all words (or other patterns) in a string simply by calculating the distance between the first match and the end iterator.

The body of the resulting program is essentially a one-liner:

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>

const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};

int main()
{
    std::cout << std::distance(std::sregex_token_iterator(test.begin(), test.end(), re), {});
}

With this method, we can of course also split the string and show the resulting words:

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>

const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};

int main()
{
    std::copy(std::sregex_token_iterator(test.begin(), test.end(), re), {}, std::ostream_iterator<std::string>(std::cout, "\n"));
}

By using std::vector's range constructor, we can also store the words in a std::vector:

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>
#include <vector>

const std::regex re{R"(\w+)"};
const std::string test{"the quick brown fox jumps over the lazy dog"};

int main()
{
    std::vector<std::string> words(std::sregex_token_iterator(test.begin(), test.end(), re), {});
    
    std::cout << words.size();
}

As you can see, there are many possibilities.

If you have a stream, you can use std::istream_iterator for the same purpose.
