C++：如何构建两个空格分隔字符串的交集字符串？

发布于 2024-11-28 22:37:11 字数 214 浏览 4 评论 0原文

我有两个空格分隔的字符串...（X 并不意味着相同的符号）

st1 = "abc def kok...."
st2 = "kok bbr def ffe ...."

我想构造一个交集字符串，如下所示： common = "kok def"

在 C++ 中执行此操作的有效方法是什么？

谢谢

原文

I have two space separated strings... (the X doesn't mean the same symbol)

st1 = "abc def kok...."
st2 = "kok bbr def ffe ...."

i would like to construct an intersection string as follows:
common = "kok def"

what is the efficient way to do so in c++?

Thanks

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

几度春秋 2024-12-05 22:37:11

使用 std::set_intersection

示例程序：

I'我假设您已经对您的字符串进行标记（这个解决方案似乎很容易实现）。

// Data
std::vector<string> a,b;
a.push_back("abc");b.push_back("kok");
a.push_back("def");b.push_back("bbr");
a.push_back("kok");b.push_back("def");
a.push_back("foo");b.push_back("ffe");

// Allocate space for intersection
std::vector<string> v(a.size()+b.size());

// Sort as required by set_intersection
std::sort(a.begin(),a.end());
std::sort(b.begin(),b.end());
// Compute
std::vector<string>::iterator it = std::set_intersection(a.begin(),a.end(),b.begin(),b.end(),v.begin());

// Display
v.erase(it,v.end());
for(std::vector<string>::iterator it = v.begin();it < v.end(); ++it) std::cout<<*it<<std::endl;

就令牌（或子字符串）的数量而言，复杂度应为O(n log n)。

Use std::set_intersection

Sample program:

I'm assuming you've tokenized your strings already (this solution seems easy to implement).

// Data
std::vector<string> a,b;
a.push_back("abc");b.push_back("kok");
a.push_back("def");b.push_back("bbr");
a.push_back("kok");b.push_back("def");
a.push_back("foo");b.push_back("ffe");

// Allocate space for intersection
std::vector<string> v(a.size()+b.size());

// Sort as required by set_intersection
std::sort(a.begin(),a.end());
std::sort(b.begin(),b.end());
// Compute
std::vector<string>::iterator it = std::set_intersection(a.begin(),a.end(),b.begin(),b.end(),v.begin());

// Display
v.erase(it,v.end());
for(std::vector<string>::iterator it = v.begin();it < v.end(); ++it) std::cout<<*it<<std::endl;

Complexity should be O(n log n) in the number of tokens (or sub-strings).

回复收藏 0 原文

拒绝两难 2024-12-05 22:37:11

将 st1 拆分为子字符串，并将它们全部放入 std::set
将 st2 拆分为子字符串，并检查它们是否存在于在步骤 1 中创建的集合。

这将提供 O(n log n) 执行时间。您必须将两个字符串恰好循环一次。对于每个元素，从集合中插入和检索的时间通常为 O(log n)，即 O(n log n)。

如果您可以使用具有 O(1) 插入和检索复杂度的基于哈希的集合（或其他一些无序集合），您将把复杂度降低到 O(n)。

回复收藏 0 原文

猛虎独行 2024-12-05 22:37:11

为了扩展您已经得到的答案，基本上有两个您没有指定的因素需要考虑。首先，如果输入中存在重复元素，您是否希望将这些元素考虑用于输出。例如，给定如下输入：

st1 = "kok abc def kok...."
st2 = "kok bbr kok def ffe ...."

由于“kok”在两个输入中出现两次，“kok”应该在输出中出现一次还是两次？

第二个是你的使用模式。您是否有一种读取所有输入，然后生成单个输出的模式，或者是一种更具迭代性的模式，您可以读取一些输入，生成一个输出，读取添加到前一个输入的更多输入，生成另一个输出，等等在？

如果您要读取所有输入，然后生成一个输出，您可能需要使用 std::vector 后跟 std::sort。如果您只希望每个输入在输出中只出现一次，无论它在两个输入中出现的频率如何，那么您将遵循 std::unique ，最后执行您的 设置交集。

如果您想支持迭代更新，那么您可能需要使用 std::set 或 std::multiset （std::set 用于每个输出都是唯一的，std::multiset（如果重复输入应该给出重复的结果）。

编辑：基于输入中缺乏重复，一个非常快速简单的实现将类似于：

#include <string>
#include <set>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <iostream>

int main() {   
    std::string st1("abc def kok");
    std::string st2("kok bbr def ffe");

    std::istringstream s1(st1);
    std::istringstream s2(st2);

    // Initialize stringstreams. Whine about most vexing parse.
    std::set<std::string> words1((std::istream_iterator<std::string>(s1)), 
                                 std::istream_iterator<std::string>());

    std::set<std::string> words2((std::istream_iterator<std::string>(s2)), 
                                 std::istream_iterator<std::string>());

    std::ostringstream common;

    // put the intersection into common:
    std::set_intersection(words1.begin(), words1.end(), 
                          words2.begin(), words2.end(),
                          std::ostream_iterator<std::string>(common, " "));

    std::cout << common.str();  // show the result.
    return 0;
}

To expand a bit on the answers you've already gotten, there are basically two factors to consider that you haven't specified. First, if there are duplicate elements in the input, do you want those considered for the output. For example, given input like:

st1 = "kok abc def kok...."
st2 = "kok bbr kok def ffe ...."

Since "kok" appears twice in both inputs, should "kok" appear once in the output or twice?

The second is your usage pattern. Do you have a pattern of reading all the input, then generating a single output, or is a more iterative, where you might read some input, generate an output, read some more input that's added to the previous, generate another output, and so on?

If you're going read all the input, then generate one output, you probably want to use std::vector followed by std::sort. If you only want each input to appear only once in the output, regardless of how often it appears in both inputs, then you'd follow that by std::unique, and finally do your set_intersection.

If you want to support iterative updates, then you probably want to use std::set or std::multiset (std::set for each output to be unique, std::multiset if repeated inputs should give repeated results).

Edit: based on the lack of duplication in the input, a really quick simple implementation would be something like:

#include <string>
#include <set>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <iostream>

int main() {   
    std::string st1("abc def kok");
    std::string st2("kok bbr def ffe");

    std::istringstream s1(st1);
    std::istringstream s2(st2);

    // Initialize stringstreams. Whine about most vexing parse.
    std::set<std::string> words1((std::istream_iterator<std::string>(s1)), 
                                 std::istream_iterator<std::string>());

    std::set<std::string> words2((std::istream_iterator<std::string>(s2)), 
                                 std::istream_iterator<std::string>());

    std::ostringstream common;

    // put the intersection into common:
    std::set_intersection(words1.begin(), words1.end(), 
                          words2.begin(), words2.end(),
                          std::ostream_iterator<std::string>(common, " "));

    std::cout << common.str();  // show the result.
    return 0;
}

回复收藏 0 原文

~没有更多了~