`std::set` 有什么问题吗?
在另一个主题中,我试图解决这个问题。问题是从 std::string
中删除重复的字符。
std::string s= "saaangeetha";
由于顺序并不重要,所以我先对 s
进行排序,然后使用 std::unique
最后调整大小以获得 所需结果:
aeghnst
这是正确的!
现在我想做同样的事情,但同时我希望字符的顺序保持不变。意思是,我想要这个输出:
sangeth
所以我写了 this:
template<typename T>
struct is_repeated
{
std::set<T> unique;
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>()), s.end());
std::cout << s ;
}
给出了这个输出:
saangeth
也就是说, a
被重复,尽管其他重复消失了。代码有什么问题吗?
无论如何,我稍微更改了我的代码:(请参阅评论)
template<typename T>
struct is_repeated
{
std::set<T> & unique; //made reference!
is_repeated(std::set<T> &s) : unique(s) {} //added line!
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
std::set<char> set; //added line!
s.erase(std::remove_if(s.begin(),s.end(),is_repeated<char>(set)),s.end());
std::cout << s ;
}
输出:
sangeth
问题消失了!
那么第一个解决方案有什么问题呢?
另外,如果我没有将成员变量设为unique
引用类型,那么问题就不会出现去。
std::set
或 is_repeated
函子有什么问题?到底问题出在哪里呢?
我还注意到,如果将 is_repeated
函子复制到某处,那么它的每个成员也会被复制。我没有看到这里的问题!
In the other topic I was trying to solve this problem. The problem was to remove duplicate characters from a std::string
.
std::string s= "saaangeetha";
Since the order was not important, so I sorted s
first, and then used std::unique
and finally resized it to get the desired result:
aeghnst
That is correct!
Now I want to do the same, but at the same time I want the order of characters intact. Means, I want this output:
sangeth
So I wrote this:
template<typename T>
struct is_repeated
{
std::set<T> unique;
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
s.erase(std::remove_if(s.begin(), s.end(), is_repeated<char>()), s.end());
std::cout << s ;
}
Which gives this output:
saangeth
That is, a
is repeated, though other repetitions gone. What is wrong with the code?
Anyway I change my code a bit: (see the comment)
template<typename T>
struct is_repeated
{
std::set<T> & unique; //made reference!
is_repeated(std::set<T> &s) : unique(s) {} //added line!
bool operator()(T c) { return !unique.insert(c).second; }
};
int main() {
std::string s= "saaangeetha";
std::set<char> set; //added line!
s.erase(std::remove_if(s.begin(),s.end(),is_repeated<char>(set)),s.end());
std::cout << s ;
}
Output:
sangeth
Problem gone!
So what is wrong with the first solution?
Also, if I don't make the member variable unique
reference type, then the problem doesn't go.
What is wrong with std::set
or is_repeated
functor? Where exactly is the problem?
I also note that if the is_repeated
functor is copied somewhere, then every member of it is also copied. I don't see the problem here!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
函子类应该是纯函数并且没有自己的状态。请参阅 Scott Meyer 的 Effective STL 书中的第 39 项以获得对此的详细解释。但其要点是,您的函子类可能会在算法内被复制 1 次或多次。
Functor classes should be pure functions and have no state of their own. See item 39 in Scott Meyer's Effective STL book for a good explanation on this. But the gist of it is that your functor class may be copied 1 or more times inside the algorithm.
其他答案是正确的,因为问题是您使用的函子不是可复制安全的。特别是,gcc (4.2) 附带的 STL 将
std::remove_if
实现为std::find_if
的组合,以定位要删除的第一个元素,后跟 < code>std::remove_copy_if 来完成操作。[1] 中的副本意味着找到的第一个元素被添加到函子的副本中,这意味着第一个“a”将被遗忘。函子也在 [2] 中被复制,如果不是的话也没关系,因为该副本的原始函子是一个空函子。
The other answers are correct, in that the issue is that the functor that you are using is not copyable safe. In particular, the STL that comes with gcc (4.2) implements
std::remove_if
as a combination ofstd::find_if
to locate the first element to delete followed by astd::remove_copy_if
to complete the operation.The copy in [1] means that the first element found is added to the copy of the functor and that means that the first 'a' will be lost in oblivion. The functor is also copied in [2], and that would be fine if it were not because the original for that copy is an empty functor.
根据
remove_if
的实现,可以复制您的谓词。重构函子并使其无状态,或者使用 Boost.Ref 到“用于传递对函数模板(算法)的引用,这些函数模板(算法)通常会获取其参数的副本”,如下所示:Depending on the implementation of
remove_if
can make copies of your predicate. Either refactor your functor and make it stateless or use Boost.Ref to "for passing references to function templates (algorithms) that would usually take copies of their arguments", like so:函子的设计方式应该使函子的副本与原始函子相同。也就是说,如果您复制一个函子,然后执行一系列操作,则无论您使用哪个函子,或者即使您交错两个函子,结果都应该是相同的。这使得 STL 实现能够灵活地复制函子并根据需要传递它们。
对于您的第一个函子,此声明不成立,因为如果我复制您的函子然后调用它,您对其存储集所做的更改不会反映在原始函子中,因此副本和原始函子的性能会有所不同。同样,如果您采用第二个函子并使其不通过引用存储其集合,则函子的两个副本的行为将不相同。
不过,函子的最终版本之所以有效,是因为集合是通过引用存储的这一事实意味着任何数量的 tue 函子副本都将彼此表现相同。
希望这有帮助!
Functors are supposed to be designed in a way where a copy of a functor is identical to the original functor. That is, if you make a copy of one functor and then perform a sequence of operations, the result should be the same no matter which functor you use, or even if you interleave the two functors. This gives the STL implementation the flexibility to copy functors and pass them around as it sees fit.
With your first functor, this claim does not hold because if I copy your functor and then call it, the changes you make to its stored set do not reflect in the original functor, so the copy and the original will perform differently. Similarly, if you take your second functor and make it not store its set by reference, the two copies of the functor will not behave identically.
The reason that your final version of the functor works, though, is because the fact that the set is stored by reference means that any number of copies of tue functor will behave identically to one another.
Hope this helps!
在 GCC (libstdc++) 中,
remove_if< /code> 本质上实现为
注意,您的谓词按值传递给
find_if
,因此结构以及集合在find_if
内进行修改code> 不会传播回调用者。由于第一个重复项出现在:
初始的“sa”将在
find_if
调用后保留。同时,谓词集为空(find_if
中的插入是本地的)。因此之后的循环将保留第三个a
。In GCC (libstdc++),
remove_if
is implemented essentially asNote that your predicate is passed by-value to
find_if
, so the struct, and therefore the set, modified insidefind_if
will not be propagated back to caller.Since the first duplicate appears at:
The initial
"sa"
will be kept after thefind_if
call. Meanwhile, thepredicate
's set is empty (the insertions withinfind_if
are local). Therefore the loop afterwards will keep the 3rda
.并不是真正的答案,但作为另一个需要考虑的有趣的花絮,这确实有效,即使它使用原始函子:
编辑:我不认为它会影响此行为,但我还纠正了函子中的一个小错误(显然,operator() 应该采用 T 类型的参数,而不是
char
)。Not really an answer, but as another interesting tidbit to consider, this does work, even though it uses the original functor:
Edit: I don't think it affects this behavior, but I've also corrected a minor slip in your functor (operator() should apparently take a parameter of type T, not
char
).我认为问题可能在于
is_repeated
函子被复制到std::remove_if
实现内部的某个位置。如果是这种情况,则使用默认的复制构造函数,然后调用std::set
复制构造函数。您最终会得到两个可能独立使用的is_repeated
函子。然而,由于它们中的集合是不同的对象,因此它们看不到相互的变化。如果将字段is_repeated::unique
转换为引用,则复制的函子仍使用原始集,这就是本例中您想要的。I suppose the problem could lie in that the
is_repeated
functor is copied somewhere inside the implementation ofstd::remove_if
. If that is the case, the default copy constructor is used and this in turn callsstd::set
copy constructor. You end up with twois_repeated
functors possibly used independently. However as the sets in both of them are distinct objects, they don't see the mutual changes. If you turn the fieldis_repeated::unique
to a reference, then the copied functor still uses the original set which is what you want in this case.