Extending STL containers via composition or free functions?

Posted on 2024-10-31 17:05:47

Say I need a new type in my application, that consists of a std::vector<int> extended by a single function. The straightforward way would be composition (due to limitations in inheritance of STL containers):

#include <vector>

class A {
    public:
        // Copies the caller's vector into vec_.
        A(const std::vector<int> & vec) : vec_(vec) {}
        int hash() const;
    private:
        std::vector<int> vec_;
};

This requires the user to first construct a vector<int>, which is then copied in the constructor. That is bad when we are going to handle a sizeable number of large vectors. One could, of course, write a pass-through to push_back(), but this introduces mutable state, which I would like to avoid.

So it seems to me that we can either avoid copies or keep A immutable, but not both. Is this correct?

If so, the simplest (and efficiency-wise equivalent) way would be to use a typedef and free functions at namespace scope:

namespace N {
typedef std::vector<int> A;
int a_hash(const A & a);
}

This just feels wrong somehow, since future extensions will "pollute" the namespace. Also, calling a_hash(...) on any vector<int> is possible, which might lead to unexpected results (assuming that we impose constraints on A that the user has to follow, or that would otherwise be enforced in the first example).

My two questions are:

How can one avoid sacrificing either immutability or efficiency when using the above class code?
When does it make sense to use free functions as opposed to encapsulation in classes/structs?

Thank you!

梦里的微风 2024-11-07 17:05:47

Hashing is an algorithm, not a type, and probably shouldn't be restricted to data in any particular container type either. If you want to provide hashing, it probably makes the most sense to create a functor that computes a hash one element (int, as you've written things above) at a time, then use std::accumulate or std::for_each to apply that to a collection:

#include <algorithm>  // std::for_each

namespace whatever { 
struct hasher { 
    int current_hash;

    hasher() : current_hash(0x1234) {}

    // incredibly simplistic hash: just XOR the values together.
    void operator()(int new_val) { current_hash ^= new_val; }
    // lets the functor returned by std::for_each convert to the final hash
    operator int() const { return current_hash; }
};
}

// coll is any container of int
int hash = std::for_each(coll.begin(), coll.end(), whatever::hasher());

Note that this allows coll to be a vector or a deque, or you can use a pair of istream_iterators to hash data in a file...
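The answer also mentions std::accumulate as an option. A minimal sketch of that variant follows; the namespace sketch, the function hash_range, and example_usage are hypothetical names chosen for illustration, and the XOR fold with seed 0x1234 simply mirrors the placeholder hash above rather than being a recommended hash function.

```cpp
#include <cassert>
#include <deque>
#include <iterator>
#include <numeric>
#include <sstream>
#include <vector>

namespace sketch {  // hypothetical namespace, not from the answer

// XOR-folds any input-iterator range of ints, starting from the same
// arbitrary seed (0x1234) that the answer's functor uses.
template <typename InputIt>
int hash_range(InputIt first, InputIt last) {
    return std::accumulate(first, last, 0x1234,
                           [](int acc, int value) { return acc ^ value; });
}

}  // namespace sketch

// Usage: the same function serves a vector, a deque, and a stream.
inline void example_usage() {
    std::vector<int> v{1, 2, 4};
    std::deque<int> d{4, 2, 1};
    std::istringstream in("1 2 4");

    const int from_vector = sketch::hash_range(v.begin(), v.end());
    const int from_deque = sketch::hash_range(d.begin(), d.end());
    const int from_stream = sketch::hash_range(
        std::istream_iterator<int>(in), std::istream_iterator<int>());

    // XOR is order-independent, so all three results agree.
    assert(from_vector == from_deque);
    assert(from_vector == from_stream);
}
```

Because the function only needs a pair of input iterators, nothing ties it to std::vector, which is the point the answer makes about hashing being an algorithm rather than a type.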

三生路 2024-11-07 17:05:47

Regarding immutability: You could use the range constructor of vector and create an input iterator to provide the content for the vector. The range constructor is just:

template <typename I>
A::A(I const &begin, I const &end) : vec_(begin, end) {}

The generator is a bit trickier. If you currently have a loop that constructs a vector using push_back, it takes quite a bit of rewriting to convert it into an object that returns one item at a time from a method. Then you need to wrap a reference to it in a valid input iterator.
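To make the range-constructor idea concrete, here is a minimal sketch of an A with no mutating members. The by-value constructor combined with std::move is an addition that is not part of this answer: it assumes C++11 and lets a caller hand over an already built vector without an element-wise copy. The accessors size() and values(), the XOR placeholder hash, and example_usage are likewise assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class A {
public:
    // Takes the vector by value: a caller passing an rvalue (a temporary
    // or std::move(v)) pays for moves, not for copying the elements.
    explicit A(std::vector<int> vec) : vec_(std::move(vec)) {}

    // Range constructor, as in the answer: fills vec_ from an iterator pair.
    template <typename It>
    A(It first, It last) : vec_(first, last) {}

    // Read-only interface: no member function modifies vec_ after construction.
    std::size_t size() const { return vec_.size(); }
    const std::vector<int>& values() const { return vec_; }

    int hash() const {
        int h = 0x1234;  // arbitrary seed; XOR is only a placeholder hash
        for (int x : vec_) h ^= x;
        return h;
    }

private:
    std::vector<int> vec_;
};

inline void example_usage() {
    std::vector<int> big{1, 2, 4};
    A moved(std::move(big));  // contents are handed over, not copied

    int raw[] = {1, 2, 4};
    A ranged(raw, raw + 3);   // filled directly from a range

    assert(moved.size() == 3);
    assert(moved.hash() == ranged.hash());
}
```

With this shape the caller can still build the data with push_back in a local vector and then give it up, so the mutable phase stays outside A.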

Regarding free functions: Due to overloading, polluting the namespace is usually not a problem, because the symbol will only be considered for a call with the specific argument type.

Free functions also take part in argument-dependent lookup. That means the function should be placed in the namespace the class is in. For example:

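#include <vector>
// Illustration only: adding declarations to namespace std is undefined
// behavior according to the C++ standard, and since C++11 the name
// clashes with the std::hash class template.
namespace std {
    int hash(vector<int> const &vec) { /*...*/ }
}
//...
std::vector<int> v;
//...
hash(v);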

Now you can still call hash unqualified, but it won't be considered for any other purpose unless you write using namespace std (I personally almost never do that; I either just use the std:: prefix or write using std::vector to get only the symbol I want). Unfortunately, I am not sure how argument-dependent lookup works with a typedef in another namespace.
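On the typedef question: under the standard's lookup rules, typedef names do not contribute to the set of associated namespaces, so a typedef of std::vector<int> declared in N still has only std as its associated namespace. The sketch below contrasts that with a wrapper type declared in N. The names IntList, hash_of, Alias, alias_hash, and example_usage are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

namespace N {

// Hypothetical wrapper type. Because it is declared in N, N is an
// associated namespace for argument-dependent lookup on IntList.
struct IntList {
    std::vector<int> values;
};

// Free function placed in the same namespace as the type it operates on.
inline int hash_of(const IntList& list) {
    int h = 0x1234;  // arbitrary seed; XOR is only a placeholder hash
    for (int x : list.values) h ^= x;
    return h;
}

// A typedef does not make N an associated namespace: the argument type
// is still std::vector<int>, whose associated namespace is std.
typedef std::vector<int> Alias;

inline int alias_hash(const Alias& a) {
    int h = 0x1234;
    for (int x : a) h ^= x;
    return h;
}

}  // namespace N

inline void example_usage() {
    N::IntList list{{1, 2, 4}};
    N::Alias plain{1, 2, 4};

    // Unqualified call: found through ADL because IntList lives in N.
    assert(hash_of(list) == N::alias_hash(plain));

    // alias_hash(plain) without the N:: prefix would not compile here:
    // lookup only searches std for a std::vector<int> argument.
}
```

This also addresses the concern that a_hash could be called on any vector<int>: a distinct wrapper type restricts the overload to values that were deliberately constructed as that type.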

In many template algorithms, free functions (often with fairly generic names) are used instead of methods, because they can be added to existing classes, can be defined for primitive types, or both.

黑白记忆 2024-11-07 17:05:47

One simple solution is to declare the private member variable as a reference and initialize it in the constructor. This approach introduces some limitations, but it's a good alternative in most cases.

class A {
    public:
        A(std::vector<int> & vec) : vec_(vec) {}
        int hash();
    private:
        std::vector<int> &vec_; // refers to the caller's 'vec': no copy is made, but 'vec' must outlive this object
};
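The limitations can be made visible with a small sketch. It assumes a const-reference variant of the class above, named AView here purely for illustration; the deleted rvalue overload, the XOR placeholder hash, and example_usage are also assumptions and not part of the answer.

```cpp
#include <cassert>
#include <vector>

class AView {
public:
    // Non-owning: stores a reference to the caller's vector, so nothing is copied.
    explicit AView(const std::vector<int>& vec) : vec_(vec) {}

    // Reject temporaries, which would leave vec_ dangling immediately.
    AView(std::vector<int>&&) = delete;

    int hash() const {
        int h = 0x1234;  // arbitrary seed; XOR is only a placeholder hash
        for (int x : vec_) h ^= x;
        return h;
    }

private:
    const std::vector<int>& vec_;  // must not outlive the referenced vector
};

inline void example_usage() {
    std::vector<int> v{1, 2, 4};
    AView view(v);
    const int before = view.hash();

    // The view cannot modify the data, but the owner still can,
    // and the view observes the change.
    v.push_back(8);
    assert(view.hash() != before);
}
```

So the reference member avoids the copy, but the object is only as immutable as the vector's owner allows, and the referenced vector has to outlive the view.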
