STL算法的可组合性

发布于 2024-11-24 18:07:47 字数 1890 浏览 2 评论 0 原文

STL 算法在 C++ 中非常有用。但让我恼火的一件事是它们似乎缺乏可组合性。

例如,假设我有一个 vector> 并且希望将其转换为仅包含 第二个的 vector 该对的成员。这很简单:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;

std::transform(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return p.second; });

或者也许我想过滤向量以仅查找first成员为偶数的那些对。也很简单:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> result;

std::copy_if(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return (p.first % 2) == 0; });

但是如果我想两者都做怎么办?没有 transform_if 算法,并且同时使用 transformcopy_if 似乎需要分配一个临时 vector 来保存中间结果:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> temp;
std::vector<int> result;

std::copy_if(values.begin(), values.end(), std::back_inserter(temp),
    [] (std::pair<int, int> p) { return (p.first % 2) == 0; });

std::transform(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return p.second; });

这对我来说似乎相当浪费。我能想到的避免临时向量的唯一方法是放弃 transformcopy_if 并简单地使用 for_each (或常规的 for 循环,无论哪个适合你的喜好):

std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;

std::for_each(values.begin(), values.end(),
    [&result] (std::pair<int, int> p) 
        { if( (p.first % 2) == 0 ) result.push_back(p.second); });

我在这里错过了什么吗?有没有一种好方法可以将两个现有的STL算法组合成一个新的算法,而不需要临时存储?

The STL algorithms are a pretty useful thing in C++. But one thing that kind of irks me is that they seem to lack composability.

For example, let's say I have a vector<pair<int, int>> and want to transform that to a vector<int> containing only the second member of the pair. That's simple enough:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;

std::transform(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return p.second; });

Or maybe I want to filter the vector for only those pairs whose first member is even. Also pretty simple:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> result;

std::copy_if(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return (p.first % 2) == 0; });

But what if I want to do both? There is no transform_if algorithm, and using both transform and copy_if seems to require allocating a temporary vector to hold the intermediate result:

std::vector<std::pair<int, int>> values = GetValues();
std::vector<std::pair<int, int>> temp;
std::vector<int> result;

std::copy_if(values.begin(), values.end(), std::back_inserter(temp),
    [] (std::pair<int, int> p) { return (p.first % 2) == 0; });

std::transform(values.begin(), values.end(), std::back_inserter(result),
    [] (std::pair<int, int> p) { return p.second; });

This seems rather wasteful to me. The only way I can think of to avoid the temporary vector is to abandon transform and copy_if and simply use for_each (or a regular for loop, whichever suits your fancy):

std::vector<std::pair<int, int>> values = GetValues();
std::vector<int> result;

std::for_each(values.begin(), values.end(),
    [&result] (std::pair<int, int> p) 
        { if( (p.first % 2) == 0 ) result.push_back(p.second); });

Am I missing something here? Is there a good way to compose two existing STL algorithms into a new one without needing temporary storage?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

世界如花海般美丽 2024-12-01 18:07:48

早在 2000 年,这个问题就已经被注意到。 Gary Powell 和 Martin Weiser 提出了“视图”的概念,并创造了“视图模板库”这个名称。当时它并没有流行起来,但这个想法是有道理的。 “视图”适配器本质上应用了动态转换。例如,它可以调整value_type

现在我们有了 C++0x,这个概念可能应该重新讨论。自 2000 年以来,我们在泛型编程方面取得了相当大的进展。

例如,让我们使用 vector>vector 示例。这可能非常简单:

std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, [](std::pair<int, int> p) { return p.first }); 
std::vector<int> result(view.begin(), view.end());

或者,使用 boost::bind 技术,甚至更简单:

std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, &std::pair<int, int>::first); 
std::vector<int> result(view.begin(), view.end());

Back in 2000, the problem was already noted. Gary Powell and Martin Weiser came up with a "view" concept, and coined the name "View Template Library". It didn't take off then but the idea makes sense. A "view" adaptor essentially applies an on-the-fly transform. For instance, it can adapt the value_type.

The concept probably should be readdressed now we have C++0x. We've made quite some progress in generic programming since 2000.

For example, let's use the vector<pair<int, int>> to vector<int> example. That could be quite simple:

std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, [](std::pair<int, int> p) { return p.first }); 
std::vector<int> result(view.begin(), view.end());

Or, using the boost::bind techniques, even simpler:

std::vector<std::pair<int, int>> values = GetValues();
vtl2::view v (values, &std::pair<int, int>::first); 
std::vector<int> result(view.begin(), view.end());
缺⑴份安定 2024-12-01 18:07:48

C++20 开始,您可以使用 std::ranges::copy 与范围适配器 std::views::filterstd::views::values 来自 范围库如下:

int main() {
    std::vector<std::pair<int, int>> values = { {1,2}, {4,5}, {6,7}, {9,10} };
    std::vector<int> result;

    auto even = [](const auto& p) { return (p.first % 2) == 0; };
    std::ranges::copy(values | std::views::filter(even) | std::views::values,
                      std::back_inserter(result));

    for (int i : result)
        std::cout << i << std::endl;

    return 0;
}

输出:

5
7

在上面的解决方案中,没有为中间结果创建临时向量,因为视图适配器创建不包含元素的范围。这些范围只是输入向量的视图,但具有自定义的迭代行为。

Wandbox 上的代码

Since C++20 you can use std::ranges::copy together with the range adaptors std::views::filter and std::views::values from the Ranges library as follows:

int main() {
    std::vector<std::pair<int, int>> values = { {1,2}, {4,5}, {6,7}, {9,10} };
    std::vector<int> result;

    auto even = [](const auto& p) { return (p.first % 2) == 0; };
    std::ranges::copy(values | std::views::filter(even) | std::views::values,
                      std::back_inserter(result));

    for (int i : result)
        std::cout << i << std::endl;

    return 0;
}

Output:

5
7

In the solution above, no temporary vector is created for an intermediate result, because the view adaptors create ranges that don't contain elements. These ranges are just views over the input vector, but with a customized iteration behavior.

Code on Wandbox

吃不饱 2024-12-01 18:07:48

不确定这是否仍然有效,但是......
一个新的仅轻型等待标头的库,可以执行您所描述的操作。 Doc 谈论了惰性求值和 com 可组合生成器。

文档片段:

  • 从文件“test.txt”中读取最多 10 个整数。
  • 过滤偶数,对它们求平方并对它们的值求和。
    int total = lz::read<int>(ifstream("test.txt")) | lz::limit(10) |
                lz::filter([](int i) { return i % 2 == 0; }) |
                lz::map([](int i) { return i *  i; }) | lz::sum();

您可以将该行拆分为多个表达式。

    auto numbers = lz::read<int>(ifstream("test.txt")) | lz::limit(10);
    auto evenFilter = numbers | lz::filter([](int i) { return i % 2 == 0; });
    auto squares = evenFilter | lz::map([](int i) { return i *  i; });
    int total = squares | lz::sum();
  • 尽管这个表达式被分割为多个变量赋值,但它的效率并没有降低。
  • 每个中间变量简单地
    描述要执行的代码单元。全部放在堆栈中。

https://github.com/SaadAttieh/lazyCode

Not sure if this is still active, but...
A new light wait header only lib that does what you describe. Doc talks about lazy evaluation and com compossible generators.

Doc snippet:

  • Read in up to 10 integers from a file "test.txt".
  • filter for the even numbers, square them and sum their values.
    int total = lz::read<int>(ifstream("test.txt")) | lz::limit(10) |
                lz::filter([](int i) { return i % 2 == 0; }) |
                lz::map([](int i) { return i *  i; }) | lz::sum();

you can split that line up into multiple expressions.

    auto numbers = lz::read<int>(ifstream("test.txt")) | lz::limit(10);
    auto evenFilter = numbers | lz::filter([](int i) { return i % 2 == 0; });
    auto squares = evenFilter | lz::map([](int i) { return i *  i; });
    int total = squares | lz::sum();
  • Even though this expression is split over multiple variable assignments, it is not any less efficient.
  • Each intermediate variable simply
    describes a unit of code to be executed. All held in stack.

https://github.com/SaadAttieh/lazyCode

套路撩心 2024-12-01 18:07:47

你说得对。您可以使用 Boost。范围适配器来实现组合。

You're right. You can use Boost.Range adaptors to achieve composition.

骄傲 2024-12-01 18:07:47

我认为问题是不幸的是,结构

  1. C++使用两个迭代器来表示序列
  2. C++函数是单值的,

所以你不能链接它们,因为函数不能返回“序列”。

一种选择是使用单对象序列(喜欢 boost 的范围方法)。通过这种方式,您可以将一个处理的结果合并为另一个处理的输入...(一个对象 -> 一个对象)。

在标准 C++ 库中,处理过程是(两个对象 -> 一个对象),很明显,如果不命名临时对象,就无法链接它。

I think the problem is unfortunately structural

  1. C++ uses two iterators to represent a sequence
  2. C++ functions are single-valued

so you cannot chain them because a function cannot return "a sequence".

An option would have been to use single-object sequences instead (like the range approach from boost). This way you could have combined the result of one processing as the input of another... (one object -> one object).

In the standard C++ library instead the processing is (two objects -> one object) and it's clear that this cannot be chained without naming the temporary object.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文