
C++ valarray vs. vector

Posted 2024-08-08 16:45:48

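I like vectors a lot. They're nifty and fast. But I know this thing called a valarray exists. Why would I use a valarray instead of a vector? I know valarrays have some syntactic sugar, but other than that, when are they useful?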


十秒萌定你 2024-08-15 16:45:49

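The C++11 standard says:

The valarray array classes are defined to be free of certain forms of aliasing, thus allowing operations on these classes to be optimized.

See C++11 26.6.1-2.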


我爱人 2024-08-15 16:45:49

With std::valarray you can use standard mathematical notation such as v1 = a*v2 + v3 out of the box. This is not possible with vectors unless you define your own operators.


慕巷 2024-08-15 16:45:49

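std::valarray is intended for heavy numeric tasks, such as computational fluid dynamics or computational structural dynamics, in which you have arrays with millions, sometimes tens of millions of items, and you iterate over them in a loop with millions of time steps. Maybe today std::vector has comparable performance, but some 15 years ago valarray was almost mandatory if you wanted to write an efficient numeric solver.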


梦回梦里 2024-08-15 16:45:49

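Basically, std::valarray is a vector in the more mathematical sense, allowing well-defined mathematical operations such as c = a + b, while std::vector is a dynamic array. The names are simply misleading.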

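In a vector (in this mathematical sense), the dimension is fixed and all of its elements are restricted to variables that allow arithmetic operations. If you create a vector out of apples, the compiler will immediately complain that apples are not variables that support arithmetic operations.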

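Also, math functions such as sqrt(), log(), and sin() take vectors as input and perform the operation on each element, which is great for single-instruction, multiple-data architectures with a massive number of cores. For example, GPUs are very good at working with vectors.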

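That is why vectors can have more complicated memory structures. Their elements can be distributed over several nodes, while std::vector is basically a container with a pointer to contiguous memory.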

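I still see people trying to use std::vector with CUDA. No, sorry: despite the misleading name, you can't use std::vector, because it is not a vector. I know, they admitted it is a historical mistake, but they did not even try to correct it.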


征﹌骨岁月お 2024-08-15 16:45:48

valarray is kind of an orphan that was born in the wrong place at the wrong time. It's an attempt at optimization, fairly specifically for the machines that were used for heavy-duty math when it was written -- specifically, vector processors like the Crays.

For a vector processor, what you generally wanted to do was apply a single operation to an entire array, then apply the next operation to the entire array, and so on until you'd done everything you needed to do.

Unless you're dealing with fairly small arrays, however, that tends to work poorly with caching. On most modern machines, what you'd generally prefer (to the extent possible) would be to load part of the array, do all the operations on it you're going to, then move on to the next part of the array.

valarray is also supposed to eliminate any possibility of aliasing, which (at least theoretically) lets the compiler improve speed because it's more free to store values in registers. In reality, however, I'm not at all sure that any real implementation takes advantage of this to any significant degree. I suspect it's rather a chicken-and-egg sort of problem -- without compiler support it didn't become popular, and as long as it's not popular, nobody's going to go to the trouble of working on their compiler to support it.

There's also a bewildering (literally) array of ancillary classes to use with valarray. You get slice, slice_array, gslice and gslice_array to play with pieces of a valarray, and make it act like a multi-dimensional array. You also get mask_array to "mask" an operation (e.g. add items in x to y, but only at the positions where z is non-zero). To make more than trivial use of valarray, you have to learn a lot about these ancillary classes, some of which are pretty complex and none of which seems (at least to me) very well documented.

Bottom line: while it has moments of brilliance, and can do some things pretty neatly, there are also some very good reasons that it is (and will almost certainly remain) obscure.

Edit (eight years later, in 2017): Some of the preceding has become obsolete to at least some degree. For one example, Intel has implemented an optimized version of valarray for their compiler. It uses the Intel Integrated Performance Primitives (Intel IPP) to improve performance. Although the exact performance improvement undoubtedly varies, a quick test with simple code shows around a 2:1 improvement in speed, compared to identical code compiled with the "standard" implementation of valarray.

So, while I'm not entirely convinced that C++ programmers will be starting to use valarray in huge numbers, there are at least some circumstances in which it can provide a speed improvement.


傾旎 2024-08-15 16:45:48

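Valarrays (value arrays) are intended to bring some of the speed of Fortran to C++. You wouldn't make a valarray of pointers, so the compiler can make assumptions about the code and optimise it better. (The main reason that Fortran is so fast is that there is no pointer type, so there can be no pointer aliasing.)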

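Valarrays also have classes that let you slice them up in a reasonably easy way, although that part of the standard could use a bit more work. Resizing them is destructive, and ~~they lack iterators~~ they have had iterators since C++11 (through the non-member std::begin and std::end overloads).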

So, if it's numbers you are working with and convenience isn't all that important, use valarrays. Otherwise, vectors are just a lot more convenient.


时光礼记 2024-08-15 16:45:48

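During the standardization of C++98, valarray was designed to allow some sort of fast mathematical computation. However, around that time Todd Veldhuizen invented expression templates and created blitz++, and similar template-meta techniques were invented, which made valarray pretty much obsolete before the standard was even released. IIRC, the original proposer(s) of valarray abandoned it halfway into the standardization, which (if true) didn't help it either.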

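ISTR that the main reason it wasn't removed from the standard is that nobody took the time to evaluate the issue thoroughly and write a proposal to remove it.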

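Please keep in mind, however, that all of this is vaguely remembered hearsay. Take it with a grain of salt and hope that someone corrects or confirms it.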


梦过后 2024-08-15 16:45:48

I know valarrays have some syntactic sugar

I have to say that I don't think std::valarrays have much in the way of syntactic sugar. The syntax is different, but I wouldn't call the difference "sugar." The API is weird. The section on std::valarrays in The C++ Programming Language mentions this unusual API and the fact that, since std::valarrays are expected to be highly optimized, any error messages you get while using them will probably be non-intuitive.

Out of curiosity, about a year ago I pitted std::valarray against std::vector. I no longer have the code or the precise results (although it shouldn't be hard to write your own). Using GCC I did get a little performance benefit when using std::valarray for simple math, but not for my implementations to calculate standard deviation (and, of course, standard deviation isn't that complex, as far as math goes). ~~I suspect that operations on each item in a large std::vector play better with caches than operations on std::valarrays.~~ (NOTE, following advice from musiphil, I've managed to get almost identical performance from vector and valarray).

In the end, I decided to use std::vector while paying close attention to things like memory allocation and temporary object creation.

Both std::vector and std::valarray store the data in a contiguous block. However, they access that data using different patterns, and more importantly, the API for std::valarray encourages different access patterns than the API for std::vector.

For the standard deviation example, at a particular step I needed to find the collection's mean and the difference between each element's value and the mean.

For the std::valarray, I did something like:

std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> temp(mean, original_values.size());
std::valarray<double> differences_from_mean = original_values - temp;

I may have been more clever with std::slice or std::gslice. It's been over five years now.

For std::vector, I did something along the lines of:

std::vector<double> original_values = ... // obviously, I put something here
double mean = std::accumulate(original_values.begin(), original_values.end(), 0.0) / original_values.size();

std::vector<double> differences_from_mean;
differences_from_mean.reserve(original_values.size());
std::transform(original_values.begin(), original_values.end(), std::back_inserter(differences_from_mean), std::bind1st(std::minus<double>(), mean));

Today I would certainly write that differently. If nothing else, I would take advantage of C++11 lambdas.

It's obvious that these two snippets of code do different things. For one, the std::vector example doesn't make an intermediate collection like the std::valarray example does. However, I think it's fair to compare them because the differences are tied to the differences between std::vector and std::valarray.

When I wrote this answer, I suspected that subtracting the value of elements from two std::valarrays (last line in the std::valarray example) would be less cache-friendly than the corresponding line in the std::vector example (which happens to also be the last line).

It turns out, however, that

std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> differences_from_mean = original_values - mean;

does the same thing as the std::vector example, and has almost identical performance. In the end, the question is which API you prefer.


冰魂雪魄 2024-08-15 16:45:48

valarray was supposed to let some FORTRAN vector-processing goodness rub off on C++. Somehow the necessary compiler support never really happened.

The Josuttis books contain some interesting (somewhat disparaging) commentary on valarray (here and here).

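However, Intel now seems to be revisiting valarray in its recent compiler releases (e.g. see slide 9). This is an interesting development, given that their 4-way SIMD SSE instruction set is about to be joined by 8-way AVX and 16-way Larrabee instructions, and in the interests of portability it will likely be much better to code with an abstraction like valarray than with (say) intrinsics.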


你列表最软的妹 2024-08-15 16:45:48

I found one good usage for valarray.
It's to use valarray just like numpy arrays.

auto x = linspace(0, 2 * 3.14, 100);
plot(x, sin(x) + sin(3.f * x) / 3.f + sin(5.f * x) / 5.f);

We can implement the above with valarray.

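#include <cassert>
#include <cstdio>       // popen, pclose, fgets
#include <string>
#include <valarray>
#include <fcntl.h>      // O_CREAT, O_RDWR
#include <sys/mman.h>   // shm_open, shm_unlink, mmap
#include <unistd.h>     // ftruncate
using namespace std;

// `size` samples starting at `start` with step (stop-start)/size.
// Unlike numpy.linspace, `stop` itself is not included.
valarray<float> linspace(float start, float stop, int size)
{
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + i * (stop-start)/size;
    return v;
}

// Samples from `start` up to (but not including) `stop`, spaced by `step`.
// Note the argument order: start, step, stop.
std::valarray<float> arange(float start, float step, float stop)
{
    int size = (stop - start) / step;
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + step * i;
    return v;
}

string psstm(string command)
{//return system call output as string
    string s;
    char tmp[1000];
    FILE* f = popen(command.c_str(), "r");
    if(!f) return s;  // popen failed: return an empty string
    while(fgets(tmp, sizeof(tmp), f)) s += tmp;
    pclose(f);
    return s;
}

// Writes interleaved (x, y) pairs into POSIX shared memory named "plot1",
// then runs plot.py, which reads them back and draws the curve.
string plot(const valarray<float>& x, const valarray<float>& y)
{
    int sz = x.size();
    assert(sz == (int)y.size());
    int bytes = sz * sizeof(float) * 2;
    const char* name = "plot1";
    int shm_fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    ftruncate(shm_fd, bytes);
    float* ptr = (float*)mmap(0, bytes, PROT_WRITE, MAP_SHARED, shm_fd, 0);
    for(int i=0; i<sz; i++) {
        *ptr++ = x[i];
        *ptr++ = y[i];
    }

    string command = "python plot.py ";
    string s = psstm(command + to_string(sz));  // pass the point count
    shm_unlink(name);
    return s;
}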

We also need a Python script.

import sys, posix_ipc, os, struct
import matplotlib.pyplot as plt

sz = int(sys.argv[1])
f = posix_ipc.SharedMemory("plot1")
x = [0] * sz
y = [0] * sz
for i in range(sz):
    x[i], y[i] = struct.unpack('ff', os.read(f.fd, 8))
os.close(f.fd)
plt.plot(x, y)
plt.show()
